Date: Tue, 23 Jul 2002 19:04:42 -0700 (PDT) From: David Xu <bsddiy@yahoo.com> To: Julian Elischer <julian@elischer.org> Cc: freebsd-current@freebsd.org Subject: Re: signal handling bug in KSE MIII Message-ID: <20020724020442.88725.qmail@web20910.mail.yahoo.com> In-Reply-To: <Pine.BSF.4.21.0207221735120.5392-100000@InterJet.elischer.org>
next in thread | previous in thread | raw e-mail | index | archive | help
Good, but I see no problem there, cursig() does not suspend thread because as I said issignal() just sets a P_STOPPED_SGNL flag, not stops. msleep() calls thread_suspend_check() with return_instead flag set, so msleep does not suspend, too condvar. it is just interrupted by any signal if PCATCH set in priority argument. thread suspending does already happen at user boundary. > The problem is that if you send a suspend signal (STOP) to a process, all > the threads must be accounted for before we can say that the > process is 'stopped'. this can be done in thread_suspend_check(), if last suspended thread finds that it is stopped by P_STOPPED_SGNL or traced flag, it can send SIGCHLD with stopped status to parent process. David Xu --- Julian Elischer <julian@elischer.org> wrote: > > > > On Sun, 21 Jul 2002, David Xu wrote: > > > > > --- David Xu <bsddiy@yahoo.com> wrote: > > > I found signal handling is still broken in CURRENT source. > > > the following program demostrates the bug is still in kernel: > > > > > > #include <stdio.h> > > > #include <signal.h> > > > > > > void handler(int sig) > > > { > > > signal(SIGTSTP, SIG_DFL); > > > kill(getpid(), SIGTSTP); > > > } > > > > > > int main() > > > { > > > char buf[64]; > > > > > > signal(SIGTSTP, handler); > > > kill(getpid(), SIGTSTP); > > > printf("input foo:"); > > > scanf("%63s", buf); > > > printf("you input:%s\n", buf); > > > return 0; > > > } > > > > > > when I press CTRL+Z, the program does not suspend and directly exits. > > > ktrace indicates that the program directly exits in kernel without > > > calling exit() from program. > > > I found SA_STOP handling is disabled in issignal(), delayed SIGTSTP > > > is forgotten by kernel, this is the reason why SIGTSTP signal handling > > > is broken. at least, one program is affected --- ftp, run ftp > > > client program, when 'ftp>' prompt appears, pressing CTRL+Z, causes ftp > > > to exit and do not suspend. patch: > > Ok, I see it.. now there is a problem when we start to consider a multi > threaded process. This is why I defined out the code (and forgot that > fact in teh 6 months following). > The problem is that if you send a suspend signal (STOP) to a process, all > the threads must be accounted for before we can say that the > process is 'stopped'. There are three basic sources of STOP conditions > being asserted, one of which is new. > 1/ Signals > 2/ Ptrace > 3/ a single-threading funnel. > > The third is new and is also where the problem is. > > A single threading funnel is needed when a process does some system calls. > for example, if it calls fork(). The thread calling fork() needs to know > that all the other threads are in a safe suspended state before it can > proceed into fork1(). To do this, it suspends itself, and sets a > STOPPED condition on the process. When the last thread is suspended, then > the initial thread is allowed to proceed, safe in teh knowledge that > no other thread is doing anything in userland that would produce an > inconsistant state for the child. (or in the kernel for > that matter). Once the fork has been completed, the STOPPED > condition is lefted and the other threads are allowed to continue. > Teh question is whether it is safe to suspend the other threads > at palces other than the user boundary.. e.g. most of the calls > to issignal() are in msleep and the condvar code. > > The picture gets more messy when we realise that SOMETIMES the funnel > doesn't want the other threads to be suspended, but rather, to EXIT. > this happens at the moment at the start of exit1(), and in exec(). > > In both of these cases, threads other than that doing the exit() or exec() > calls must be trapped in a safe place and aborted. The only safe place > to do this is at the user boundary. You have to force any such threads > that may be about to sleep, to abort and return to the user boundary. > > The thread doing the exit() can not proceed until > all other threads have been accounted for. Because these two cases are > similar, I combined them into a single facility. Thread_single() > and friends.. (it's a posibility that this was a mistake but > I'm not convinced of that yet. Basically if the STOPPED_SNGLE > flag is set in the process, any thread trying to go to userland > will drop into the funnel code. If it gets there and discovers that the > SNGLE_EXIT flag is also set it will abort rather than suspending. > > The difficulty is that issignal is catching suspensions and stopping the > threads at the point of the sleep(), in teh middle of whatever operation > it is doing. As a result of this a suspended thread cannot proceed to > the user boundary, and we are deadlocked as teh exiting process is > waiting for all the other threads to exit and the STOPPED > threads a re waiting for the STOP state to be lifted. > > My theory was that we need to allow the restarting thread > (from the msleep() to contiue on to the user boundary and stop it THERE.) > rather than stopping it in the middle of whatever calculations it was > doing, holding whatever resources it holds (vnodes etc.). I guess that if > this is the case then we need to do the extra code only in issignal > at the user boundary. Possibly the answer is to allow the code again > but to reverse the calls to issignal() and thread_suspend_check() > in msleep() below.. > > 538 sig = cursig(td); > 539 if (sig == 0) { > 540 if (thread_suspend_check(1)) { > 541 sig = SIGSTOP; > 542 } > 543 } > > so that would make it: > if (thread_suspend_check(1)) { > sig = SIGSTOP; > } else { > sig = cursig(td); > } > > then at the user boundary issignal could do the signalling of the parent > (ONLY IF THIS IS THE LAST THREAD TO SUSPEND) and the > thread_suspend_check(0) at teh user boundary could be entgrusted with > stopping the thread (puting it on the 'suspended' queue.) > > For now we can put the original code back because we don't want to break > the non-threaded case, but it will break the threaded case. > We need to go over the stop/sleep code anyway. > It needs some rewritng. > (ANyone seeing a cleaner solution is welcome to suggest it.) > > > > > > > > --- kern_sig.c.old Sun Jul 21 15:38:00 2002 > > > +++ kern_sig.c Sun Jul 21 16:31:02 2002 > > > @@ -1657,7 +1657,7 @@ > > > #endif > > > break; /* == ignore */ > > > } > > > -#if 0 > > > + > > > /* > > > * If there is a pending stop signal to process > > > * with default action, stop here, > > > @@ -1679,16 +1679,10 @@ > > > PROC_UNLOCK(p->p_pptr); > > > mtx_lock_spin(&sched_lock); > > > stop(p); > > > - PROC_UNLOCK(p); > > > - DROP_GIANT(); > > > - p->p_stats->p_ru.ru_nivcsw++; > > > - mi_switch(); > > > mtx_unlock_spin(&sched_lock); > > > - PICKUP_GIANT(); > > > - PROC_LOCK(p); > > > break; > > > } else > > > -#endif > > > + > > > if (prop & SA_IGNORE) { > > > /* > > > * Except for SIGCONT, shouldn't get here. > > > > > > David Xu > > > > > > > OOPS, the extra kill() calling in main() should be removed > > in demo program, correct source code is: > > > > #include <stdio.h> > > #include <signal.h> > > > > void handler(int sig) > > { > > signal(SIGTSTP, SIG_DFL); > > kill(getpid(), SIGTSTP); > > } > > > > int main() > > { > > char buf[64]; > > > > signal(SIGTSTP, handler); > > printf("input foo:"); > > scanf("%63s", buf); > > printf("you input:%s\n", buf); > > return 0; > > } > > > > David Xu > > > > > > > > __________________________________________________ > > Do You Yahoo!? > > Yahoo! Health - Feel better, live better > > http://health.yahoo.com > > > __________________________________________________ Do You Yahoo!? Yahoo! Health - Feel better, live better http://health.yahoo.com To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-current" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20020724020442.88725.qmail>