Date: Fri, 22 Nov 2013 11:56:03 +0800 From: David Xu <davidxu@freebsd.org> To: Konstantin Belousov <kostikbel@gmail.com> Cc: freebsd-hackers@freebsd.org, Vitaly Magerya <vmagerya@gmail.com>, threads@freebsd.org Subject: Re: Problem with signal 0 being delivered to SIGUSR1 handler Message-ID: <528ED5D3.1030906@freebsd.org> In-Reply-To: <20131121211546.GQ59496@kib.kiev.ua> References: <528DFEE6.6020504@gmail.com> <20131121211546.GQ59496@kib.kiev.ua>
next in thread | previous in thread | raw e-mail | index | archive | help
On 2013/11/22 05:15, Konstantin Belousov wrote: > On Thu, Nov 21, 2013 at 02:39:02PM +0200, Vitaly Magerya wrote: >> Hi, folks. I'm investigating a test case failure that devel/boehm-gc >> has on recent FreeBSD releases. The problem is that a signal >> handler registered for SIGUSR1 is sometimes called with signum=0, >> which should not be possible under any conditions. >> >> Here's a simple test case that demonstrates this behavior: >> >> /* Compile with 'c99 -o example example.c -pthread' >> */ >> #include <pthread.h> >> #include <signal.h> >> #include <stdio.h> >> #include <stdlib.h> >> >> void signal_handler(int signum, siginfo_t *si, void *context) { >> if (signum != SIGUSR1) { >> printf("bad signal, signum=%d\n", signum); >> exit(1); >> } >> } >> >> void *thread_func(void *arg) { >> return arg; >> } >> >> int main(void) { >> struct sigaction sa = { 0 }; >> sa.sa_flags = SA_SIGINFO; >> sa.sa_sigaction = signal_handler; >> if (sigfillset(&sa.sa_mask) != 0) abort(); >> if (sigaction(SIGUSR1, &sa, NULL) != 0) abort(); >> for (int i = 0; i < 10000; i++) { >> pthread_t t; >> pthread_create(&t, NULL, thread_func, NULL); >> pthread_kill(t, SIGUSR1); > Side note. pthread_kill(3) call behaviour is undefined if pthread_create(3) > in the line before failed. > >> } >> return 0; >> } >> >> Under FreeBSD 9.2-RELEASE amd64 I pretty consistently get >> "signum=0" from this program, but you may need to run it a few >> times or increase the number of iterations to see the same. >> >> Interestingly enough, I don't see this behavior under 9.0-RELEASE. >> >> So, any ideas what the problem here is? > > It happens when libthr deferred signal handling path is taken for signal > delivery and for some reason the code inside the deferred path called > into rtld for symbol binding. Than, rtld lock is locked, some code in > rtld is executed, and rtld lock is unlocked. Unlock causes _thr_ast() > run, which results in the nested check_deferred_signal() execution. > The check_deferred_signal() clearks si_signo, so on return the same > signal is delivered one more time, but is advertized as signo zero. > > The _thr_rtld_init() approach of doing dummy calls does not really work, > since it is not practically possible to enumerate the symbols needed > during signal delivery. > > My first attempt to fix this was to increment curthread->critical_count > around the calls to check_* functions in the _thr_ast(), but it causes > reverse problem of losing _thr_ast() runs on unlock. > > I ended up with the flag to indicate that deferred delivery is running, > so check_deferred_signal() should avoid doing anything. A delicate > moment is that user signal handler is allowed to modify the passed > machine context to result the return from the signal handler to cause > arbitrary jump, or just do longjmp(). For this case, I also clear the > flag in thr_sighandler(), since kernel signal delivery means that nested > delivery code should not run right now. > > Please try this. > > diff --git a/lib/libthr/thread/thr_private.h b/lib/libthr/thread/thr_private.h > index 83a02b5..c6651cd 100644 > --- a/lib/libthr/thread/thr_private.h > +++ b/lib/libthr/thread/thr_private.h > @@ -433,6 +433,9 @@ struct pthread { > /* the sigaction should be used for deferred signal. */ > struct sigaction deferred_sigact; > > + /* deferred signal delivery is performed, do not reenter. */ > + int deferred_run; > + > /* Force new thread to exit. */ > int force_exit; > > diff --git a/lib/libthr/thread/thr_sig.c b/lib/libthr/thread/thr_sig.c > index 415ddb0..57c9406 100644 > --- a/lib/libthr/thread/thr_sig.c > +++ b/lib/libthr/thread/thr_sig.c > @@ -162,6 +162,7 @@ thr_sighandler(int sig, siginfo_t *info, void *_ucp) > act = _thr_sigact[sig-1].sigact; > _thr_rwl_unlock(&_thr_sigact[sig-1].lock); > errno = err; > + curthread->deferred_run = 0; > > /* > * if a thread is in critical region, for example it holds low level locks, > @@ -320,14 +321,18 @@ check_deferred_signal(struct pthread *curthread) > siginfo_t info; > int uc_len; > > - if (__predict_true(curthread->deferred_siginfo.si_signo == 0)) > + if (__predict_true(curthread->deferred_siginfo.si_signo == 0 || > + curthread->deferred_run)) > return; > > + curthread->deferred_run = 1; > uc_len = __getcontextx_size(); > uc = alloca(uc_len); > getcontext(uc); > - if (curthread->deferred_siginfo.si_signo == 0) > + if (curthread->deferred_siginfo.si_signo == 0) { > + curthread->deferred_run = 0; > return; > + } > __fillcontextx2((char *)uc); > act = curthread->deferred_sigact; > uc->uc_sigmask = curthread->deferred_sigmask; > The patch looks fine to me.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?528ED5D3.1030906>