From owner-freebsd-threads@FreeBSD.ORG Thu Nov 21 21:15:55 2013 Return-Path: Delivered-To: threads@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 6C5BC94B; Thu, 21 Nov 2013 21:15:55 +0000 (UTC) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 095A726D1; Thu, 21 Nov 2013 21:15:54 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.7/8.14.7) with ESMTP id rALLFkpI074541; Thu, 21 Nov 2013 23:15:46 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.8.3 kib.kiev.ua rALLFkpI074541 Received: (from kostik@localhost) by tom.home (8.14.7/8.14.7/Submit) id rALLFkvl074540; Thu, 21 Nov 2013 23:15:46 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 21 Nov 2013 23:15:46 +0200 From: Konstantin Belousov To: Vitaly Magerya Subject: Re: Problem with signal 0 being delivered to SIGUSR1 handler Message-ID: <20131121211546.GQ59496@kib.kiev.ua> References: <528DFEE6.6020504@gmail.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="JeIqLcbgB5JjL5AU" Content-Disposition: inline In-Reply-To: <528DFEE6.6020504@gmail.com> User-Agent: Mutt/1.5.22 (2013-10-16) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on tom.home Cc: freebsd-hackers@freebsd.org, davidxu@freebsd.org, threads@freebsd.org X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.16 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Nov 2013 21:15:55 -0000 --JeIqLcbgB5JjL5AU Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Nov 21, 2013 at 02:39:02PM +0200, Vitaly Magerya wrote: > Hi, folks. I'm investigating a test case failure that devel/boehm-gc > has on recent FreeBSD releases. The problem is that a signal > handler registered for SIGUSR1 is sometimes called with signum=3D0, > which should not be possible under any conditions. >=20 > Here's a simple test case that demonstrates this behavior: >=20 > /* Compile with 'c99 -o example example.c -pthread' > */ > #include > #include > #include > #include >=20 > void signal_handler(int signum, siginfo_t *si, void *context) { > if (signum !=3D SIGUSR1) { > printf("bad signal, signum=3D%d\n", signum); > exit(1); > } > } >=20 > void *thread_func(void *arg) { > return arg; > } >=20 > int main(void) { > struct sigaction sa =3D { 0 }; > sa.sa_flags =3D SA_SIGINFO; > sa.sa_sigaction =3D signal_handler; > if (sigfillset(&sa.sa_mask) !=3D 0) abort(); > if (sigaction(SIGUSR1, &sa, NULL) !=3D 0) abort(); > for (int i =3D 0; i < 10000; i++) { > pthread_t t; > pthread_create(&t, NULL, thread_func, NULL); > pthread_kill(t, SIGUSR1); Side note. pthread_kill(3) call behaviour is undefined if pthread_create(3) in the line before failed. > } > return 0; > } >=20 > Under FreeBSD 9.2-RELEASE amd64 I pretty consistently get > "signum=3D0" from this program, but you may need to run it a few > times or increase the number of iterations to see the same. >=20 > Interestingly enough, I don't see this behavior under 9.0-RELEASE. >=20 > So, any ideas what the problem here is? It happens when libthr deferred signal handling path is taken for signal delivery and for some reason the code inside the deferred path called into rtld for symbol binding. Than, rtld lock is locked, some code in rtld is executed, and rtld lock is unlocked. Unlock causes _thr_ast() run, which results in the nested check_deferred_signal() execution. The check_deferred_signal() clearks si_signo, so on return the same signal is delivered one more time, but is advertized as signo zero. The _thr_rtld_init() approach of doing dummy calls does not really work, since it is not practically possible to enumerate the symbols needed during signal delivery. My first attempt to fix this was to increment curthread->critical_count around the calls to check_* functions in the _thr_ast(), but it causes reverse problem of losing _thr_ast() runs on unlock. I ended up with the flag to indicate that deferred delivery is running, so check_deferred_signal() should avoid doing anything. A delicate moment is that user signal handler is allowed to modify the passed machine context to result the return from the signal handler to cause arbitrary jump, or just do longjmp(). For this case, I also clear the flag in thr_sighandler(), since kernel signal delivery means that nested delivery code should not run right now. Please try this. diff --git a/lib/libthr/thread/thr_private.h b/lib/libthr/thread/thr_privat= e.h index 83a02b5..c6651cd 100644 --- a/lib/libthr/thread/thr_private.h +++ b/lib/libthr/thread/thr_private.h @@ -433,6 +433,9 @@ struct pthread { /* the sigaction should be used for deferred signal. */ struct sigaction deferred_sigact; =20 + /* deferred signal delivery is performed, do not reenter. */ + int deferred_run; + /* Force new thread to exit. */ int force_exit; =20 diff --git a/lib/libthr/thread/thr_sig.c b/lib/libthr/thread/thr_sig.c index 415ddb0..57c9406 100644 --- a/lib/libthr/thread/thr_sig.c +++ b/lib/libthr/thread/thr_sig.c @@ -162,6 +162,7 @@ thr_sighandler(int sig, siginfo_t *info, void *_ucp) act =3D _thr_sigact[sig-1].sigact; _thr_rwl_unlock(&_thr_sigact[sig-1].lock); errno =3D err; + curthread->deferred_run =3D 0; =20 /* * if a thread is in critical region, for example it holds low level lock= s, @@ -320,14 +321,18 @@ check_deferred_signal(struct pthread *curthread) siginfo_t info; int uc_len; =20 - if (__predict_true(curthread->deferred_siginfo.si_signo =3D=3D 0)) + if (__predict_true(curthread->deferred_siginfo.si_signo =3D=3D 0 || + curthread->deferred_run)) return; =20 + curthread->deferred_run =3D 1; uc_len =3D __getcontextx_size(); uc =3D alloca(uc_len); getcontext(uc); - if (curthread->deferred_siginfo.si_signo =3D=3D 0) + if (curthread->deferred_siginfo.si_signo =3D=3D 0) { + curthread->deferred_run =3D 0; return; + } __fillcontextx2((char *)uc); act =3D curthread->deferred_sigact; uc->uc_sigmask =3D curthread->deferred_sigmask; --JeIqLcbgB5JjL5AU Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (FreeBSD) iQIcBAEBAgAGBQJSjngBAAoJEJDCuSvBvK1BKx0QAJjAmJSh9i2IQC6e8pF1QXJG P6lTmX3WpLVdnPAA5ord/KoiBCaNJQ4w2YaEWOzuP3o4GHX70dYLY9HWHuwgMhei NzS+xOCdzZcPDI68ZghJ/N/67oJSlC9i/N4RLdgDqaBpElYrOKk1pmXqpQ/216op XinMrpR5oR4TvXJ80dNCsGzc5xQ0J9LW5TjYf3rzHSJSaYWO6jSIUwDrb6kLxtVA 7enT9j8rMO+HbXgWNNcXMBTAfo+2PabK/33twemiX7dbzGTQapbVK6RU9MYBYO0N 2Sa6YI0Zd5SFJyXLLggPi/Qop/mGIrsCgd2ICOsGnBYtc5qGpeFZkbKB8OnRdw02 u4HWokfnaE6eH+ktipA9+nbpAGL3MCsHgSZBLoIKDX0YWmqvEMM6wHdrJWWwIfEB /YJp8iHGwbrjtXx4ddUqa/30BRU1HzDImPafbAOvVdjLKFQozpHPJFwRhX+2NEA/ TA7PlXXLDVXc4wE7eP0Lo/8Vpnhk/Wv5Xz2a97F6IzdeOZpbuQwLaFf5eOJD77z9 8J1hhwE//c7nlk+9ovvRvqOdXyGeQSZaW22BRNu4VjYW/Cs5uaSGBCfPKJe99DGx 4tl3vaP28nnhQRH3reqyE/fJtfaJkMrGccO2EYVbkibaLWMEMBmLQ57no4TXrdWS BU5IUgDGkfqq4DKVpL87 =jCRC -----END PGP SIGNATURE----- --JeIqLcbgB5JjL5AU--