Date: Wed, 24 Feb 2010 18:38:03 +0200
From: Kostik Belousov <kostikbel@gmail.com>
To: freebsd-stable@freebsd.org
Subject: Re: sleep(3) sometimes too sleepy on FreeBSD 8.0?
Message-ID: <20100224163803.GW50403@deviant.kiev.zoral.com.ua>
In-Reply-To: <20100224124101.GC14464@rwpc12.mby.riverwillow.net.au>
References: <20100223013522.GE2303@rwpc12.mby.riverwillow.net.au>
 <20100224075359.GA61876@server.vk2pj.dyndns.org>
 <20100224112139.GT50403@deviant.kiev.zoral.com.ua>
 <20100224114441.GA57760@icarus.home.lan>
 <20100224122045.GU50403@deviant.kiev.zoral.com.ua>
 <20100224124101.GC14464@rwpc12.mby.riverwillow.net.au>

On Wed, Feb 24, 2010 at 11:41:01PM +1100, John Marshall wrote:
> On Wed, 24 Feb 2010, 14:20 +0200, Kostik Belousov wrote:
> > On Wed, Feb 24, 2010 at 03:44:41AM -0800, Jeremy Chadwick wrote:
> > > On Wed, Feb 24, 2010 at 01:21:39PM +0200, Kostik Belousov wrote:
> > > > On Wed, Feb 24, 2010 at 06:53:59PM +1100, Peter Jeremy wrote:
> > > > > Updates following some off-line discussions and debugging with John on
> > > > > IRC. I've cc'd gshapiro@ because the problem appears to be sendmail,
> > > > > rather than the FreeBSD kernel.
> > > > >
> > > > > On 2010-Feb-23 12:35:22 +1100, John Marshall <john.marshall@riverwillow.com.au> wrote:
> > > > > >Environment: sendmail 8.14.4 on FreeBSD 8.0-RELEASE-p2
> > > > >
> > > > > Note that this is stock ISC sendmail, not the sendmail in either the
> > > > > base system or the port.
> > > > >
> > > > > >I posted about this in comp.mail.sendmail and was told...
> > > > > >
> > > > > >> sleep() should be one of these calls:
> > > > > >>
> > > > > >>     if (njobs == 0 && WorkGrp[wgrp].wg_lowqintvl < MIN_SLEEP_TIME)
> > > > > >>         sleep(MIN_SLEEP_TIME);
> > > > > >>     else if (WorkGrp[wgrp].wg_lowqintvl <= 0)
> > > > > >>         sleep(QueueIntvl > 0 ? QueueIntvl : MIN_SLEEP_TIME);
> > > > > >>     else
> > > > > >>         sleep(WorkGrp[wgrp].wg_lowqintvl);
> > > > >
> > > > > Whilst it's true that the code calls sleep(), it's not calling
> > > > > sleep(3) in the FreeBSD libc. Instead it's calling a sleep() defined
> > > > > in libsm/clock.c - which is a horrible maze of #ifdefs.
> > > > >
> > > > > John has pre-processed that code and the result it at:
> > > > > http://www.riverwillow.net.au/~john/sm/clock.preprocessed
> > > > >
> > > > > At a quick look, the code is broken: sm_seteventm() generates a
> > > > > one-off timer using setitimer(2), which will send SIGALRM when it
> > > > > expires. sm_releasesignal() then unblocks SIGALRM. In theory, the
> > > > > SIGALRM could be delivered anywhere after the (!SmSleepDone) test and
> > > > > before pause() is called - in which case, the signal is lost and
> > > > > pause() will sleep forever.
> > > > >
> > > > > On 2010-Feb-24 08:13:06 +1100, John Marshall <john.marshall@riverwillow.com.au> wrote:
> > > > > >My ktrace file was created with 'ktrace -g 48501'. I have the result of
> > > > > >'kdump -R -p 48504' available at:
> > > > > >
> > > > > > <http://www.riverwillow.net.au/~john/8_0/rwsrv04_201002240725.kdump.gz>
>
> > Regarding sigsuspend() returning EINTR without delivering any signal,
> > could it be that the sendmail process was debugged ?
>
> No. I didn't touch the process with anything this time. There was no
> debugger in use on the system. That was how I found the process first
> thing this morning so I sent off the kdump output.
>
> The process stayed in the same state until I rebooted the system this
> afternoon to install a kernel with debug symbols and options. I have
> done the same on the other two servers, so I can dig deeper for you next
> time. I am running ktrace on the sendmail process group on all three
> servers waiting to catch the next one. By the way, all three are i386
> with SMP.

Kernel debugging is not much needed at this stage. I would be interested
to hear whether the latest RELENG_8 kernel still shows sigsuspend(2)
returning EINTR without a signal being delivered.

Our pause(3), as it stands, has two problems that are not related to the
issue you see. One is that it uses the sigcompat(3) routines, bringing
them into the namespace whenever pause is used.
The second, which is a consequence of the first, is that realtime signals
are blocked during pause(3).

While testing this patch, I noted that kill(1) cannot send realtime
signals to processes.

The usual race with pause() is still there; it cannot be solved.

diff --git a/bin/kill/kill.c b/bin/kill/kill.c
index bb9982e..8ee1d85 100644
--- a/bin/kill/kill.c
+++ b/bin/kill/kill.c
@@ -108,7 +108,7 @@ main(int argc, char *argv[])
 			numsig = strtol(*argv, &ep, 10);
 			if (!**argv || *ep)
 				errx(1, "illegal signal number: %s", *argv);
-			if (numsig < 0 || numsig >= sys_nsig)
+			if (numsig < 0)
 				nosig(*argv);
 		} else
 			nosig(*argv);
diff --git a/lib/libc/gen/pause.c b/lib/libc/gen/pause.c
index 00bf833..51706cf 100644
--- a/lib/libc/gen/pause.c
+++ b/lib/libc/gen/pause.c
@@ -33,8 +33,10 @@ static char sccsid[] = "@(#)pause.c	8.1 (Berkeley) 6/4/93";
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
+#include "namespace.h"
 #include <signal.h>
 #include <unistd.h>
+#include "un-namespace.h"
 
 /*
  * Backwards compatible pause.
@@ -42,7 +44,11 @@ __FBSDID("$FreeBSD$");
 int
 __pause(void)
 {
-	return sigpause(sigblock(0L));
+	sigset_t oset;
+
+	if (_sigprocmask(SIG_BLOCK, NULL, &oset) == -1)
+		return (-1);
+	return (_sigsuspend(&oset));
 }
 __weak_reference(__pause, pause);
 __weak_reference(__pause, _pause);