Date: Sun, 5 May 2002 21:17:31 -0400
From: Anthony Schneider <aschneid@mail.slc.edu>
To: Patrick Thomas <root@utility.clubscholarship.com>
Cc: freebsd-hackers@FreeBSD.ORG
Subject: Re: what causes a userland to stop, but allows kernel to continue ?
Message-ID: <20020505211731.A1386@mail.slc.edu>
In-Reply-To: <20020505162455.K86733-100000@utility.clubscholarship.com>; from root@utility.clubscholarship.com on Sun, May 05, 2002 at 04:31:36PM -0700
References: <20020505162455.K86733-100000@utility.clubscholarship.com>

FWIW, I've very recently had something similar happen to a 4.5-STABLE box.
The machine was NOT SMP, and the cause, as far as we know, was that /var had
been filled by apache's error_log -- a funky new mod_throttle install with
lots of

    critical_acquire() failed: Permission denied
    critical_release() failed: Permission denied

entries. Now, I assume that this is not because /var was full, but actually
because of System V semaphore locking in the mod_throttle code.

In mod_throttle-3.1.2, the critical_acquire() code from mod_throttle.c
(assuming defined(USE_SYSTEM_V_SERIALIZATION)) is:

<snip>
struct critical {
        int id;
        struct sembuf on;
        struct sembuf off;
};
</snip><snip>
static int
critical_acquire(t_critical *mp)
{
        for (errno = 0; semop(mp->id, &mp->on, 1) < 0; ) {
                if (errno != EINTR) {
                        /*** We really should kill the server here. ***/
                        perror("critical_acquire() failed");

                        /* Neither of these calls appear to shutdown the
                         * server and its children; exit(APEXIT_CHILDFATAL),
                         * appears to kill only the parent process.
                         */
                        ap_start_shutdown();
                        return -1;
                }
        }

        return 0;
}
</snip>

Livelock, maybe? Is there some sort of internal kernel semaphore table which
might be getting filled up or something? I'd also like to find out more about
this, but sadly, the machine is a remote one and I can't drop into ddb as
suggested...

Thank you all very much. Hope this information is of use.

-Anthony.

On Sun, May 05, 2002 at 04:31:36PM -0700, Patrick Thomas wrote:
>
> So, based on a previous thread, it looks like I have a server whose
> userland halted, essentially, but the kernel continued running.
>
> As evidenced by:
>
> - you can still ping the server just fine
> - you can still connect to running services just fine - if you ssh to it,
>   `ssh -v` (verbose) claims a connection is established, but the server
>   doesn't respond in any way over that connection. Further, you can telnet
>   to POP or IMAP or HTTP ports, and get a connection, but you can't get any
>   response.
> - cron does NOT run while the server is in this state - no jobs run
> - no response from the console - caps lock does NOT toggle the LED
>
> So, as was suggested in the previous thread, it looks like my kernel is
> still running, but the userland has halted. There are no log entries that
> give any clue as to why this happened last week.
>
>
> 1. from a theoretical standpoint, how would this happen ?
> 2. Is there any way to watchdog for it and escape from it before the
>    userland completely crashes ?
> 3. any previous/old problems that would cause this behavior ?
>
>
> It is a FreeBSD 4.5-RELEASE system, and it is SMP - fairly heavily loaded
> (averages 60% CPU idle in `top` output).
>
> thanks,
>
> PT
>
>
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-hackers" in the body of the message

-----------------------------------------------
PGP key at:
    http://www.keyserver.net/
    http://www.anthonydotcom.com/gpgkey/key.txt
Home:
    http://www.anthonydotcom.com
-----------------------------------------------
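
For anyone who wants to poke at the semaphore question without the affected box: "Permission denied" is the strerror() text for EACCES, which semop(2) documents as the caller lacking permission on the semaphore set. A full kernel semaphore table would more plausibly show up as ENOSPC from semget(2), though that is an assumption about this system and nothing in the logs confirms it. Below is a minimal sketch of the same retry-on-EINTR loop that returns the errno value instead of only calling perror(), so the two cases can be told apart. The names demo_sem_create, demo_semop_retry, demo_semop_reason and union demo_semun are hypothetical and do not come from mod_throttle.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Hypothetical stand-in for union semun. It has a different name because
 * some systems declare union semun in <sys/sem.h> and others leave it to
 * the caller. */
union demo_semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Create a private set holding one semaphore with the given initial value.
 * Returns the semaphore id, or -1 with errno set. */
static int demo_sem_create(int initial)
{
    union demo_semun arg;
    int id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

    if (id < 0)
        return -1;          /* ENOSPC here would mean a system limit was hit */

    arg.val = initial;
    if (semctl(id, 0, SETVAL, arg) < 0) {
        int saved = errno;
        semctl(id, 0, IPC_RMID);    /* do not leak the set on failure */
        errno = saved;
        return -1;
    }
    return id;
}

/* Apply one operation to semaphore 0, retrying only on EINTR, as
 * critical_acquire() does. Returns 0 on success, otherwise the errno value
 * so the caller can see why the operation failed. */
static int demo_semop_retry(int id, short delta, short flags)
{
    struct sembuf op;

    op.sem_num = 0;
    op.sem_op = delta;
    op.sem_flg = flags;

    while (semop(id, &op, 1) < 0) {
        if (errno != EINTR)
            return errno;
    }
    return 0;
}

/* Map the errno values of interest to a short explanation. */
static const char *demo_semop_reason(int err)
{
    switch (err) {
    case 0:
        return "ok";
    case EACCES:
        return "permission denied on the semaphore set";
    case EAGAIN:
        return "would block (IPC_NOWAIT was set)";
    default:
        return strerror(err);
    }
}
```

Calling demo_semop_retry(id, -1, IPC_NOWAIT) on a set created with demo_sem_create(1) succeeds once and then fails with EAGAIN instead of blocking. Running the same call from a process whose uid does not match the set's owner or mode is one way to check whether EACCES is what the Apache children were hitting.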