From owner-freebsd-current Sun Nov 8 13:26:24 1998
Date: Sun, 8 Nov 1998 16:25:53 -0500 (EST)
From: John Fieber
To: Eivind Eklund
cc: current@FreeBSD.ORG
Subject: Re: The infamous dying daemons bug
In-Reply-To: <19981108160934.30826@follo.net>
Sender: owner-freebsd-current@FreeBSD.ORG

On Sun, 8 Nov 1998, Eivind Eklund wrote:

> On Sun, Nov 08, 1998 at 09:22:50AM -0500, John Fieber wrote:
> > One question: Is the problem "sticky"? By that I mean, if it is
> > triggered by a memory shortage, is something in the kernel
> > corrupted that tends to kill/corrupt daemons from that point in
> > time on, or is it just something that affects isolated processes?
>
> All daemons running at that point seem to get something corrupted.
> If you restart the daemon, it won't happen again until you again run
> out of memory (or whatever it is that triggers the corruption).

I've just been re-examining log files. What I see is that problems
always follow this message, which never occurs more than once during
any given time the system is up:

    /kernel: swap_pager: suggest more swap space: 125 MB

It is always 125 MB...I'm still not completely clear on what that
number is, but anyway...

Here are some highlights from one particular system run where inetd
and httpd die.
I've omitted redundant "signal 11" lines since, once the process is
corrupted, any connection attempt generates a slew of them.

Nov 3 16:53:44 fallout /kernel: FreeBSD 3.0-CURRENT #17: Tue Nov 3 16:46:57 EST 1998
Nov 3 17:33:58 fallout /kernel: swap_pager: suggest more swap space: 125 MB
Nov 5 03:09:22 fallout /kernel: pid 15615 (inetd), uid 0: exited on signal 11
  ...I kill and restart inetd at some point in this interval...
Nov 5 09:42:25 fallout /kernel: pid 16904 (inetd), uid 0: exited on signal 11
  ...And again...
Nov 5 13:36:34 fallout /kernel: pid 17779 (inetd), uid 0: exited on signal 11
  ...And again, this time inetd has the "junk pointer" patches from PR 8183 applied...
Nov 6 00:52:19 fallout /kernel: pid 19759 (httpd), uid 65534: exited on signal 11
Nov 6 03:14:47 fallout /kernel: pid 20245 (inetd), uid 0: exited on signal 11
  ...and I reboot in the morning...

There are no "swap_pager: out of swap" messages anywhere in the logs,
which go back to just before I switched from 2.2.7 to 3.0-BETA. If any
memory shortages occur after the first "suggest more swap" message,
they are not being logged.

Since this sample I've bumped swap from 128MB to 256MB and have not
had any problems yet.

Another curiosity: I'm getting some garbled lines in the log files,
where one message is split around another:

Oct 25 09:11:00 fallout /kernel: pid 29392 (inetd
Oct 25 09:10:49 fallout inetd[180]: /usr/local/libexec/amanda/amandad[28958]: exit status 0xb
Oct 25 09:11:00 fallout /kernel: ), uid 0: exited on signal 11

-john
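
The pattern described in the logs (one "suggest more swap space" warning
per system run, with the signal 11 deaths only showing up afterwards)
could be checked mechanically rather than by eye. What follows is only a
rough sketch in Python, not anything that exists in FreeBSD: the function
name, the regular expressions, and the idea of treating the kernel boot
banner as the start of a new run are all assumptions made for
illustration, based on the log lines quoted in this message.

```python
import re

# Patterns modelled on the log lines quoted in this message (assumptions).
SWAP_WARNING = re.compile(r"swap_pager: suggest more swap space")
BOOT_BANNER = re.compile(r"/kernel: FreeBSD \S+ #\d+:")
SIGNAL_EXIT = re.compile(
    r"pid (?P<pid>\d+) \((?P<name>[^)]+)\), uid (?P<uid>\d+): "
    r"exited on signal (?P<sig>\d+)"
)


def crashes_after_swap_warning(lines):
    """Return (name, pid, signal) for each crash logged after a swap warning.

    A boot banner resets the state, so each system run is judged separately.
    """
    warned = False
    crashes = []
    for line in lines:
        if BOOT_BANNER.search(line):
            warned = False          # new system run: forget earlier warnings
        elif SWAP_WARNING.search(line):
            warned = True           # the one-per-run swap warning was seen
        else:
            m = SIGNAL_EXIT.search(line)
            if m and warned:
                crashes.append((m["name"], int(m["pid"]), int(m["sig"])))
    return crashes


# A few of the lines from the run quoted in this message.
SAMPLE = """\
Nov 3 16:53:44 fallout /kernel: FreeBSD 3.0-CURRENT #17: Tue Nov 3 16:46:57 EST 1998
Nov 3 17:33:58 fallout /kernel: swap_pager: suggest more swap space: 125 MB
Nov 5 03:09:22 fallout /kernel: pid 15615 (inetd), uid 0: exited on signal 11
Nov 6 00:52:19 fallout /kernel: pid 19759 (httpd), uid 65534: exited on signal 11
""".splitlines()

if __name__ == "__main__":
    for name, pid, sig in crashes_after_swap_warning(SAMPLE):
        print(f"{name} (pid {pid}) died on signal {sig} after the swap warning")
```

Note that a split line like the garbled "pid 29392 (inetd" entry would not
match the crash pattern, so a check along these lines would undercount
whenever log messages are interleaved that way.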