From owner-freebsd-hackers Sat Feb 17 13:01:53 1996
From: Joe Greco
Message-Id: <199602172100.PAA06868@brasil.moneng.mei.com>
Subject: Re: Web server locks up... but not quite. (?) (fwd)
To: bugs@freebsd.netcom.com (Mark Hittinger)
Date: Sat, 17 Feb 1996 15:00:13 -0600 (CST)
Cc: hackers@FreeBSD.ORG
In-Reply-To: <199602171548.JAA03588@freebsd.netcom.com> from "Mark Hittinger" at Feb 17, 96 09:48:33 am

> Is the 4 gig drive a Seagate barracuda? (yes for me, bt946c)
>
> Do you run alias ip's for 'virtual web sites'? (yes for me, a bunch)
>
> What ethernet card do you run on the box? (3c509 isa for me)
>
> How large is your swap file? (256mb swap file)
>
> The reason I ask these questions is that other boxes running the same rev
> of FreeBSD will not exhibit the problem at all. I am trying to find the
> common thread.

For me, the box was a 486DX4/120 with IDE disk, no IP aliases, SMC Elite
8013, unknown swap file size, acting as a PPP terminal server. The
frequency of occurrence was extremely low (I think twice in a six month
period).

> > I've seen similar hangs occasionally under both 2.0.5R and 2.1.0R and one
> > additional "thing" I've noticed is that processes that are completely
> > in-core appear to keep running (i.e. I had a "vmstat 1" running for a few
> > weeks and when the box I am thinking of locked up, the vmstat 1 was still
> > scrolling output, the box was ping-able, but any services that were not
> > entirely in-core or required other disk accesses were not available).
> > There is something to the "in-core" business because I have seen the same
> > box both continue to broadcast rwho and NOT broadcast rwho, presumably
> > determined by whether or not it was in-core..
>
> I saw this behavior before 2.0.5, then it went away until about 3 weeks
> before 2.1R was cut.
>
> I will see the following kinds of processes hang (unkillable)
> in "D+" state via ps. Innd, Cern httpd, and ps.
>
> Ps seems to have it happen a lot. "ps -ax" will hang whereas simply "ps"
> will not. When "ps -ax" hangs, who and top will run ok.

I wasn't lucky enough to have a runnable shell prompt.

> I am wondering if the "ps -ax" hangs are because it is trying to look at
> the swap space of another process which is hung and I don't realize it :-)
> This would imply some kind of deadlock condition for a page out on the
> swap space.

It's looking like that could well be the case; it is consistent with the
symptoms I had seen.

... Joe

-------------------------------------------------------------------------------
Joe Greco - Systems Administrator                             jgreco@ns.sol.net
Solaria Public Access UNIX - Milwaukee, WI                         414/546-7968