From owner-freebsd-hackers Sat Feb 17 07:44:47 1996
Return-Path: owner-hackers
Received: (from root@localhost)
    by freefall.freebsd.org (8.7.3/8.7.3) id HAA15750
    for hackers-outgoing; Sat, 17 Feb 1996 07:44:47 -0800 (PST)
Received: from freebsd.netcom.com (freebsd.netcom.com [198.211.79.3])
    by freefall.freebsd.org (8.7.3/8.7.3) with SMTP id HAA15744
    for ; Sat, 17 Feb 1996 07:44:44 -0800 (PST)
Received: by freebsd.netcom.com (8.6.12/SMI-4.1)
    id JAA03588; Sat, 17 Feb 1996 09:48:34 -0600
From: bugs@freebsd.netcom.com (Mark Hittinger)
Message-Id: <199602171548.JAA03588@freebsd.netcom.com>
Subject: Re: Web server locks up... but not quite. (?) (fwd)
To: hackers@freebsd.org
Date: Sat, 17 Feb 1996 09:48:33 -0600 (CST)
X-Mailer: ELM [version 2.4 PL25]
Content-Type: text
Sender: owner-hackers@freebsd.org
Precedence: bulk

> From: Joe Greco
> To: taob@io.org (Brian Tao)
> > typically 40 to 60 httpd's running.  It exports a 4-gigabyte
> > filesystem containing access logs to client machines so our customers
> > can produce statistical reports.

Is the 4 gig drive a Seagate barracuda?          (yes for me, bt946c)
Do you run alias ip's for 'virtual web sites'?   (yes for me, a bunch)
What ethernet card do you run on the box?        (3c509 isa for me)
How large is your swap file?                     (256mb swap file)

The reason I ask these questions is that other boxes running the same
rev of FreeBSD will not exhibit the problem at all.  I am trying to find
the common thread.

> I've seen similar hangs occasionally under both 2.0.5R and 2.1.0R and one
> additional "thing" I've noticed is that processes that are completely
> in-core appear to keep running (i.e. I had a "vmstat 1" running for a few
> weeks and when the box I am thinking of locked up, the vmstat 1 was still
> scrolling output, the box was ping-able, but any services that were not
> entirely in-core or required other disk accesses were not available).
> There is something to the "in-core" business because I have seen the same
> box both continue to broadcast rwho and NOT broadcast rwho, presumably
> determined by whether or not it was in-core..

I saw this behavior before 2.0.5, then it went away until about 3 weeks
before 2.1R was cut.  I will see the following kinds of processes hang
(unkillable) in "D+" state via ps.  Innd, Cern httpd, and ps.  Ps seems
to have it happen a lot.  "ps -ax" will hang whereas simply "ps" will
not.  When "ps -ax" hangs, who and top will run ok.

I am wondering if the "ps -ax" hangs are because it is trying to look at
the swap space of another process which is hung and I don't realize it :-)
This would imply some kind of deadlock condition for a page out on the
swap space.

Regards,

Mark Hittinger
Netcom/Dallas
bugs@freebsd.netcom.com