From owner-freebsd-hackers Fri Feb 16 18:52:41 1996
Return-Path: owner-hackers
Received: (from root@localhost) by freefall.freebsd.org (8.7.3/8.7.3) id SAA26382 for hackers-outgoing; Fri, 16 Feb 1996 18:52:41 -0800 (PST)
Received: from zip.io.org (root@zip.io.org [198.133.36.80]) by freefall.freebsd.org (8.7.3/8.7.3) with SMTP id SAA26374 for ; Fri, 16 Feb 1996 18:52:35 -0800 (PST)
Received: (from taob@localhost) by zip.io.org (8.6.12/8.6.12) id VAA26398; Fri, 16 Feb 1996 21:52:00 -0500
Date: Fri, 16 Feb 1996 21:52:00 -0500 (EST)
From: Brian Tao
To: FREEBSD-HACKERS-L
Subject: Web server locks up... but not quite. (?)
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-hackers@freebsd.org
Precedence: bulk

This sort of thing has happened before with other 2.1.0-R machines here, but tonight was the first time I was able to get to the console of one before someone else rebooted it.

Our web server is a P90 with 64 megabytes of RAM, running Apache 1.0.2. For no discernable reason, it stopped working tonight. "Stopped working" in that no TCP services were available, NFS clients that mounted a filesystem served from it hung in disk wait and no rwhod packets were being broadcast. You could telnet to various ports on it (indicating that inetd was still bound to those ports), but none of the services normally attached to those ports would run, including internal ones like chargen or daytime (indicating that inetd was blocked in some way). It wasn't fielding RPC requests either.

The login prompt was still displayed on all the virtual consoles (I was still able to switch between them), but there was no response from the keyboard, as if the getty's had died off. The only sign of life was that it was returning pings from another machine. There were no telltale messages on the console, nor in the syslog.

This server gets 250,000 to 300,000 hits per day. While it is running, it does not appear to be under any excessive load. There are typically 40 to 60 httpd's running. It exports a 4-gigabyte filesystem containing access logs to client machines so our customers can produce statistical reports. It also mounts 26 gigabytes of home directories from a central NFS server.

Since there is no indication as to the source of the hang, is there anything I can run periodically from cron to help track down the problem? I can start tracking load averages, swap space usage, the output of vmstat, netstat, iostat and nfsstat if that will help. Any suggestions?

--
Brian Tao (BT300, taob@io.org)
Systems Administrator, Internex Online Inc.
"Though this be madness, yet there is method in't"
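
The periodic data collection asked about above (load averages, swap usage, vmstat, netstat, iostat, nfsstat) could be sketched as a small script run from cron. This is only an illustration, not something from the original post: it assumes a Python interpreter is available on the server, and the command list, the `pstat -s` and `netstat -m` flags, and the function names are all assumptions to adjust for the local system. Each run appends one timestamped snapshot to a log file and syncs it to disk, so the last snapshot before a hang has a better chance of surviving a hard reset.

```python
import os
import subprocess
import sys
import time

# Hypothetical command list: names and flags are assumptions and should be
# checked against the manual pages on the machine being monitored.
COMMANDS = [
    ["uptime"],         # load averages
    ["pstat", "-s"],    # swap usage (assumed tool and flag)
    ["vmstat"],
    ["netstat", "-m"],  # network buffer usage (assumed flag)
    ["iostat"],
    ["nfsstat"],
]


def run_command(argv, timeout=30):
    """Return the combined output of argv, or a short note if it fails."""
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout,
        )
    except OSError as exc:
        # Covers a missing or non-executable command.
        return "(could not run: %s)\n" % exc
    except subprocess.TimeoutExpired:
        return "(timed out after %d seconds)\n" % timeout
    return result.stdout


def snapshot(commands=COMMANDS):
    """Build one timestamped text snapshot from the given commands."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    parts = ["=== snapshot at %s ===\n" % stamp]
    for argv in commands:
        parts.append("--- %s ---\n" % " ".join(argv))
        parts.append(run_command(argv))
    return "".join(parts)


def append_snapshot(path, commands=COMMANDS):
    """Append a snapshot to path and force it out to disk."""
    with open(path, "a") as log:
        log.write(snapshot(commands))
        log.flush()
        os.fsync(log.fileno())


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python snapshot.py /path/to/logfile
    append_snapshot(sys.argv[1])
```

A crontab entry would then invoke the script every few minutes with the log file path as its only argument; the script name and log location are left to the administrator. Comparing the final snapshots before a hang against normal ones may show which resource, if any, was trending toward exhaustion.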