Date: Sun, 18 Feb 1996 13:36:35 -0500
From: dennis@etinc.com (dennis)
To: hackers@freebsd.org
Subject: Re: Web server locks up... but not quite. (?)
Message-ID: <199602181836.NAA07941@etinc.com>
>> This sort of thing has happened before with other 2.1.0-R machines
>> here, but tonight was the first time I was able to get to the console
>> of one before someone else rebooted it.
>>
>> Our web server is a P90 with 64 megabytes of RAM, running Apache
>> 1.0.2. For no discernable reason, it stopped working tonight.
>> "Stopped working" in that no TCP services were available, NFS clients
>> that mounted a filesystem served from it hung in disk wait and no
>> rwhod packets were being broadcast.
>>
>> You could telnet to various ports on it (indicating that inetd was
>> still bound to those ports), but none of the services normally
>> attached to those ports would run, including internal ones like
>> chargen or daytime (indicating that inetd was blocked in some way).
>> It wasn't fielding RPC requests either. The login prompt was still
>> displayed on all the virtual consoles (I was still able to switch
>> between them), but there was no response from the keyboard, as if the
>> getty's had died off. The only sign of life was that it was returning
>> pings from another machine.
>>
>> There were no telltale messages on the console, nor in the syslog.
>> This server gets 250,000 to 300,000 hits per day. While it is
>> running, it does not appear to be under any excessive load. There are
>> typically 40 to 60 httpd's running. It exports a 4-gigabyte
>> filesystem containing access logs to client machines so our customers
>> can produce statistical reports. It also mounts 26 gigabytes of home
>> directories from a central NFS server.
>>
>> Since there is no indication as to the source of the hang, is
>> there anything I can run periodically from cron to help track down the
>> problem? I can start tracking load averages, swap space usage, the
>> output of vmstat, netstat, iostat and nfsstat if that will help. Any
>> suggestions?
>
>I've seen similar hangs occasionally under both 2.0.5R and 2.1.0R and one
>additional "thing" I've noticed is that processes that are completely
>in-core appear to keep running (i.e. I had a "vmstat 1" running for a few
>weeks and when the box I am thinking of locked up, the vmstat 1 was still
>scrolling output, the box was ping-able, but any services that were not
>entirely in-core or required other disk accesses were not available).
>There is something to the "in-core" business because I have seen the same
>box both continue to broadcast rwho and NOT broadcast rwho, presumably
>determined by whether or not it was in-core..

The more I read about this, the more I think it's gotta be memory
allocation failures... no new processes, but old ones and kernel stuff
keep on ticking. Is there a logging function for these, or would
logging attempts fail as well?

dennis
----------------------------------------------------------------------------
Emerging Technologies, Inc.      http://www.etinc.com

Synchronous PC Cards and Routers For Discriminating Tastes. 56k to T1 and beyond.
Frame Relay, PPP, HDLC, and X.25 for BSD/OS, FreeBSD and LINUX.
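
For anyone who wants to try the periodic-logging idea from this thread, here is a rough sketch in Python. It is not something anyone in the thread ran. It assumes a single long-lived process is a better bet than a cron job, since the reports above suggest that already-running, in-core processes keep going while new ones fail to start. The function names and the log format are made up for illustration, and it only samples load averages. Whether the write itself would get through during a hang is exactly the open question raised above, so the timestamp of the last line is the useful part.

```python
import os
import time


def format_sample(now, load):
    """Render one log line: UTC timestamp plus 1, 5 and 15 minute load averages."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now))
    return "%s load=%.2f,%.2f,%.2f\n" % (stamp, load[0], load[1], load[2])


def append_sample(path, line):
    """Append one line and fsync it so it is on disk before the next sample."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        os.write(fd, line.encode("ascii"))
        os.fsync(fd)
    finally:
        os.close(fd)


def run(path, interval=60, count=None):
    """Sample every `interval` seconds; count=None keeps going until killed."""
    taken = 0
    while count is None or taken < count:
        append_sample(path, format_sample(time.time(), os.getloadavg()))
        taken += 1
        if count is None or taken < count:
            time.sleep(interval)
```

Calling `run("/var/tmp/hangwatch.log")` (the path is only an example) from a process started at boot leaves a trail whose final entry brackets the time of the hang. Swap usage or vmstat-style counters could be added to `format_sample` in the same way.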