Date:      Sun, 18 Feb 1996 13:36:35 -0500
From:      dennis@etinc.com (dennis)
To:        hackers@freebsd.org
Subject:   Re: Web server locks up... but not quite. (?)
Message-ID:  <199602181836.NAA07941@etinc.com>

>>     This sort of thing has happened before with other 2.1.0-R machines
>> here, but tonight was the first time I was able to get to the console
>> of one before someone else rebooted it.
>> 
>>     Our web server is a P90 with 64 megabytes of RAM, running Apache
>> 1.0.2.  For no discernable reason, it stopped working tonight.
>> "Stopped working" in that no TCP services were available, NFS clients
>> that mounted a filesystem served from it hung in disk wait and no
>> rwhod packets were being broadcast.
>> 
>>     You could telnet to various ports on it (indicating that inetd was
>> still bound to those ports), but none of the services normally
>> attached to those ports would run, including internal ones like
>> chargen or daytime (indicating that inetd was blocked in some way).
>> It wasn't fielding RPC requests either.  The login prompt was still
>> displayed on all the virtual consoles (I was still able to switch
>> between them), but there was no response from the keyboard, as if the
>> getty's had died off.  The only sign of life was that it was returning
>> pings from another machine.
>> 
>>     There were no telltale messages on the console, nor in the syslog.
>> This server gets 250,000 to 300,000 hits per day.  While it is
>> running, it does not appear to be under any excessive load.  There are
>> typically 40 to 60 httpd's running.  It exports a 4-gigabyte
>> filesystem containing access logs to client machines so our customers
>> can produce statistical reports.  It also mounts 26 gigabytes of home
>> directories from a central NFS server.
>> 
>>     Since there is no indication as to the source of the hang, is
>> there anything I can run periodically from cron to help track down the
>> problem?  I can start tracking load averages, swap space usage, the
>> output of vmstat, netstat, iostat and nfsstat if that will help.  Any
>> suggestions?
>
>I've seen similar hangs occasionally under both 2.0.5R and 2.1.0R and one
>additional "thing" I've noticed is that processes that are completely
>in-core appear to keep running (i.e. I had a "vmstat 1" running for a few
>weeks and when the box I am thinking of locked up, the vmstat 1 was still
>scrolling output, the box was ping-able, but any services that were not
>entirely in-core or required other disk accesses were not available).
>There is something to the "in-core" business because I have seen the same
>box both continue to broadcast rwho and NOT broadcast rwho, presumably
>determined by whether or not it was in-core..

The more I read about this, the more I think it's gotta be memory
allocation failures...no new processes, but old ones and kernel
stuff keep on ticking...is there a logging function for these, or
would logging attempts fail as well?
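
One way to probe that question is sketched below. It is only a
sketch, not a known FreeBSD facility: a long-running logger that opens
its log file once at startup and keeps the descriptor, so that a later
failure to start new processes gets written down instead of silently
lost. The function names (collect, log_snapshot, run_forever), the
record layout, and the 60-second interval are all made up for
illustration. Whether the write itself still reaches the disk during
one of these hangs is exactly the open question.

```python
import os
import subprocess
import time


def collect(cmd):
    """Run one command; return its stdout, or a note if it could not run."""
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
        return done.stdout
    except (OSError, subprocess.SubprocessError) as exc:
        # Failing to fork/exec (or timing out) is itself the evidence
        # we are after, so record it rather than giving up.
        return "FAILED to run %s: %s\n" % (" ".join(cmd), exc)


def log_snapshot(fd, commands, now=None):
    """Append one timestamped record to an already-open descriptor."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    parts = ["=== %s ===\n" % stamp]
    for cmd in commands:
        parts.append("--- %s\n" % " ".join(cmd))
        parts.append(collect(cmd))
    record = "".join(parts)
    os.write(fd, record.encode())
    os.fsync(fd)  # ask for the record to be on disk before the next hang
    return record


def run_forever(path, commands, interval=60):
    """Open the log once, then take a snapshot every `interval` seconds."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    while True:
        log_snapshot(fd, commands)
        time.sleep(interval)
```

Something like run_forever("/var/log/hangwatch.log", [["vmstat"],
["netstat", "-m"]]) would be the intended use; the log path is a
placeholder. If the last record before a hang shows the commands
failing to start while the timestamps keep advancing, that would
point toward the allocation-failure theory.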

dennis
----------------------------------------------------------------------------
Emerging Technologies, Inc.      http://www.etinc.com

Synchronous PC Cards and Routers For Discriminating
Tastes. 56k to T1 and beyond. Frame Relay, PPP, HDLC, 
and X.25 for BSD/OS, FreeBSD and LINUX.




