From owner-freebsd-questions Thu Oct 31 05:59:51 1996
Return-Path: owner-questions
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id FAA06139 for questions-outgoing; Thu, 31 Oct 1996 05:59:51 -0800 (PST)
Received: from nora.pcug.co.uk (Nora.PCUG.CO.UK [192.68.174.71]) by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id FAA06129; Thu, 31 Oct 1996 05:59:48 -0800 (PST)
Received: from imdb.demon.co.uk by nora.pcug.co.uk id aa24980; 31 Oct 96 13:55 GMT
Message-Id: <199610311353.NAA02231>
Subject: Re: server death when swap space is all gone.
To: freebsd-questions@freebsd.org
Date: Thu, 31 Oct 1996 13:53:52 +0000 (GMT)
Cc: gpalmer@freebsd.org
In-Reply-To: <29680.846740897@orion.webspan.net> from "Gary Palmer" at Oct 31, 96 00:48:17 am
From: Rob Hartill
Organization: Internet Movie Database
Reply-To: robh@imdb.com
X-pgp-public-key: http://us.imdb.com/pgp.html
X-Mailer: ELM [version 2.4 PL24 ME8a]
Content-Type: text
Sender: owner-questions@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Gary Palmer wrote:
>Rob Hartill wrote in message ID
><199610281957.TAA10074>:
>>
>> A couple of time now I've seen Freebsd (2.1.0 and 2.1.5-STABLE) collapse
>> into a smouldering mess after user processes consume all available swap space.
>>
>> A web server went belly up last night because of this.
>> Why can't the OS recover from this ?. The memory hungry processes die
>> off eventually, but instead the machine locks up and needs to be rebooted.
>
>I'm curious to hear this ... I often run my workstation out of memory
>(too conservative on swap allocation) and NEVER have a lockup
>problem. Same with the news box, which sometimes runs out of memory
>for some strange reason.

I had some httpd processes that grew and grew and grew. The remote machine
locked up for over 6 hours (nobody there to attend to it). In that time no
cron jobs ran.
I had a cron job that was supposed to ping other local machines to check
for problems and reboot if it lost all contact with the outside world for
too long.

It's a little worrying that a lockup like this could be triggered by
malicious users who grab all the swap space in an attempt to take down
the machine.

rob
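
For readers wondering what a watchdog of the kind described above might
look like, here is a minimal sketch in Python. It is an illustration only,
not the script from this message: the peer addresses, the outage threshold,
the state file path and all function names are hypothetical. It assumes a
`ping` that accepts `-c 1` and a `shutdown` that accepts `-r now`.

```python
import subprocess
import time

# Hypothetical values, chosen for illustration only.
PEERS = ["192.0.2.1", "192.0.2.2"]        # local machines to ping
MAX_OUTAGE_SECONDS = 15 * 60              # how long "too long" is
STATE_FILE = "/var/run/watchdog.last_ok"  # time of last successful contact


def host_alive(host):
    """Return True if a single ping to host gets a reply."""
    result = subprocess.run(
        ["ping", "-c", "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def should_reboot(any_peer_alive, last_ok, now, max_outage=MAX_OUTAGE_SECONDS):
    """Return (reboot, new_last_ok) for one watchdog run.

    Contact with any peer resets the clock. With no contact, reboot only
    once the outage has lasted longer than max_outage seconds.
    """
    if any_peer_alive:
        return False, now
    return (now - last_ok) > max_outage, last_ok


def read_last_ok(path, default):
    """Read the last-contact timestamp, or default if it is unavailable."""
    try:
        with open(path) as f:
            return float(f.read().strip())
    except (OSError, ValueError):
        return default


def main():
    """One watchdog pass; meant to be invoked periodically from cron."""
    now = time.time()
    last_ok = read_last_ok(STATE_FILE, now)
    alive = any(host_alive(h) for h in PEERS)
    reboot, last_ok = should_reboot(alive, last_ok, now)
    with open(STATE_FILE, "w") as f:
        f.write(str(last_ok))
    if reboot:
        subprocess.run(["shutdown", "-r", "now"])
```

A crontab entry would run a script that calls `main()` every few minutes.
Note that this kind of watchdog only helps while cron is still able to
start jobs, which was not the case during the lockup described above.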