From owner-freebsd-hackers Thu Jun 15 06:12:35 1995
Return-Path: hackers-owner
Received: (from majordom@localhost) by freefall.cdrom.com (8.6.10/8.6.6) id GAA14338 for hackers-outgoing; Thu, 15 Jun 1995 06:12:35 -0700
Received: from aries.ibms.sinica.edu.tw ([140.109.40.248]) by freefall.cdrom.com (8.6.10/8.6.6) with ESMTP id GAA14317 for ; Thu, 15 Jun 1995 06:12:20 -0700
Received: (from taob@localhost) by aries.ibms.sinica.edu.tw (8.6.11/8.6.9) id VAA04453; Thu, 15 Jun 1995 21:12:13 +0800
Date: Thu, 15 Jun 1995 21:12:12 +0800 (CST)
From: Brian Tao
To: FREEBSD-HACKERS-L
Subject: Too many open files in system
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: hackers-owner@freebsd.org
Precedence: bulk

    Played around with running various CGIs on the NCSA httpd 1.4
server on one of my FreeBSD 2.0.5 machines this past week.  Test
conditions:  50 clients on the local Ethernet making random requests
for HTML and CGI scripts, with a delay thrown in to simulate slow,
lagged connections.

    Good news:  the server was handling 15+ requests per second.  Bad
news:  the machine would lock up not more than 45 minutes after the
pounding began.  :(  John Dyson suggested it might have been an
NFS-related problem, but performing the tests both with an NFS-mounted
htdocs/ and a local htdocs/ directory made little difference.

    Everything runs fine (if slowly) for the first little while, then
all of a sudden, almost all disk activity stops.  Existing processes
still run (e.g., I can continue to read mail, or switch screens in
iscreen) but new ones will not start (e.g., I cannot get a login
prompt when telnetting in).  I believe it may also involve the pager,
since swapped out processes are not swapped back in (e.g., quitting
the mail reader, but not getting the shell prompt back).

    The machine is still *running*, but practically useless since it
appears the VM system has pretty much locked up.  Makes it rather
difficult to find more details on the problem.  :(  The only thing I
can do is reboot.  During one of the trials, syslog was going nuts
logging this to disk:

Jun 14 12:53:15 aries syslogd: /var/run/utmp: Too many open files in system
Jun 14 12:53:15 aries last message repeated 3 times
Jun 14 12:53:15 aries /kernel: file: table is full
Jun 14 12:53:15 aries syslogd: /var/run/utmp: Too many open files in system
Jun 14 12:53:15 aries last message repeated 3 times
Jun 14 12:53:15 aries /kernel: file: table is full
[...repeat 18-20 times per second...]

    In all cases, the common problem is "too many open files in
system".  The httpd error log shows CGI scripts failing for the same
reason.  Also, I see this in the error log:

[Wed Jun 14 12:51:19 1995] httpd: could not create IPC pipe
[Wed Jun 14 12:52:51 1995] socket error: accept failed

    The first is produced while the server can still run, and the
second appears to occur after everything has died, and is repeated in
the log file at a rate of 200+ per second!!!  This is in pre-forking
mode, if it makes any difference.

    My kernel is compiled with the following options:

options         "NMBCLUSTERS=1024"
options         "CHILD_MAX=128"
options         "OPEN_MAX=256"     <-- does this help?

    The one time I was able to get an "fstat | wc -l" to work, it
showed 1150 files open.  This is with no other users logged on, and X
was not running (essentially in dedicated Web server mode).
-- 
Brian ("Though this be madness, yet there is method in't") Tao
taob@gate.sinica.edu.tw <-- work ........ play --> taob@io.org
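
A note on the error text quoted above: "Too many open files in system" is the standard message for errno ENFILE (the system-wide file table is full, which matches the kernel's "file: table is full" line), whereas a process hitting its own descriptor limit gets EMFILE, "Too many open files". OPEN_MAX is normally the per-process limit, so raising it would probably not address ENFILE; that reading is an assumption and has not been checked against the 2.0.5 sources. The sketch below, in Python purely for illustration, shows how the two cases can be told apart. The function name classify_fd_error is hypothetical.

```python
import errno
import os


def classify_fd_error(err_no: int) -> str:
    """Say which descriptor limit, if any, an errno value points at."""
    if err_no == errno.ENFILE:
        # The kernel's global open-file table is exhausted.
        return "system-wide file table full"
    if err_no == errno.EMFILE:
        # This one process has used up its own descriptor allowance.
        return "per-process descriptor limit reached"
    return "unrelated to descriptor limits"


if __name__ == "__main__":
    # Print the symbolic name, the platform's message, and our classification.
    for code in (errno.ENFILE, errno.EMFILE):
        print(errno.errorcode[code], "-", os.strerror(code),
              "->", classify_fd_error(code))
```

If the syslog and httpd messages above are indeed ENFILE, the place to look would be the system-wide limit rather than OPEN_MAX.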