Date: Mon, 5 Feb 2001 18:31:21 -0800 (PST)
From: Matt Dillon <dillon@earth.backplane.com>
To: Dan Phoenix <dphoenix@bravenet.com>
Cc: Alfred Perlstein <bright@wintelcom.net>, Jos Backus <josb@cncdsl.com>, freebsd-hackers@FreeBSD.ORG
Subject: Re: qmail IO problems
Message-ID: <200102060231.f162VL557466@earth.backplane.com>
References: <Pine.BSO.4.21.0102051746450.18264-200000@gandalf.bravenet.com>
:ok of those commands some interesting info was from dmesg...
:on one machine i had
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:
:from dmesg
:
:on the other machine
:looutput: mbuf allocation failed
:nfs server 172.16.0.101:/bravenet1/home: not responding
:looutput: mbuf allocation failed
:looutput: mbuf allocation failed
:looutput: mbuf allocation failed
:nfs server 172.16.0.101:/bravenet1/home: is alive again
:looutput: mbuf allocation failed
:looutput: mbuf allocation failed
:
:i doubt that mbuf allocation failed was from the nfs server timeout that
:one time....but cannot be certain......this help you at all?
This sheds a considerable amount of light on the problem.
Methinks you may have a low 'maxusers' setting in the kernel
config. Read on.
I still need the complete 'dmesg' output, or if it all scrolled off
due to the above errors, cat the '/var/run/dmesg.boot' file.
You had systat -vm 1 output in the earlier emails, but not
'vmstat 1' output for 20 seconds. That isn't as big a deal with
all the other info we have now, but still useful.
:[root@arwen qmail-1.03]# pstat -s
:Device 1K-blocks Used Avail Capacity Type
:/dev/ad0s1b 1048448 0 1048448 0% Interleaved
:[root@arwen qmail-1.03]#
:
:[root@elrond dphoenix]# pstat -s
:Device 1K-blocks Used Avail Capacity Type
:/dev/ad0s1b 528696 2032 526664 0% Interleaved
:[root@elrond dphoenix]#
This indicates that you are not swapping or paging significantly,
which is good. We can cross that off the list of possible problems.
:ps axlww
:
:included is ps.txt.....
:2 perl scripts running only on that machine at moment yet qmail queue keep
:getting larger....seems to be getting abit better but not that great
:either.
:
: (ps output not included in reply)
The ps output indicates that you are running a relatively light process
load. The prime suspects are thus the file table and mbuf errors.
These errors normally occur when 'maxusers' is set much too low
in the kernel config. Since you didn't provide
the complete dmesg output (cat /var/run/dmesg.boot), I can't tell,
but I am guessing that you are either using the GENERIC kernel
directly, or you created a custom kernel but didn't tune the
'maxusers' entry.
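
The file table and mbuf usage behind those two errors can also be looked at on the running system. What follows is only a minimal sketch, assuming a FreeBSD 4.x-era userland; check_limits is a hypothetical helper name, and each command simply reports that it is unavailable if the system does not have it:

```sh
#!/bin/sh
# check_limits is a hypothetical helper; the commands are FreeBSD-specific.
check_limits() {
    # Configured system-wide ceiling on open files.
    sysctl kern.maxfiles 2>/dev/null || echo "kern.maxfiles: not available here"
    # Used/total slots in the system tables, including the file table.
    pstat -T 2>/dev/null || echo "pstat -T: not available here"
    # mbuf and mbuf cluster usage.
    netstat -m 2>/dev/null || echo "netstat -m: not available here"
}

check_limits
```

If the file count reported by pstat -T sits close to kern.maxfiles, or netstat -m shows the mbuf clusters near their maximum, that would be consistent with the "file: table is full" and "mbuf allocation failed" messages.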
For a machine doing this kind of work, I recommend setting
maxusers to 256 in the kernel config. That means rebuilding
your kernel. Have you ever built a kernel before?
I think all you may need to do is raise 'maxusers' in the kernel
config and perhaps adjust the number of mbuf clusters,
but I suspect increasing maxusers alone will do the trick. Both
changes require recompiling the kernel.
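
To illustrate why maxusers matters for both errors: several kernel table sizes are derived from it at build time. The sketch below assumes the default formulas from the FreeBSD 4.x-era sys/conf/param.c as recalled (NPROC = 20 + 16 * maxusers, MAXFILES = 2 * NPROC, NMBCLUSTERS = 512 + 16 * maxusers); check them against your own source tree. derived_limits is a hypothetical helper name, and 32 is just an example of a low setting:

```sh
#!/bin/sh
# derived_limits is a hypothetical helper. The formulas are assumed
# FreeBSD 4.x-era defaults and may differ in your source tree.
derived_limits() {
    maxusers=$1
    nproc=$((20 + 16 * maxusers))          # process table size
    maxfiles=$((nproc * 2))                # system-wide open file limit
    nmbclusters=$((512 + 16 * maxusers))   # mbuf clusters
    echo "maxusers=$maxusers maxproc=$nproc maxfiles=$maxfiles nmbclusters=$nmbclusters"
}

derived_limits 32
derived_limits 256
```

Under those assumptions, going from 32 to 256 raises the open file limit from 1064 to 8232 and the mbuf cluster count from 1024 to 4608, which is why one knob can plausibly address both the file table and the mbuf failures.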
Also, to make sure... you haven't tweaked any other sysctls, have
you?
-Matt
