Date:      Mon, 5 Feb 2001 18:31:21 -0800 (PST)
From:      Matt Dillon <dillon@earth.backplane.com>
To:        Dan Phoenix <dphoenix@bravenet.com>
Cc:        Alfred Perlstein <bright@wintelcom.net>, Jos Backus <josb@cncdsl.com>, freebsd-hackers@FreeBSD.ORG
Subject:   Re: qmail IO problems
Message-ID:  <200102060231.f162VL557466@earth.backplane.com>
References:   <Pine.BSO.4.21.0102051746450.18264-200000@gandalf.bravenet.com>

:ok of those commands some interesting info was from dmesg...
:on one machine i had 
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:
:from dmesg
:
:on the other machine
:looutput: mbuf allocation failed
:nfs server 172.16.0.101:/bravenet1/home: not responding
:looutput: mbuf allocation failed
:looutput: mbuf allocation failed
:looutput: mbuf allocation failed
:nfs server 172.16.0.101:/bravenet1/home: is alive again
:looutput: mbuf allocation failed
:looutput: mbuf allocation failed
:
:I doubt that the mbuf allocation failures came from that one NFS server
:timeout, but I cannot be certain.  Does this help you at all?

    This sheds a considerable amount of light on the problems...
    methinks you may have a low 'maxusers' setting in the kernel
    config.  Read on.

    I still need the complete 'dmesg' output, or if it all scrolled off
    due to the above errors, cat the '/var/run/dmesg.boot' file.

    You had systat -vm 1 output in the earlier emails, but not
    'vmstat 1' output for 20 seconds.  That isn't as big a deal with
    all the other info we have now, but it would still be useful.


:[root@arwen qmail-1.03]# pstat -s
:Device          1K-blocks     Used    Avail Capacity  Type
:/dev/ad0s1b       1048448        0  1048448     0%    Interleaved
:[root@arwen qmail-1.03]# 
:
:[root@elrond dphoenix]# pstat -s
:Device          1K-blocks     Used    Avail Capacity  Type
:/dev/ad0s1b        528696     2032   526664     0%    Interleaved
:[root@elrond dphoenix]# 

    This indicates that you are not swapping or paging significantly,
    which is good.  We can cross that off the list of possible problems.
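
    The same check can be done mechanically.  The sketch below sums the
    1K-blocks and Used columns of 'pstat -s' output read from stdin and
    prints the percentage of swap in use.  The function name
    swap_used_pct is made up for illustration, and the column layout is
    assumed to match the output quoted above.

```shell
# Print the percentage of swap in use, given 'pstat -s' output on stdin.
# Assumes the layout quoted above: a header line, then one line per
# device with total 1K-blocks in column 2 and used blocks in column 3.
swap_used_pct() {
    awk 'NR > 1 && $2 > 0 { total += $2; used += $3 }
         END {
             if (total > 0) printf "%.2f\n", 100 * used / total
             else print "n/a"
         }'
}
```

    Usage would be 'pstat -s | swap_used_pct'.  A value near zero, as on
    both machines here, means swap is not the bottleneck.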

:ps axlww
:
:included is ps.txt.....
:Only 2 perl scripts are running on that machine at the moment, yet the
:qmail queue keeps getting larger.  It seems to be getting a bit better,
:but not that great either.
:
: (ps output not included in reply)

    The ps output indicates that you are running a relatively light process
    load.  The prime suspects are thus the file table and mbuf errors.
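
    To see how often each of the two errors shows up, the kernel
    messages can be saved to a file (for example with 'dmesg > file')
    and counted.  This is only a sketch: the function name
    count_kernel_errors is hypothetical, and the match strings are
    taken from the messages quoted earlier in this thread.

```shell
# Count the two kernel complaints discussed in this thread in a saved
# message log.  $1 is the path to the saved log (e.g. from 'dmesg > file').
count_kernel_errors() {
    log="$1"
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so '|| true' keeps the function usable under 'set -e'.
    full=$(grep -c 'file: table is full' "$log" || true)
    mbuf=$(grep -c 'mbuf allocation failed' "$log" || true)
    echo "file_table_full=$full mbuf_alloc_failed=$mbuf"
}
```

    Running it again after a kernel change gives a quick before/after
    comparison.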

    These errors normally occur when the 'maxusers' setting in the
    kernel config is much too low.  Since you didn't provide the
    complete dmesg output (cat /var/run/dmesg.boot), I can't tell for
    sure, but I am guessing that you are either using the GENERIC
    kernel directly, or you created a custom kernel but didn't tune
    the 'maxusers' entry.
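
    A rough sketch of why one knob can produce both errors: the kernel
    sizes several tables from maxusers at build time.  The formulas
    below are an assumption about the 4.x-era defaults (maxproc =
    20 + 16 * maxusers, maxfiles = 2 * maxproc, mbuf clusters =
    512 + 16 * maxusers) and should be checked against the sources for
    your release.  The function name derived_limits is made up.

```shell
# Print the table sizes derived from a given maxusers value.
# The formulas are ASSUMED 4.x-era defaults, not taken from this
# thread; verify them against your own kernel sources.
derived_limits() {
    maxusers="$1"
    maxproc=$((20 + 16 * maxusers))
    maxfiles=$((2 * maxproc))
    nmbclusters=$((512 + 16 * maxusers))
    echo "maxusers=$maxusers maxproc=$maxproc maxfiles=$maxfiles nmbclusters=$nmbclusters"
}
```

    Under those assumptions, 'derived_limits 32' gives roughly a
    thousand file table entries and mbuf clusters, while
    'derived_limits 256' gives several thousand of each.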

    For a machine doing the work this machine is doing, I recommend
    a maxusers setting of 256 in the kernel config.  Changing it means
    rebuilding your kernel.  Have you ever built a kernel before?
    I think all you may need to do is up 'maxusers' and perhaps mess
    around with the number of mbuf clusters, but I suspect increasing
    maxusers alone will do the trick.
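
    For reference, the relevant lines of a custom kernel config might
    look something like the fragment below.  Only the maxusers value
    comes from this message; the NMBCLUSTERS number is a placeholder
    for illustration, and the option name should be checked against
    the LINT file for your release.

```
# Fragment of a custom kernel config file; everything else stays
# as it is in the config you are running now.
maxusers        256

# Optional, and only worth trying if mbuf allocation failures
# persist after the maxusers change.  Placeholder value, untested.
options         NMBCLUSTERS=8192
```

    The kernel then has to be rebuilt and installed, and the machine
    rebooted, before the new limits take effect.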

    Also, to make sure... you haven't tweaked any other sysctls, have
    you?

						-Matt


