From owner-freebsd-hackers Mon Feb 5 18:31:40 2001
Delivered-To: freebsd-hackers@freebsd.org
Received: from earth.backplane.com (earth-nat-cw.backplane.com [208.161.114.67])
	by hub.freebsd.org (Postfix) with ESMTP id 50C6437B67D
	for ; Mon, 5 Feb 2001 18:31:22 -0800 (PST)
Received: (from dillon@localhost)
	by earth.backplane.com (8.11.1/8.9.3) id f162VL557466;
	Mon, 5 Feb 2001 18:31:21 -0800 (PST)
	(envelope-from dillon)
Date: Mon, 5 Feb 2001 18:31:21 -0800 (PST)
From: Matt Dillon
Message-Id: <200102060231.f162VL557466@earth.backplane.com>
To: Dan Phoenix
Cc: Alfred Perlstein, Jos Backus, freebsd-hackers@FreeBSD.ORG
Subject: Re: qmail IO problems
References:
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

:ok of those commands some interesting info was from dmesg...
:on one machine i had
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:file: table is full
:
:from dmesg
:
:on the other machine
:looutput: mbuf allocation failed
:nfs server 172.16.0.101:/bravenet1/home: not responding
:looutput: mbuf allocation failed
:looutput: mbuf allocation failed
:looutput: mbuf allocation failed
:nfs server 172.16.0.101:/bravenet1/home: is alive again
:looutput: mbuf allocation failed
:looutput: mbuf allocation failed
:
:i doubt that mbuf allocation failed was from the nfs server timeout that
:one time....but cannot be certain......this help you at all?

This sheds a considerable amount of light on the problems... methinks
you may have a low 'maxusers' setting in the kernel config. Read on.

I still need the complete 'dmesg' output, or if it all scrolled off due
to the above errors, cat the '/var/run/dmesg.boot' file.

You had systat -vm 1 output in the earlier emails, but not 'vmstat 1'
output for 20 seconds.
That isn't as big a deal with all the other info we have now, but
still useful.

:[root@arwen qmail-1.03]# pstat -s
:Device          1K-blocks     Used    Avail Capacity  Type
:/dev/ad0s1b       1048448        0  1048448     0%    Interleaved
:[root@arwen qmail-1.03]#
:
:[root@elrond dphoenix]# pstat -s
:Device          1K-blocks     Used    Avail Capacity  Type
:/dev/ad0s1b        528696     2032   526664     0%    Interleaved
:[root@elrond dphoenix]#

This indicates that you are not swapping or paging significantly, which
is good. We can cross that off the list of possible problems.

:ps axlww
:
:included is ps.txt.....
:2 perl scripts running only on that machine at moment yet qmail queue keep
:getting larger....seems to be getting abit better but not that great
:either.
:
: (ps output not included in reply)

The ps output indicates that you are running a relatively light process
load.

The prime suspects are thus the file table and mbuf errors. These errors
normally occur when you configure a much too low 'maxusers' setting in
the kernel config. Since you didn't provide the complete dmesg output
(cat /var/run/dmesg.boot), I can't tell but I am guessing that you are
either using the GENERIC kernel directly, or you created a custom kernel
but didn't tune the 'maxusers' entry.

For a machine doing the work this machine is doing, I recommend a
maxusers setting in the kernel config of 256. You need to rebuild your
kernel in that case. Have you ever built a kernel before?

I think all you may need to do is up 'maxusers' in the kernel config
and perhaps mess around with the number of mbuf clusters, but I suspect
increasing maxusers will do the trick. These changes require recompiling
the kernel.

Also, to make sure... you haven't tweaked any other sysctl's, have you?

					-Matt

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
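
For reference, a rough sketch of why a low 'maxusers' can produce both
"file: table is full" and "mbuf allocation failed". It assumes the
default sizing formulas as they are commonly described for 4.x-era
kernels (sys/conf/param.c), which may differ on other releases or when
MAXFILES or NMBCLUSTERS is set explicitly in the kernel config. The
value 32 for a GENERIC kernel is also an assumption, and the function
names are illustrative only.

```python
# Sketch only: default kernel table sizes derived from 'maxusers'.
# Formulas are ASSUMED from 4.x-era param.c; check your own source tree.

def nproc(maxusers):
    # Assumed default: NPROC = 20 + 16 * MAXUSERS
    return 20 + 16 * maxusers

def maxfiles(maxusers):
    # Assumed default: MAXFILES = 2 * NPROC (system-wide open file limit)
    return 2 * nproc(maxusers)

def nmbclusters(maxusers):
    # Assumed default: NMBCLUSTERS = 512 + 16 * MAXUSERS
    return 512 + 16 * maxusers

# 32 is an assumed GENERIC value; 256 is the setting recommended above.
for mu in (32, 256):
    print("maxusers=%d maxfiles=%d nmbclusters=%d"
          % (mu, maxfiles(mu), nmbclusters(mu)))
```

Under those assumptions, going from 32 to 256 raises the file table
limit from roughly a thousand entries to roughly eight thousand, and
the mbuf cluster count from 1024 to 4608. The limits actually in effect
on a running machine can be checked with 'sysctl kern.maxfiles' and
'netstat -m'.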