From owner-freebsd-stable Mon Jun 10 11:08:07 2002
Message-ID: <000501c210a9$05e534e0$11fd2fd8@ADMIN00>
From: "Scot W. Hetzel"
To: "FreeBSD-Stable"
Subject: Run away MBUFS
Date: Mon, 10 Jun 2002 13:02:45 -0500

On Friday (6/7), we started experiencing a problem where the number of
mbufs in use increases continually until it hits the max, at which point
the system locks up. This same kernel was working fine until last Friday.
The kernel's config file sets maxusers to 0, so that it will use the
autosize feature.

I have tried increasing nmbclusters, using the sysctl
'kern.ipc.nmbclusters'. With it set to 16384 the system stayed up a
little while longer, but it still crashed. We now have it set to 60000,
and while the system hasn't crashed in the past 12 hours, netstat still
shows the mbufs being used up:

Sun Jun  9 23:39:46 CDT 2002
324/400/240000 mbufs in use (current/peak/max):
        324 mbufs allocated to data
170/218/60000 mbuf clusters in use (current/peak/max)
536 Kbytes allocated to network (0% of mb_map in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

Mon Jun 10 00:00:00 CDT 2002
5041/5184/240000 mbufs in use (current/peak/max):
        5041 mbufs allocated to data
217/300/60000 mbuf clusters in use (current/peak/max)
1896 Kbytes allocated to network (1% of mb_map in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

Mon Jun 10 12:00:00 CDT 2002
60155/60208/240000 mbufs in use (current/peak/max):
        60151 mbufs allocated to data
        4 mbufs allocated to packet headers
258/314/60000 mbuf clusters in use (current/peak/max)
15680 Kbytes allocated to network (8% of mb_map in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

Mon Jun 10 13:00:00 CDT 2002
68616/68688/240000 mbufs in use (current/peak/max):
        68616 mbufs allocated to data
241/314/60000 mbuf clusters in use (current/peak/max)
17800 Kbytes allocated to network (9% of mb_map in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

How can we track down what is using up these mbufs?

ns0# ps -ax
  PID  TT  STAT      TIME COMMAND
    0  ??  DLs    0:00.00  (swapper)
    1  ??  ILs    0:00.02 /sbin/init --
    2  ??  DL     0:00.08  (pagedaemon)
    3  ??  DL     0:00.00  (vmdaemon)
    4  ??  DL     0:00.37  (bufdaemon)
    5  ??  DL     0:03.03  (syncer)
    6  ??  DL     0:00.31  (vnlru)
   23  ??  Is     0:00.00 adjkerntz -i
   91  ??  Ss     0:02.42 /usr/sbin/syslogd -s
   94  ??  Ss     2:05.65 /usr/sbin/named
   96  ??  Ss     0:05.44 /usr/sbin/ntpd -p /var/run/ntpd.pid -c /etc/ntp.conf
   98  ??  Is     0:00.00 /usr/sbin/portmap
  103  ??  I      0:00.00 nfsiod -n 4
  104  ??  I      0:00.00 nfsiod -n 4
  105  ??  I      0:00.00 nfsiod -n 4
  106  ??  I      0:00.00 nfsiod -n 4
  110  ??  Is     0:00.36 rwhod
  116  ??  Is     0:00.00 /usr/sbin/inetd -wW
  118  ??  Is     0:00.41 /usr/sbin/cron
  120  ??  Is     0:00.41 /usr/sbin/sshd
  123  ??  Ss     0:04.22 sendmail: accepting connections (sendmail)
  126  ??  Is     0:00.08 sendmail: Queue runner@00:30:00 for /var/spool/clientmqueue (sendmail)
  848  ??  I      0:00.06 sendmail: server [202.120.80.1] cmd read (sendmail)
  849  ??  S      0:00.44 sshd: admin@ttyp0 (sshd)
  850  ??  I      0:00.07 sendmail: server [202.120.80.1] cmd read (sendmail)
  851  p0  Is     0:00.13 -csh (csh)
  855  p0  S      0:00.16 -su (csh)
  865  p0  R+     0:00.00 ps -ax
  180  v0  Is     0:00.08 login -p root
  700  v0  I+     0:00.14 -csh (csh)
  181  v1  Is+    0:00.02 /usr/libexec/getty Pc ttyv1
  182  v2  Is+    0:00.02 /usr/libexec/getty Pc ttyv2
  183  v3  Is+    0:00.02 /usr/libexec/getty Pc ttyv3
  184  v4  Is+    0:00.02 /usr/libexec/getty Pc ttyv4
  185  v5  Is+    0:00.02 /usr/libexec/getty Pc ttyv5
  186  v6  Is+    0:00.02 /usr/libexec/getty Pc ttyv6
  187  v7  Is+    0:00.02 /usr/libexec/getty Pc ttyv7
  179 con- S      0:12.18 /usr/local/sbin/amavis-milter -D -p local:/var/amavis/amavis-milter.sock

This system is acting as a router, using an fxp0 ethernet card and an ET
Inc ET/5025PQ QUAD Adapter (ET/HDLC Driver v3.21i).

We have also tried upgrading the kernel to 4.6-RC w/ET/HDLC Driver
v3.21k, but still have the same mbuf problem (NOTE: world was built, but
it wasn't installed). We have since downgraded the kernel back to the
original 4.5-Stable kernel (4.5-STABLE #8: Wed Apr 24 12:29:46 CDT 2002),
and increased the 'kern.ipc.nmbclusters' value.
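
For anyone who wants to put numbers on the leak, the timestamped
netstat -m samples above can be reduced to a growth rate. The following
is only a rough sketch: the function names (parse_samples,
growth_per_hour, hours_until_limit) are made up for illustration, the
parser only understands the "date line followed by an mbufs-in-use line"
layout shown in this message, and the time-to-limit figure assumes the
growth stays linear, which the samples do not prove.

```python
import re
from datetime import datetime

# The date and "mbufs in use" lines copied from the samples in this message.
SAMPLES = """\
Sun Jun 9 23:39:46 CDT 2002
324/400/240000 mbufs in use (current/peak/max):
Mon Jun 10 00:00:00 CDT 2002
5041/5184/240000 mbufs in use (current/peak/max):
Mon Jun 10 12:00:00 CDT 2002
60155/60208/240000 mbufs in use (current/peak/max):
Mon Jun 10 13:00:00 CDT 2002
68616/68688/240000 mbufs in use (current/peak/max):
"""

# date(1)-style line, e.g. "Mon Jun 10 13:00:00 CDT 2002".
DATE_RE = re.compile(r"^\w{3} (\w{3}) +(\d+) (\d\d:\d\d:\d\d) \w+ (\d{4})$")
# netstat -m line, e.g. "68616/68688/240000 mbufs in use (current/peak/max):".
MBUF_RE = re.compile(r"^(\d+)/(\d+)/(\d+) mbufs in use")


def parse_samples(text):
    """Return a list of (timestamp, mbufs_in_use, mbuf_max) tuples."""
    samples = []
    when = None
    for raw in text.splitlines():
        line = raw.strip()
        m = DATE_RE.match(line)
        if m:
            month, day, clock, year = m.groups()
            # The time zone name is ignored; all samples share one zone.
            when = datetime.strptime(
                f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S"
            )
            continue
        m = MBUF_RE.match(line)
        if m and when is not None:
            current, _peak, limit = (int(g) for g in m.groups())
            samples.append((when, current, limit))
    return samples


def growth_per_hour(earlier, later):
    """Average change in mbufs in use per hour between two samples."""
    hours = (later[0] - earlier[0]).total_seconds() / 3600.0
    return (later[1] - earlier[1]) / hours


def hours_until_limit(sample, rate):
    """Hours until mbufs in use reaches max, ASSUMING a constant rate."""
    return (sample[2] - sample[1]) / rate


if __name__ == "__main__":
    samples = parse_samples(SAMPLES)
    for earlier, later in zip(samples, samples[1:]):
        rate = growth_per_hour(earlier, later)
        print(f"{earlier[0]} -> {later[0]}: {rate:.0f} mbufs/hour")
    last_rate = growth_per_hour(samples[-2], samples[-1])
    print(f"hours to max at last rate: "
          f"{hours_until_limit(samples[-1], last_rate):.1f}")
```

Between the 12:00 and 13:00 samples this works out to 8461 mbufs per
hour. Comparing the per-interval rates against traffic on each interface
may help narrow down which path is holding the mbufs.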

We are running a GENERIC kernel, with the following changes:

ns0# cvs diff GENERIC
Index: GENERIC
===================================================================
RCS file: /home/ncvs/src/sys/i386/conf/GENERIC,v
retrieving revision 1.246.2.43
diff -r1.246.2.43 GENERIC
58a59,75
> options         IPFIREWALL              #firewall
> options         IPFIREWALL_VERBOSE      #enable logging to syslogd(8)
> options         IPFIREWALL_FORWARD      #enable transparent proxy support
> options         IPFIREWALL_VERBOSE_LIMIT=100    #limit verbosity
> #options        IPFIREWALL_DEFAULT_TO_ACCEPT    #allow everything by default
> options         IPV6FIREWALL            #firewall for IPv6
> options         IPV6FIREWALL_VERBOSE
> options         IPV6FIREWALL_VERBOSE_LIMIT=100
> #options        IPV6FIREWALL_DEFAULT_TO_ACCEPT
>
> # RANDOM_IP_ID causes the ID field in IP packets to be randomized
> # instead of incremented by 1 with each packet generated. This
> # option closes a minor information leak which allows remote
> # observers to determine the rate of packet generation on the
> # machine by watching the counter.
> options         RANDOM_IP_ID
>
153c170
< device          apm0    at nexus? disable flags 0x20 # Advanced Power Management
---
> #device         apm0    at nexus? disable flags 0x20 # Advanced Power Management
165a183,187
> options         BREAK_TO_DEBUGGER       # a BREAK on a comconsole goes to
>                                         # DDB, if available.
> options         CONSPEED=115200         # speed for serial console
>                                         # (default 9600)
>
179a202,205
>
> # ETinc
> device          eth0
> #device         bw0 at isa ?

Scot

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message