Date: Tue, 10 Nov 1998 11:58:19 -0800 (PST)
From: Matthew Dillon
Message-Id: <199811101958.LAA14084@apollo.backplane.com>
To: Steven Yang
Cc: "'dg@root.com'", Mike Smith, Steven Yang, "'Open Systems Networking'", "'freebsd-hackers@freebsd.org'"
Subject: Re: RE: FW: Can't get rid of my mbufs.
References: <839A86AB6CE4D111A52200104B938D430B066B@MOE>
Sender: owner-freebsd-hackers@FreeBSD.ORG

:Refresher: I started this thread two weeks ago and haven't had time to
:reply until now. I'm using FreeBSD 2.2.5 (I previously stated 2.2.6,
:but I was wrong) with Apache 1.2.4 using FastCGI. A typical server
:response is about 22K of text, and under heavy load (20+
:requests/second), my mbufs (as seen through netstat -m) keep increasing
:until the server reboots itself (when I have > ~10000 mbufs). It appears
:that all of the requests are getting valid replies (we check the
:returned web page for a string), even at loads around 100
:requests/second. We do not do reverse-DNS. My original question and
:one of the replies are attached at the bottom of this email.
:
:The higher the load, the faster the mbufs increase. Under low load, the
:problem does not arise and new mbufs are not allocated. I was requested
:to give you guys the output of "netstat -n", as shown below. Big
:questions: do I have an mbuf leak?
:Is it possibly my fault? Could it
:be my version of FastCGI? Will upgrading my OS to 2.2.7 solve the
:problem? Will upgrading Apache solve the problem?

    There are three issues that I can think of. I'm not sure 2.2.5 has
    the sysctls to fix them (I think it does), but 2.2.7 certainly does.

    The mbufs are almost certainly tied up by stale connections that
    aren't going away. This typically occurs because Apache has not
    turned on TCP keepalives. You can fix this by turning on keepalives
    and reducing the keepalive idle test interval:

        sysctl -w net.inet.tcp.keepidle=1800
        sysctl -w net.inet.tcp.keepintvl=150
        sysctl -w net.inet.tcp.keepinit=150
        sysctl -w net.inet.tcp.always_keepalive=1

    NOTE: you must restart the web server after making these changes so
    it picks up the new defaults.

    The second issue could be that your default TCP window sizes are too
    large. The defaults are actually reasonable (16K):

        sysctl -a | fgrep tcp     (look for tcp.sendspace, tcp.recvspace)

    Check to make sure that Apache is not overriding the default window
    size with something huge.

    I don't understand why your netstat shows so few connections. Are
    you sure Apache was running at 20 hits/sec at the time you ran the
    netstat?

        netstat -tn | fgrep tcp

    The third issue is the number of allocated protocol control blocks
    in the netstat below. It looks like a kernel bug to me, but not one
    I've ever seen before. I would immediately upgrade the machine to
    2.2.7. If this is your problem, I'll bet 2.2.7 will fix it.

    Also do a 'ps ax' and look for hung CGIs, and try killing the server
    entirely (and anything else that was run from the server) and see if
    the space gets reclaimed. I'm thinking pipes, possibly, but dunno if
    pipes use network mbufs.
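
    The window-size check above can be scripted. What follows is only a
    rough sketch, not something from the original message: the function
    name check_tcp_space is made up for illustration, and it assumes
    sysctl prints one "name: value" (or "name = value") pair per line.
    It reads that output on stdin and flags any tcp.sendspace or
    tcp.recvspace value above a limit, defaulting to the 16K figure
    mentioned above.

```shell
#!/bin/sh
# check_tcp_space (hypothetical helper): read "sysctl -a" style output on
# stdin and report tcp.sendspace / tcp.recvspace values above a limit.
# $1 is the limit in bytes; it defaults to 16384 (the 16K default cited above).
check_tcp_space() {
    limit="${1:-16384}"
    awk -v limit="$limit" '
        /tcp\.(sendspace|recvspace)/ {
            # Assumption: each line is "name: value" or "name = value".
            split($0, parts, "[:=]")
            name = parts[1]
            value = parts[2]
            gsub(/[ \t]/, "", name)
            gsub(/[ \t]/, "", value)
            flag = (value + 0 > limit + 0) ? "LARGE" : "ok"
            print name, value, flag
        }'
}
```

    Typical use would be "sysctl -a | check_tcp_space". A LARGE result
    only says the system-wide default is above 16K; it says nothing
    about whether Apache overrides the size on its own sockets.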
:> > # netstat -m
:> > 4449 mbufs in use:
:> >         4437 mbufs allocated to data
:> >         1 mbufs allocated to packet headers
:> >         7 mbufs allocated to protocol control blocks
:> >         4 mbufs allocated to socket names and addresses
:> > 4263/4314 mbuf clusters in use
:> > 9184 Kbytes allocated to network (98% in use)

    Here's one of our servers (doing around 30 hits/sec at the moment).
    Note that the in-use percentage is 48%, which is typical. If you
    regularly see in-use percentages above 80%, it's almost certainly
    due to a stale-socket problem, which in turn is usually due to
    keepalives being turned off and blown sockets building up. This box
    is running (roughly) 2.2.7.

shell3:/home/dillon# netstat -m
3201 mbufs in use:
        1470 mbufs allocated to data
        1513 mbufs allocated to packet headers
        211 mbufs allocated to protocol control blocks
        7 mbufs allocated to socket names and addresses
1206/2684 mbuf clusters in use
5768 Kbytes allocated to network (48% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

                                        Matthew Dillon  Engineering, HiWay Technologies, Inc. & BEST
                                        Internet Communications & God knows what else.

    (Please include original email in any response)
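
    The 80% rule of thumb above is easy to check automatically. This is
    a hedged sketch rather than anything from the original message: the
    names mbuf_in_use_pct and mbuf_verdict are invented here, and the
    parsing assumes the "Kbytes allocated to network (NN% in use)" line
    looks exactly as it does in the two netstat -m listings above.

```shell
#!/bin/sh
# mbuf_in_use_pct (hypothetical helper): print the "% in use" number from
# "netstat -m" output read on stdin. Assumes the summary line has the form
# "NNNN Kbytes allocated to network (NN% in use)".
mbuf_in_use_pct() {
    sed -n 's/.*allocated to network (\([0-9][0-9]*\)% in use).*/\1/p'
}

# mbuf_verdict (hypothetical helper): compare that percentage against a
# threshold. $1 is the threshold; it defaults to 80, the figure quoted above.
mbuf_verdict() {
    pct=$(mbuf_in_use_pct)
    if [ -z "$pct" ]; then
        # The summary line was not found in the input.
        echo "unknown"
        return 1
    elif [ "$pct" -gt "${1:-80}" ]; then
        echo "suspect ($pct% in use)"
    else
        echo "ok ($pct% in use)"
    fi
}
```

    Run as "netstat -m | mbuf_verdict", say from cron. A single high
    reading proves little; the message's point is about percentages
    that are regularly above the threshold.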