From: Eirik Øverby
To: pyunyh@gmail.com
Cc: weldon@excelsusphoto.com, Gavin Atkinson, Robert Watson, freebsd-current@freebsd.org
Subject: Re: FreeBSD 8.0 - network stack crashes?
Date: Mon, 30 Nov 2009 08:20:57 +0100
Message-Id: <7A7E32A2-9320-4F39-B495-70E547D23B82@anduin.net>
In-Reply-To: <20091130005236.GC1123@michelle.cdnetworks.com>
References: <20091129013026.GA1355@michelle.cdnetworks.com> <74BFE523-4BB3-4748-98BA-71FBD9829CD5@anduin.net> <20091130005236.GC1123@michelle.cdnetworks.com>
List-Id: Discussions about the use of FreeBSD-current

On 30. nov. 2009, at 01.52, Pyun YongHyeon wrote:

> On Mon, Nov 30, 2009 at 12:21:16AM +0100, Eirik Øverby wrote:
>> On 29. nov. 2009, at 15.29, Robert Watson wrote:
>>
>>> On Sun, 29 Nov 2009, Eirik Øverby wrote:
>>>
>>>> I just did that (-rxcsum -txcsum -tso), but the numbers still keep
>>>> rising.
>>>> I'll wait and see if it goes down again, then reboot with those
>>>> values to see how it behaves. But right away it doesn't look too
>>>> good ..
>>>
>>> It would be interesting to know if any of the counters in the output
>>> of netstat -s grow linearly with the allocation count in netstat -m.
>>> Often times leaks are associated with edge cases in the stack
>>> (typically because if they are in common cases the bug is detected
>>> really quickly!) -- usually error handling, where in some error case
>>> the unwinding fails to free an mbuf that it should free. These are
>>> notoriously hard to track down, unfortunately, but the stats output
>>> (especially where delta alloc is linear to delta stat) may inform
>>> the situation some more.
>>
>> From what I can tell, all that goes up with mbuf usage is
>> traffic/packet counts. I can't say I see anything fishy in there.
>>
>
> If system exhausted all available mbufs it still should not crash
> the box. Use -d option of netstat(1) to see whether packet drop
> counter still goes up when you know system can't receive any
> frames. AFAIK em(4) was carefully written to recover from Rx
> resource shortage such that it just drops incoming frames when it
> can't get new mbuf. This may result in dropping incoming connection
> request but it means it still tries to recover from the resource
> exhaustion.
> It's not clear where mbuf leak comes from, though.

The box does not crash; connecting to the console (via IP-KVM) shows the
box is just fine, except that no networking works. I can up the
kern.ipc.nmbclusters value from the commandline, and after a few seconds
things start moving again.
The em(4) debug output shows that it fails to allocate mbuf clusters.

>> From the last few samples in
>> http://anduin.net/~ltning/netstat.log
>
> 404

Uh? Unpossible :) The file is there, and I can view it here ...

>> you can see the host stops receiving any packets, but does a few
>> retransmits before the session where this script ran timed out.
>>
>
> By chance do you use pf/ipfw/ipf?

No... Unfortunately ;)

/Eirik
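
A rough sketch of the check suggested earlier in the thread: finding netstat -s counters whose growth tracks the mbuf allocation count from netstat -m. This is not something anyone in the thread ran. It assumes the numbers have already been sampled at regular intervals and pulled out of the netstat output by other means. The function names, the 0.95 threshold and every sample value are hypothetical and for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def deltas(series):
    """Per-interval growth between consecutive samples."""
    return [b - a for a, b in zip(series, series[1:])]


def suspects(mbufs, counters, threshold=0.95):
    """Return (name, r) for counters whose growth per interval tracks mbuf growth.

    mbufs:    mbuf allocation counts, one per sample
    counters: counter name -> list of values, one per sample
    """
    d_mbuf = deltas(mbufs)
    found = []
    for name, values in counters.items():
        r = pearson(d_mbuf, deltas(values))
        if r >= threshold:
            found.append((name, r))
    return sorted(found, key=lambda item: -item[1])


# Made-up samples, NOT taken from netstat.log.
mbufs = [1000, 1100, 1350, 1400, 1800]
counters = {
    "hypothetical_error_counter": [0, 10, 35, 40, 80],
    "packets_received": [5000, 9000, 12000, 17000, 20000],
    "constant_counter": [7, 7, 7, 7, 7],
}

for name, r in suspects(mbufs, counters):
    print(f"{name}: r={r:.3f}")
```

Comparing per-interval deltas rather than raw totals keeps two counters that both simply grow over time from looking related. Even so, a busy traffic counter can still correlate with mbuf usage, as already seen here, so a match is only a hint about where to look.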