Date:      Mon, 30 Nov 2009 08:20:57 +0100
From:      Eirik Øverby <ltning@anduin.net>
To:        pyunyh@gmail.com
Cc:        weldon@excelsusphoto.com, Gavin Atkinson <gavin@freebsd.org>, Robert Watson <rwatson@freebsd.org>, freebsd-current@freebsd.org
Subject:   Re: FreeBSD 8.0 - network stack crashes?
Message-ID:  <7A7E32A2-9320-4F39-B495-70E547D23B82@anduin.net>
In-Reply-To: <20091130005236.GC1123@michelle.cdnetworks.com>
References:  <A1648B95-F36D-459D-BBC4-FFCA63FC1E4C@anduin.net> <20091129013026.GA1355@michelle.cdnetworks.com> <74BFE523-4BB3-4748-98BA-71FBD9829CD5@anduin.net> <alpine.BSF.2.00.0911291427240.80654@fledge.watson.org> <E9B13DDC-1B51-4EFD-95D2-544238BDF3A4@anduin.net> <20091130005236.GC1123@michelle.cdnetworks.com>

On 30. nov. 2009, at 01.52, Pyun YongHyeon wrote:

> On Mon, Nov 30, 2009 at 12:21:16AM +0100, Eirik Øverby wrote:
>> On 29. nov. 2009, at 15.29, Robert Watson wrote:
>>
>>> On Sun, 29 Nov 2009, Eirik Øverby wrote:
>>>
>>>> I just did that (-rxcsum -txcsum -tso), but the numbers still keep
>>>> rising. I'll wait and see if it goes down again, then reboot with
>>>> those values to see how it behaves. But right away it doesn't look
>>>> too good ..
>>>
>>> It would be interesting to know if any of the counters in the output
>>> of netstat -s grow linearly with the allocation count in netstat -m.
>>> Often times leaks are associated with edge cases in the stack
>>> (typically because if they are in common cases the bug is detected
>>> really quickly!) -- usually error handling, where in some error case
>>> the unwinding fails to free an mbuf that it should free.  These are
>>> notoriously hard to track down, unfortunately, but the stats output
>>> (especially where delta alloc is linear to delta stat) may inform the
>>> situation some more.
>>
>> From what I can tell, all that goes up with mbuf usage is
>> traffic/packet counts. I can't say I see anything fishy in there.
>>
>
> If system exhausted all available mbufs it still should not crash
> the box. Use -d option of netstat(1) to see whether packet drop
> counter still goes up when you know system can't receive any
> frames. AFAIK em(4) was carefully written to recover from Rx
> resource shortage such that it just drops incoming frames when it
> can't get new mbuf. This may result in dropping incoming connection
> request but it means it still tries to recover from the resource
> exhaustion.
> It's not clear where mbuf leak comes from, though.

The box does not crash; connecting to the console (via IP-KVM) shows the
box is just fine, except that no networking works. I can up the
kern.ipc.nmbclusters value from the commandline, and after a few seconds
things start moving again.

The em(4) debug output shows that it fails to allocate mbuf clusters.
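
Robert's suggestion above, checking whether any netstat -s counter
grows linearly with the netstat -m allocation count, could be sketched
roughly as follows. This is only an illustration: it assumes the
counters have already been sampled at the same instants and collected
into plain lists of numbers. The function names (deltas, pearson,
rank_counters) and the sample figures are hypothetical and do not come
from the log discussed in this thread.

```python
from statistics import mean

def deltas(samples):
    """Differences between consecutive samples of one counter."""
    return [b - a for a, b in zip(samples, samples[1:])]

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Constant growth on either side gives no variation to correlate.
        return 0.0
    return cov / (var_x * var_y) ** 0.5

def rank_counters(mbuf_samples, counter_samples):
    """Rank counters by how closely their growth tracks mbuf growth."""
    mbuf_d = deltas(mbuf_samples)
    scores = {name: pearson(mbuf_d, deltas(vals))
              for name, vals in counter_samples.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Made-up numbers, purely for illustration.
mbufs = [100, 110, 130, 135, 160]
counters = {
    "counter_a": [0, 2, 6, 7, 12],   # grows in step with mbufs
    "counter_b": [0, 5, 6, 11, 12],  # grows, but out of step
}
ranked = rank_counters(mbufs, counters)
for name, score in ranked:
    print(f"{name}: {score:+.3f}")
```

A counter whose score is close to +1 grows in proportion to the mbuf
count and would be a candidate for a closer look. Plain traffic/packet
counters can be expected to score high as well, so the ranking only
narrows the search.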


>> From the last few samples in
>> http://anduin.net/~ltning/netstat.log
>
> 404

Uh? Unpossible :)
The file is there, and I can view it here ...


>> you can see the host stops receiving any packets, but does a few
>> retransmits before the session where this script ran timed out.
>>
>
> By chance do you use pf/ipfw/ipf?

No... Unfortunately ;)

/Eirik