Date:      Mon, 30 Nov 2009 15:47:09 +0800
From:      Adrian Chadd <adrian@freebsd.org>
To:        Eirik Øverby <ltning@anduin.net>
Cc:        pyunyh@gmail.com, weldon@excelsusphoto.com, freebsd-current@freebsd.org, Robert Watson <rwatson@freebsd.org>, Gavin Atkinson <gavin@freebsd.org>
Subject:   Re: FreeBSD 8.0 - network stack crashes?
Message-ID:  <d763ac660911292347i74caba25h9861a4d9feb63d77@mail.gmail.com>
In-Reply-To: <E9B13DDC-1B51-4EFD-95D2-544238BDF3A4@anduin.net>
References:  <A1648B95-F36D-459D-BBC4-FFCA63FC1E4C@anduin.net> <20091129013026.GA1355@michelle.cdnetworks.com> <74BFE523-4BB3-4748-98BA-71FBD9829CD5@anduin.net> <alpine.BSF.2.00.0911291427240.80654@fledge.watson.org> <E9B13DDC-1B51-4EFD-95D2-544238BDF3A4@anduin.net>

That URL works for me. So how much traffic is this box handling during
peak times?

I've seen this on the proxy boxes that I've set up. There's a lot of
data being tied up in socket buffers as well as being routed between
interfaces (i.e., stuff that isn't being intercepted). Take a look at
"netstat -an" when things are locked up; see if there are any sockets
which have full send/receive queues.
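
A rough way to pick those sockets out of a long listing is to filter on
the Recv-Q and Send-Q columns. The sketch below assumes the usual
FreeBSD "netstat -an" column order (Proto, Recv-Q, Send-Q, Local
Address, Foreign Address, state). The function name and the 32 KB
threshold are illustrative only; what counts as "full" depends on the
buffer sizes actually configured.

```python
def busy_sockets(netstat_text, threshold=32 * 1024):
    """Return (proto, recv_q, send_q, local, foreign) for every TCP/UDP
    socket whose receive or send queue holds at least `threshold` bytes.

    `threshold` is an assumed cut-off, not a kernel-defined value.
    """
    hits = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Keep only tcp*/udp* rows; header lines and the UNIX-domain
        # socket section are skipped.
        if len(fields) < 5 or not fields[0].startswith(("tcp", "udp")):
            continue
        try:
            recv_q, send_q = int(fields[1]), int(fields[2])
        except ValueError:
            continue
        if recv_q >= threshold or send_q >= threshold:
            hits.append((fields[0], recv_q, send_q, fields[3], fields[4]))
    return hits
```

Feed it the text captured from "netstat -an" while the box is wedged; a
long list of sockets with large Send-Q values would fit the theory that
data is piling up in socket buffers.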

I'm going to take a complete stab in the dark here and say this sounds
a little like a livelock. I.e., something is queuing data and allocating
mbufs for TX (and something else is generating mbufs - I dunno, packet
headers?) far faster than the NIC is able to TX them out, and there's
not enough backpressure on whatever (say, the stuff filling socket
buffers) to stop the mbuf exhaustion. Again, I've seen this kind of
crap on proxy boxes.

See if you have full socket buffers showing up in netstat -an. Have
you tweaked the socket/TCP send/receive sizes? I typically lock mine
down to something small (32k-64k for the most part) so I don't hit
mbuf exhaustion on very busy proxies.
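
For a back-of-the-envelope feel for why small buffers help, here is a
rough worst-case estimate in Python. It assumes every connection has
both buffers completely full and that the data sits in standard 2 KB
mbuf clusters (MCLBYTES on FreeBSD), and it ignores mbuf header
overhead. The connection count and the function name are made up for
illustration.

```python
MCLBYTES = 2048  # size of a standard mbuf cluster on FreeBSD


def worst_case_clusters(connections, sendspace, recvspace, cluster=MCLBYTES):
    """Clusters needed if every socket's send and receive buffers are full."""
    # -(-a // b) is integer ceiling division.
    per_socket = -(-sendspace // cluster) + -(-recvspace // cluster)
    return connections * per_socket


# Hypothetical proxy with 10,000 concurrent connections.
small = worst_case_clusters(10_000, 32 * 1024, 32 * 1024)
large = worst_case_clusters(10_000, 256 * 1024, 256 * 1024)
print(small, large)  # 320000 2560000
```

Compare the result against kern.ipc.nmbclusters; the per-socket sizes
correspond to net.inet.tcp.sendspace and net.inet.tcp.recvspace.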

2c,



Adrian

2009/11/30 Eirik Øverby <ltning@anduin.net>:
> On 29. nov. 2009, at 15.29, Robert Watson wrote:
>
>> On Sun, 29 Nov 2009, Eirik Øverby wrote:
>>
>>> I just did that (-rxcsum -txcsum -tso), but the numbers still keep
>>> rising. I'll wait and see if it goes down again, then reboot with
>>> those values to see how it behaves. But right away it doesn't look
>>> too good ..
>>
>> It would be interesting to know if any of the counters in the output
>> of netstat -s grow linearly with the allocation count in netstat -m.
>> Often times leaks are associated with edge cases in the stack
>> (typically because if they are in common cases the bug is detected
>> really quickly!) -- usually error handling, where in some error case
>> the unwinding fails to free an mbuf that it should free. These are
>> notoriously hard to track down, unfortunately, but the stats output
>> (especially where delta alloc is linear to delta stat) may inform the
>> situation some more.
>
> From what I can tell, all that goes up with mbuf usage is
> traffic/packet counts. I can't say I see anything fishy in there.
>
> From the last few samples in
> http://anduin.net/~ltning/netstat.log
> you can see the host stops receiving any packets, but does a few
> retransmits before the session where this script ran timed out.
>
> /Eirik


