Date: Tue, 3 Jan 2012 09:16:03 +0000 (GMT)
From: Robert Watson <rwatson@FreeBSD.org>
To: Maxim Sobolev <sobomax@FreeBSD.org>
Cc: freebsd-net@freebsd.org, "Bjoern A. Zeeb" <bz@FreeBSD.ORG>, Jack Vogel <jfvogel@gmail.com>
Subject: Re: Panic in the udp_input() under heavy load
Message-ID: <alpine.BSF.2.00.1201030914130.34067@fledge.watson.org>
In-Reply-To: <4EFE5E12.7080103@FreeBSD.org>
References: <4EB804D2.2090101@FreeBSD.org>
 <alpine.BSF.2.00.1111071818250.4603@ai.fobar.qr>
 <4EB86276.6080801@sippysoft.com>
 <4EB86866.9060102@sippysoft.com>
 <alpine.BSF.2.00.1111072324340.4603@ai.fobar.qr>
 <4EB86FCF.3050306@FreeBSD.org>
 <alpine.BSF.2.00.1111080239500.1358@fledge.watson.org>
 <4ECEE6F0.4010301@FreeBSD.org>
 <F63603B1-7B35-4ECE-82E6-835CD91B93F8@FreeBSD.org>
 <4EFE158C.2040705@FreeBSD.org>
 <AB3D0536-CDD7-4595-911C-7C17FE1DFB23@FreeBSD.org>
 <4EFE5B70.9050807@FreeBSD.org>
 <4EFE5E12.7080103@FreeBSD.org>

On Fri, 30 Dec 2011, Maxim Sobolev wrote:

> On 12/30/2011 4:46 PM, Maxim Sobolev wrote:
>> I see. Would you guys mind if I put that NULL pointer check into the code
>> for the time being and turn it into some kind of big nasty warning in
>> 8-stable branch only?
>
> I could also open a ticket, put all debug information collected to date in
> there. And encourage people to report to it once they see this warning on
> their system. Then it would provide more information about the exposure. It
> is definitely looks like locking issue somewhere, not just bad luck or flaky
> hardware, as we see it happening consistently on top 4 most UDP-loaded
> systems here, and it correlates well with the load. With my small NULL catch
> the machines have been running happily for a month now, so there is no
> visible side-effects.

Please do file the PR so that all the information is in one place -- this is a
network stack hacking week for me, so I should be able to take a closer look.

Could you characterise the traffic load on these boxes a bit more? Also, is
there regular monitoring using netstat/bsnmp/etc going on? I'd like to try and
identify ways in which this workload differs from other common high-UDP
workloads being used on 8.x that aren't seeing this problem...

Robert