Date: Mon, 15 Feb 2010 14:05:02 +0100
From: Ivan Voras <ivoras@freebsd.org>
To: freebsd-stable@freebsd.org
Cc: freebsd-net@freebsd.org
Subject: Re: Sudden mbuf demand increase and shortage under the load
Message-ID: <hlbgpr$sjj$1@ger.gmane.org>
In-Reply-To: <4B793D1D.1000108@FreeBSD.org>
References: <4B793D1D.1000108@FreeBSD.org>

On 02/15/10 13:25, Maxim Sobolev wrote:
> Hi,
>
> Our company have a FreeBSD based product that consists of the numerous
> interconnected processes and it does some high-PPS UDP processing
> (30-50K PPS is not uncommon). We are seeing some strange periodic

I have nothing very useful to offer, but you might be able to tell
whether this is an em/igb issue by buying a cheap Realtek gigabit (re)
card and trying it out. Those can be bought for a few dollars now (e.g.
from D-Link and many others), and I can confirm that at least the one I
tried can carry around 50K pps, but not much more (I can tell you the
exact chip later today if you are interested).

> failures under the load in several such systems, which usually evidences
> itself in IPC (even through unix domain sockets) suddenly either
> breaking down or pausing and restoring only some time later (like 5-10
> minutes). The only sign of failure I managed to find was the increase of
> the "requests for mbufs denied" in the netstat -m and number of total
> mbuf clusters (nmbclusters) raising up to the limit.
>
> I have tried to raise some network-related limits (most notably maxusers
> and nmbclusters), but it has not helped with the issue - it's still
> happening from time to time to us. Below you can find output from the
> netstat -m few minutes right after that shortage period - you see that
> somehow the system has allocated huge amount of memory for the network
> (700MB), with only tiny amount of that being actually in use. This is
> for the kern.ipc.nmbclusters: 302400. Eventually the system reclaims all
> that memory and goes back to its normal use of 30-70MB.
>
> This problem is killing us, so any suggestions are greatly appreciated.
> My current hypothesis is that due to some issues either with the network
> driver or network subsystem itself, the system goes insane and "eats" up
> all mbufs up to nmbclusters limit. But since mbufs are shared between
> network and local IPC, IPC goes down as well.
>
> We observe this issue with systems using both em(4) driver and igb(4)
> driver. I believe both drivers share the same design, however I am not
> sure if this is some kind of design flaw in the driver or part of a
> larger problem with the network subsystem.
>
> This happens on amd64 7.2-RELEASE and 7.3-PRERELEASE alike, with 8GB of
> memory. I have not tried upgrading to 8.0, this is production system so
> upgrading will not be easy. I don't believe there are some differences
> that let us hope that this problem will go away after upgrade, but I can
> try it as the last resort.
>
> As I said, this is very critical issue, so I can provide any additional
> debug information upon request. We are ready to go as far as paying
> somebody reasonable amount of money for tracking down and resolving the
> issue.
>
> Regards,
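
The two symptoms mentioned in the quoted text (a growing "requests for
mbufs denied" counter and the mbuf cluster total climbing to the
nmbclusters limit) could be watched for with a small script. What follows
is only a minimal sketch, assuming the usual FreeBSD `netstat -m` line
layout ("a/b/c requests for mbufs denied" and "a/b/c/d mbuf clusters in
use"). The function names and the SAMPLE text are hypothetical: the
numbers in SAMPLE are made up for illustration and are not output from
the systems discussed in this thread.

```python
import re

# Assumed layout: "0/17/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)"
DENIED_RE = re.compile(
    r"^\s*(\d+)/(\d+)/(\d+) requests for mbufs denied", re.MULTILINE
)
# Assumed layout: "1500/300900/302400/302400 mbuf clusters in use (current/cache/total/max)"
CLUSTERS_RE = re.compile(
    r"^\s*(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use", re.MULTILINE
)

# Hypothetical sample text, invented for this example only.
SAMPLE = """\
2048/1022/3070 mbufs in use (current/cache/total)
1500/300900/302400/302400 mbuf clusters in use (current/cache/total/max)
0/17/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
"""


def parse_netstat_m(text):
    """Extract denied-request and cluster counters from `netstat -m` text.

    Returns a dict with the keys "denied" and "clusters" for whichever
    lines were found; missing lines are simply left out.
    """
    stats = {}
    match = DENIED_RE.search(text)
    if match:
        names = ("mbufs", "clusters", "mbuf+clusters")
        stats["denied"] = dict(zip(names, map(int, match.groups())))
    match = CLUSTERS_RE.search(text)
    if match:
        names = ("current", "cache", "total", "max")
        stats["clusters"] = dict(zip(names, map(int, match.groups())))
    return stats


def cluster_pressure(stats):
    """Return total/max for mbuf clusters, or None if it cannot be computed."""
    clusters = stats.get("clusters")
    if not clusters or clusters["max"] == 0:
        return None
    return clusters["total"] / clusters["max"]


if __name__ == "__main__":
    parsed = parse_netstat_m(SAMPLE)
    print(parsed.get("denied"))
    print(cluster_pressure(parsed))
```

On a live system the text could come from
`subprocess.run(["netstat", "-m"], capture_output=True, text=True).stdout`
instead of SAMPLE, sampled periodically so that a jump in the denied
counters or a pressure value near 1.0 can be logged with a timestamp.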