From owner-freebsd-stable@FreeBSD.ORG  Mon Feb 15 13:05:24 2010
To: freebsd-stable@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Mon, 15 Feb 2010 14:05:02 +0100
Lines: 54
Message-ID: <hlbgpr$sjj$1@ger.gmane.org>
References: <4B793D1D.1000108@FreeBSD.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@ger.gmane.org
X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.1.5) Gecko/20100118 Thunderbird/3.0
In-Reply-To: <4B793D1D.1000108@FreeBSD.org>
Sender: news <news@ger.gmane.org>
Cc: freebsd-net@freebsd.org
Subject: Re: Sudden mbuf demand increase and shortage under the load

On 02/15/10 13:25, Maxim Sobolev wrote:
> Hi,
>
> Our company has a FreeBSD-based product that consists of numerous
> interconnected processes and does some high-PPS UDP processing
> (30-50K PPS is not uncommon). We are seeing some strange periodic

I don't have anything very useful to offer, but you could check whether
this is an em/igb issue by buying a cheap Realtek gigabit (re) card and
trying it out. Those cost only a few dollars now (e.g. from D-Link and
many others), and I can confirm that at least the one I tried can carry
around 50K pps, but not much more (I can tell you the exact chip later
today if you are interested).

> failures under load in several such systems, which usually manifest
> themselves as IPC (even through unix domain sockets) suddenly either
> breaking down or pausing and recovering only some time later (5-10
> minutes). The only sign of failure I managed to find was an increase in
> the "requests for mbufs denied" counter in netstat -m, with the total
> number of mbuf clusters (nmbclusters) rising up to the limit.
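
To catch the moment the shortage starts, it may help to log these
counters periodically. Below is a minimal sketch, assuming the line
layout that netstat -m prints on FreeBSD 7.x; the function name
parse_netstat_m and the SAMPLE text are hypothetical and made up for
illustration, not taken from the affected systems.

```python
import re

# Made-up sample in the layout netstat -m is assumed to print;
# the numbers are illustrative only, not real measurements.
SAMPLE = """\
1024/1236/2260 mbufs in use (current/cache/total)
1023/1025/2048/302400 mbuf clusters in use (current/cache/total/max)
2302K/4806K/7108K bytes allocated to network (current/cache/total)
5/12/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
"""

def parse_netstat_m(text):
    """Pull cluster usage and denied-request counters out of netstat -m text."""
    stats = {}
    for line in text.splitlines():
        # A counter line starts with slash-separated integers, then a label.
        m = re.match(r"\s*([\d/]+)\s+(.*)", line)
        if not m:
            continue
        values = [int(v) for v in m.group(1).split("/") if v]
        label = m.group(2)
        if label.startswith("mbuf clusters in use") and len(values) == 4:
            stats["clusters"] = dict(
                zip(("current", "cache", "total", "max"), values))
        elif label.startswith("requests for mbufs denied") and len(values) == 3:
            stats["denied"] = dict(
                zip(("mbufs", "clusters", "mbuf+clusters"), values))
    return stats
```

Feeding it the real output (for example from subprocess running
"netstat -m" once a minute) and logging the result with a timestamp
would show whether the denied counters jump before or after the
cluster total reaches the max value.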
>
> I have tried raising some network-related limits (most notably maxusers
> and nmbclusters), but it has not helped with the issue - it still
> happens to us from time to time. Below you can find output from
> netstat -m a few minutes after that shortage period - you see that
> somehow the system has allocated a huge amount of memory for the network
> (700MB), with only a tiny amount of that actually in use. This is
> for kern.ipc.nmbclusters: 302400. Eventually the system reclaims all
> that memory and goes back to its normal use of 30-70MB.
>
> This problem is killing us, so any suggestions are greatly appreciated.
> My current hypothesis is that due to some issue either with the network
> driver or the network subsystem itself, the system goes insane and "eats"
> up all mbufs up to the nmbclusters limit. But since mbufs are shared
> between network and local IPC, IPC goes down as well.
>
> We observe this issue on systems using both the em(4) driver and the
> igb(4) driver. I believe both drivers share the same design; however, I
> am not sure if this is some kind of design flaw in the driver or part of
> a larger problem with the network subsystem.
>
> This happens on amd64 7.2-RELEASE and 7.3-PRERELEASE alike, with 8GB of
> memory. I have not tried upgrading to 8.0; this is a production system,
> so upgrading will not be easy. I don't believe there are any differences
> that give us hope that this problem will go away after an upgrade, but I
> can try it as a last resort.
>
> As I said, this is a very critical issue, so I can provide any additional
> debug information upon request. We are ready to go as far as paying
> somebody a reasonable amount of money for tracking down and resolving the
> issue.
>
> Regards,