From: "Paul A. Procacci" <pprocacci@datapipe.com>
To: freebsd-net@freebsd.org
Date: Tue, 1 Nov 2011 00:16:37 -0500
Subject: Re: [High Interrupt Count] Networking Difficulties
Message-ID: <20111101051637.GC2445@nat.myhome>
In-Reply-To: <20111101015746.GA96508@nat.myhome>

On Mon, Oct 31, 2011 at 08:57:46PM -0500, Paul A. Procacci wrote:
> Gents,
>
> I'm having quite an awful problem that I need a bit of help with.
>
> I have an HP DL360 G3 ( http://h18000.www1.hp.com/products/quickspecs/11504_na/11504_na.HTML ) which acts as a NAT gateway (via PF) for several (600+) class C's amongst 24+ machines sitting behind it.
> It's running pfSense (FreeBSD 8.1-RELEASE-p4).
>
> The important guts are:
>
> 2 x 2.8 GHz CPUs
> 2 bge interfaces on a PCI-X bus
>
> During peak times this machine can only handle between 500 and 600 Mbps before running out of CPU capacity (roughly 300 Mbps on the LAN and 300 Mbps on the WAN). It's due to the high interrupt rate.
> A network engineer here suggested I look at "interrupt coalescing" to increase throughput.
> The only information I found online about this is a post from two years ago: http://lists.freebsd.org/pipermail/freebsd-net/2009-June/022227.html
>
> The tunables mentioned in that post aren't present on my system, so I imagine the change never made it into the bge driver. Assuming that to be the case, I started looking at DEVICE_POLLING as a solution.
> I did try device polling, but the results were worse than I expected: netisr used 100% of a single CPU while the other CPU remained mostly idle.
> Not knowing exactly what netisr is, I reverted the changes.
>
> This leads me to this list. Given the scenario above, I'm nearly certain I need device polling instead of the standard interrupt-driven setup.
> The two sysctls I've come across thus far that I think I need are:
>
> net.isr.maxthreads
> kern.hz
>
> I would assume setting net.isr.maxthreads to 2 on my dual-CPU machine is advisable, but I'm not 100% sure.
> What are the caveats of setting it higher? Given the output of `sysctl -d net.isr.maxthreads`, I would expect anything higher than the number of cores to be detrimental. Is that correct?
>
> kern.hz I'm less sure of. I understand what the sysctl is, but I'm not sure how to come up with a reasonable number.
> Generally speaking, and in your experience, would a setting of 2000 achieve close to the theoretical maximum of the cards? Is there an upper limit I should be worried about?
>
> Random questions:
> - Is device polling really the answer, or am I missing something in the bge driver that I've overlooked?
> - Which tunables directly affect processing high volumes of packets?
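For the archives, this is roughly the combination I was planning to try next. It's a sketch only; the values are my own guesses based on the reading above, not tested recommendations:

    # /boot/loader.conf -- both of these are boot-time tunables on 8.1, not runtime sysctls
    kern.hz=2000                # the clock/polling frequency I asked about above
    net.isr.maxthreads=2        # one netisr thread per CPU on this two-CPU box

    # DEVICE_POLLING has to be compiled into the kernel (options DEVICE_POLLING);
    # it is then switched on per interface at runtime:
    ifconfig bge0 polling
    ifconfig bge1 polling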
After some more coffee and some source-code reading, I've now learned that having device polling enabled forces netisr to limit the number of threads it creates to 1. That rather defeats the purpose of enabling device polling, and makes me believe device polling isn't going to be a great solution after all.

A snippet from dmesg:

bge0: mem 0xf7ef0000-0xf7efffff irq 30 at device 2.0 on pci1
brgphy0: PHY 1 on miibus0
bge1: mem 0xf7ff0000-0xf7ffffff irq 29 at device 2.0 on pci4
brgphy1: PHY 1 on miibus1

Any help/advice is appreciated, and sorry for following up to myself with this information.

~Paul
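P.S. In case it helps anyone searching the archives later, these are the stock tools I've been using to watch where the CPU time goes:

    vmstat -i          # per-device interrupt rates; the bge lines are the busy ones here
    top -SH            # kernel threads; with polling on, a single netisr thread sat at 100%
    sysctl -d net.isr  # short descriptions of the netisr tunables mentioned above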