Date:      Thu, 20 May 2010 02:05:05 +0700
From:      Eugene Grosbein <eugen@grosbein.pp.ru>
To:        rihad <rihad@mail.ru>
Cc:        freebsd-net@freebsd.org
Subject:   Re: increasing em(4) buffer sizes
Message-ID:  <20100519190505.GA29133@rdtc.ru>
In-Reply-To: <4BF4252F.8000208@mail.ru>
References:  <4BF4252F.8000208@mail.ru>

On Wed, May 19, 2010 at 10:51:43PM +0500, rihad wrote:

> We have a FreeBSD 7.2 Intel Server System 4GB RAM box doing traffic 
> shaping and accounting. It has two em gigabit interfaces: one used for 
> input, the other for output, servicing around 500-600 mbps load through 
> it. Traffic limiting is accomplished by dynamically setting up IPFW 
> pipes, which in turn work fine for our per-user traffic accounting needs 
> thanks to byte counters. So the firewall is basically a longish string 
> of pipe rules. This worked fine when the number of online users was low, 
> but now, as we've slowly begun servicing 2-3K online users netstat -i's 
> Ierrs column is growing at a rate of 5-15K per hour for em0, the 
> interface used for input. Apparently searching through the firewall 
> linearly for _each_ arriving packet locks the interface for the duration 
> of the search (even though net.isr.direct=0), so some packets are 
> periodically dropped on input. To mitigate the problem I've set up a 
> two-level hash by means of skipto rules, dropping the number of up to 
> several thousand rules to be searched for each packet to a mere 85 max, 
> but the rate of Ierrs has only increased to 40-50K per hour, I don't 
> know why. I've also tried setting these sysctls:

First, read: http://www.intel.com/design/network/applnots/ap450.htm
You'll see that you may be limited by your NIC chip's capabilities.

> hw.intr_storm_threshold=10000
> dev.em.0.rx_processing_limit=3000
> 
> but they didn't help at all. BTW, the other current settings are:
> kern.hz=4000
> net.inet.ip.fw.verbose=0
> kern.ipc.nmbclusters=111111
> net.inet.ip.fastforwarding=1
> net.inet.ip.dummynet.io_fast=1
> net.isr.direct=0
> net.inet.ip.intr_queue_maxlen=5000
> 
> net.inet.ip.intr_queue_drops is always zero.
> 
> I think the problem lies in the buffer size of em not being large enough 
> to buffer the packets as they're arriving. I looked in 
> /sys/dev/e1000/if_em.c and found this:
> 
> in em_attach():
>         adapter->rx_buffer_len = 2048;
> 
> and later in em_initialize_receive_unit():
>         switch (adapter->rx_buffer_len) {
>         default:
>         case 2048:
>                 rctl |= E1000_RCTL_SZ_2048;
>                 break;
>         case 4096:
>                 rctl |= E1000_RCTL_SZ_4096 |
>                     E1000_RCTL_BSEX | E1000_RCTL_LPE;
>                 break;
>         case 8192:
>                 rctl |= E1000_RCTL_SZ_8192 |
>                     E1000_RCTL_BSEX | E1000_RCTL_LPE;
>                 break;
>         case 16384:
>                 rctl |= E1000_RCTL_SZ_16384 |
>                     E1000_RCTL_BSEX | E1000_RCTL_LPE;
>                 break;
>         }
> 
> 
> So apparently the default buffer size is 2048 bytes, and as much as 
> 16384 is supported. But at what price? Those constants do look 
> suspicious. Can I blindly change rx_buffer_len in em_attach()? Sorry, 
> I'm not a kernel hacker :(

There are loader tunables; set them in /boot/loader.conf:

hw.em.rxd=4096
hw.em.txd=4096

The price is the amount of kernel memory the driver may consume.
The maximum MTU for em(4) is 16110, so in theory it can consume about
64 MB of kernel memory for a receive ring that long.
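
As a rough check of that figure, here is a minimal sketch of the arithmetic.
It assumes the worst case is one 16384-byte receive buffer (the largest size
in the switch statement quoted above) per descriptor, and it ignores
descriptor and mbuf overhead. The function name is made up for illustration.

```python
def rx_ring_buffer_bytes(descriptors: int, buffer_len: int) -> int:
    """Upper bound on receive buffer memory, assuming one buffer per descriptor."""
    return descriptors * buffer_len

MIB = 1024 * 1024

# hw.em.rxd=4096 with the largest buffer size the driver switch handles.
worst_case = rx_ring_buffer_bytes(4096, 16384)
# Same ring with the default 2048-byte buffers set in em_attach().
default_case = rx_ring_buffer_bytes(4096, 2048)

print(worst_case // MIB, "MiB with 16384-byte buffers")
print(default_case // MIB, "MiB with 2048-byte buffers")
```

With the default 2048-byte buffers the same ring needs far less, so the large
number only matters if jumbo frames are in use.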

Some more useful tunables for loader.conf:

dev.em.0.rx_int_delay=200
dev.em.0.tx_int_delay=200
dev.em.0.rx_abs_int_delay=200
dev.em.0.tx_abs_int_delay=200
dev.em.0.rx_processing_limit=-1
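
For a sense of scale, here is a small sketch of what a delay value of 200
means. It assumes the unit of 1.024 microseconds that em(4) documents for
these delay settings, and it treats the absolute delay as the interval
between interrupts under sustained traffic, which is only an approximation.
The helper names are invented for the example.

```python
# Assumption: em(4) interrupt delay values are counted in units of
# 1.024 microseconds, i.e. 1024 nanoseconds per unit.
NS_PER_UNIT = 1024

def delay_ns(units: int) -> int:
    """Convert an interrupt delay setting to nanoseconds."""
    return units * NS_PER_UNIT

def max_interrupts_per_second(abs_delay_units: int) -> float:
    """Rough interrupt-rate ceiling if one interrupt fires per absolute delay."""
    return 1_000_000_000 / delay_ns(abs_delay_units)

print(delay_ns(200) / 1000, "microseconds per delay")
print(round(max_interrupts_per_second(200)), "interrupts per second, roughly")
```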

Alternatively, you may try kernel polling (ifconfig em0 polling)
with these other tunables:

kern.hz=4000				# for /boot/loader.conf
kern.polling.burst_max=1000		# for /etc/sysctl.conf
kern.polling.each_burst=500
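
To see what those polling numbers allow, here is a back-of-the-envelope
sketch. It assumes, as polling(4) describes, that burst_max caps the number
of packets taken from one interface per timer tick, so the ceiling is simply
hz times burst_max. The function names are hypothetical, and the real rate
will be lower because the kernel adjusts the burst size dynamically.

```python
def tick_interval_us(hz: int) -> float:
    """Time between polling ticks in microseconds."""
    return 1_000_000 / hz

def polling_ceiling_pps(hz: int, burst_max: int) -> int:
    """Upper bound on packets per second taken from one interface.

    Assumes at most burst_max packets are grabbed per tick.
    """
    return hz * burst_max

hz = 4000          # kern.hz
burst_max = 1000   # kern.polling.burst_max

print(tick_interval_us(hz), "microseconds between ticks")
print(polling_ceiling_pps(hz, burst_max), "packets per second at most")
```

The receive ring has to hold whatever arrives between two ticks, which is
one reason to raise hw.em.rxd when polling is enabled.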

Eugene Grosbein


