From: rihad
Date: Thu, 20 May 2010 09:33:54 +0500
To: Eugene Grosbein
Cc: freebsd-net@freebsd.org
Subject: Re: increasing em(4) buffer sizes
Message-ID: <4BF4BBB2.8030806@mail.ru>
In-Reply-To: <20100519190505.GA29133@rdtc.ru>

On 05/20/2010 12:05 AM, Eugene Grosbein wrote:
> On Wed, May 19, 2010 at 10:51:43PM +0500, rihad wrote:
>
>> We have a FreeBSD 7.2 Intel Server System 4GB RAM box doing traffic
>> shaping and accounting. It has two em gigabit interfaces: one used for
>> input, the other for output, servicing around 500-600 mbps load through
>> it. Traffic limiting is accomplished by dynamically setting up IPFW
>> pipes, which in turn work fine for our per-user traffic accounting needs
>> thanks to byte counters. So the firewall is basically a longish string
>> of pipe rules.
>> This worked fine when the number of online users was low,
>> but now, as we've slowly begun servicing 2-3K online users, netstat -i's
>> Ierrs column is growing at a rate of 5-15K per hour for em0, the
>> interface used for input. Apparently searching through the firewall
>> linearly for _each_ arriving packet locks the interface for the duration
>> of the search (even though net.isr.direct=0), so some packets are
>> periodically dropped on input. To mitigate the problem I've set up a
>> two-level hash by means of skipto rules, dropping the number of up to
>> several thousand rules to be searched for each packet to a mere 85 max,
>> but the rate of Ierrs has only increased to 40-50K per hour; I don't
>> know why. I've also tried setting these sysctls:
>
> First, read: http://www.intel.com/design/network/applnots/ap450.htm
> You'll see you may be restricted by your NIC's chip capabilities.
>
Sooner rather than later these cards will likely be upgraded to 10 GigE
ones. I just want to make sure that the delays imposed by traversing the
firewall never cause traffic drops on input.

> There are loader tunables, set them in /etc/loader.conf:
Do you mean /boot/loader.conf?

>
> hw.em.rxd=4096
> hw.em.txd=4096
>
BTW, I can't read the current value:
$ sysctl hw.em.rxd
sysctl: unknown oid 'hw.em.rxd'
$

Is this a write-only value? :)

> The price is the amount of kernel memory the driver may consume.
> Maximum MTU=16110 for em(4), so it can consume about 64MB of kernel memory
> for that long input buffer, in theory.
>
> Some more useful tunables for loader.conf:
>
> dev.em.0.rx_int_delay=200
> dev.em.0.tx_int_delay=200
> dev.em.0.rx_abs_int_delay=200
> dev.em.0.tx_abs_int_delay=200
> dev.em.0.rx_processing_limit=-1
>
So this interrupt delay is the much-talked-about interrupt moderation?
Thanks, I'll try them. Is there any risk the machine won't boot with
them if rebooted remotely?
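
As a rough check on the "about 64MB" figure quoted above, the arithmetic
can be sketched as follows. This assumes one receive buffer per
descriptor and a buffer exactly as large as the MTU, which is a
simplification and not a description of how em(4) actually allocates
memory. The 2048-byte buffer size used for comparison and the function
name are likewise assumptions, not something stated in the thread.

```python
RX_DESCRIPTORS = 4096      # hw.em.rxd value suggested in the thread
MAX_MTU_BYTES = 16110      # maximum em(4) MTU quoted in the thread
ASSUMED_BUF_BYTES = 2048   # assumption: one 2 KB buffer per packet at a 1500-byte MTU


def rx_ring_bytes(descriptors: int, buffer_bytes: int) -> int:
    """Memory needed if every descriptor holds one buffer of the given size."""
    return descriptors * buffer_bytes


worst_case = rx_ring_bytes(RX_DESCRIPTORS, MAX_MTU_BYTES)
typical = rx_ring_bytes(RX_DESCRIPTORS, ASSUMED_BUF_BYTES)

print(f"worst case: {worst_case} bytes ({worst_case / 2**20:.1f} MiB)")
print(f"assumed 2 KB buffers: {typical} bytes ({typical / 2**20:.1f} MiB)")
```

Under these assumptions the worst case comes to roughly 63 MiB per
receive ring, in line with the estimate above, while a ring of 2 KB
buffers would need only about 8 MiB.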

> Alternatively, you may try kernel polling (ifconfig em0 polling)
> with other tunables:
>
> kern.hz=4000 # for /boot/loader.conf
> kern.polling.burst_max=1000 # for /etc/sysctl.conf
> kern.polling.each_burst=500
>
Wow, I successfully used polling a couple of years ago when the load was
low, but then I read some posting on this list claiming that Intel cards
have the ability to do fast interrupts (interrupt moderation), but for
that DEVICE_POLLING needs to be out of the kernel. So I scratched it and
rebuilt the kernel for no apparent reason. Maybe you're right, polling
would've worked just fine, so I may go back to that too.

> Eugene Grosbein
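
To get a feel for the polling numbers quoted above, here is a
back-of-the-envelope sketch. It assumes that each interface handles at
most kern.polling.burst_max packets per clock tick, and it uses a
hypothetical average packet size of 500 bytes, which nobody in the
thread has measured. The function names are made up for illustration.

```python
HZ = 4000                  # kern.hz value suggested in the thread
BURST_MAX = 1000           # kern.polling.burst_max value suggested in the thread
RX_DESCRIPTORS = 4096      # hw.em.rxd value suggested in the thread
LINK_LOAD_BPS = 600e6      # upper end of the 500-600 mbps load from the thread
AVG_PACKET_BYTES = 500     # assumption: hypothetical average packet size


def polling_budget_pps(hz: int, burst_max: int) -> int:
    """Upper bound on packets per second, assuming burst_max packets per tick."""
    return hz * burst_max


def arrivals_per_tick(load_bps: float, packet_bytes: int, hz: int) -> float:
    """Average number of packets arriving between two consecutive polls."""
    packets_per_second = load_bps / (packet_bytes * 8)
    return packets_per_second / hz


budget = polling_budget_pps(HZ, BURST_MAX)
per_tick = arrivals_per_tick(LINK_LOAD_BPS, AVG_PACKET_BYTES, HZ)

print(f"polling budget: {budget} packets/s")
print(f"average arrivals per tick: {per_tick:.1f} packets")
print(f"fits in RX ring between polls: {per_tick < RX_DESCRIPTORS}")
```

With these assumed figures the average arrival rate per tick is far
below both the per-tick burst limit and the ring size, so drops would
have to come from bursts or from time spent elsewhere (for example in
the firewall), not from the polling budget itself.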