From: rihad
Date: Wed, 07 Oct 2009 14:22:52 +0500
To: Robert Watson
Cc: freebsd-net@freebsd.org, Eugene Grosbein, Luigi Rizzo, Julian Elischer
Subject: Re: dummynet dropping too many packets
Message-ID: <4ACC5DEC.1010006@mail.ru>

Robert Watson wrote:
> On Wed, 7 Oct 2009, rihad wrote:
>
>> rihad wrote:
>>> I've yet to test how this direct=0 improves extensive dummynet drops.
>>
>> Ooops... After a couple of minutes, suddenly:
>>
>> net.inet.ip.intr_queue_drops: 1284
>>
>> Bumped it up a bit.
>
> Yes, I was going to suggest that moving to deferred dispatch has
> probably simply moved the drops to a new spot, the queue between the
> ithreads and the netisr thread.  In your setup, how many network
> interfaces are in use, and what drivers?
>

bce -- Broadcom NetXtreme II (BCM5706/BCM5708) PCI/PCIe Gigabit Ethernet
adapter driver

"device bce" is compiled into a 7.1-RELEASE-p8 kernel. There are two
network interfaces: bce0 takes ~400-500 Mbit/s of input and bce1 handles
output, i.e. the box acts as a smart router. It has 2 quad-core CPUs.

Now the probability of drops (as monitored by netstat -s's "output
packets dropped due to no bufs, etc.") is definitely a function of
traffic load and the number of items in an ipfw table. I've just
decreased the size of the two tables from ~2600 to ~1800 each and the
drops instantly went away, even though the traffic passing through the
box didn't decrease; it even increased a bit because fewer clients are
now being shaped (luckily "ipfw pipe tablearg" passes packets failing a
table lookup untouched).

> If what's happening is that you're maxing out a CPU then moving to
> multiple netisrs might help if your card supports generating flow IDs,
> but most lower-end cards don't.  I have patches to generate those flow
> IDs in software rather than hardware, but there are some downsides to
> doing so, not least that it takes cache line misses on the packet that
> generally make up a lot of the cost of processing the packet.
>
> My experience with most reasonable cards is that letting them do the
> work distribution with RSS and use multiple ithreads is a more
> performant strategy than using software work distribution on current
> systems, though.
>

So should we prefer a bunch of expensive, good-quality 10-gigabit cards?
Any you would recommend?

> Someone has probably asked for this already, but -- could you send a
> snapshot of the top -SH output in the steady state?  Let top run for a
> few minutes and then copy/paste the first 10-20 lines into an e-mail.
>

Sure.
Mind you: there are now only 1800 entries in each of the two ipfw
tables, so the drops have stopped. But it only takes another 200-300
entries for them to start again.

155 processes: 10 running, 129 sleeping, 16 waiting
CPU:  2.4% user,  0.0% nice,  2.0% system,  9.3% interrupt, 86.2% idle
Mem: 1691M Active, 1491M Inact, 454M Wired, 130M Cache, 214M Buf, 170M Free
Swap: 2048M Total, 12K Used, 2048M Free

  PID USERNAME   PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
   15 root       171 ki31     0K    16K CPU3   3  22.4H 97.85% idle: cpu3
   14 root       171 ki31     0K    16K CPU4   4  23.0H 96.29% idle: cpu4
   12 root       171 ki31     0K    16K CPU6   6  23.8H 94.58% idle: cpu6
   16 root       171 ki31     0K    16K CPU2   2  22.5H 90.72% idle: cpu2
   13 root       171 ki31     0K    16K CPU5   5  23.4H 90.58% idle: cpu5
   18 root       171 ki31     0K    16K RUN    0  20.3H 85.60% idle: cpu0
   17 root       171 ki31     0K    16K CPU1   1 910:03 78.37% idle: cpu1
   11 root       171 ki31     0K    16K CPU7   7  23.8H 65.62% idle: cpu7
   21 root       -44    -     0K    16K CPU7   7  19:03 48.34% swi1: net
   29 root       -68    -     0K    16K WAIT   1 515:49 19.63% irq256: bce0
   31 root       -68    -     0K    16K WAIT   2  56:05  5.52% irq257: bce1
   19 root       -32    -     0K    16K WAIT   5  50:05  3.86% swi4: clock sio
  983 flowtools   44    0 12112K  6440K select 0  13:20  0.15% flow-capture
  465 root       -68    -     0K    16K -      3  51:19  0.00% dummynet
    3 root        -8    -     0K    16K -      1   7:41  0.00% g_up
    4 root        -8    -     0K    16K -      2   7:14  0.00% g_down
   30 root       -64    -     0K    16K WAIT   6   5:30  0.00% irq16: mfi0
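
For anyone wanting to track this over time rather than eyeball it, here is
a rough sketch of pulling the busiest non-idle kernel threads out of a
top -SH snapshot like the one above. The sample rows are copied from that
snapshot (idle threads trimmed to two); the helper names parse_threads and
busiest_non_idle are made up for illustration, and the regular expression
assumes the eleven-column layout shown here, which other top versions may
not share.

```python
import re

# Rows copied from the top -SH snapshot above (only two idle threads kept).
SNAPSHOT = """\
   15 root       171 ki31     0K    16K CPU3   3  22.4H 97.85% idle: cpu3
   11 root       171 ki31     0K    16K CPU7   7  23.8H 65.62% idle: cpu7
   21 root       -44    -     0K    16K CPU7   7  19:03 48.34% swi1: net
   29 root       -68    -     0K    16K WAIT   1 515:49 19.63% irq256: bce0
   31 root       -68    -     0K    16K WAIT   2  56:05  5.52% irq257: bce1
   19 root       -32    -     0K    16K WAIT   5  50:05  3.86% swi4: clock sio
  465 root       -68    -     0K    16K -      3  51:19  0.00% dummynet
"""

# Assumed layout: PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND.
# COMMAND may contain spaces, so it takes the rest of the line.
ROW = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<user>\S+)\s+(?P<pri>\S+)\s+(?P<nice>\S+)\s+"
    r"(?P<size>\S+)\s+(?P<res>\S+)\s+(?P<state>\S+)\s+(?P<cpu>\d+)\s+"
    r"(?P<time>\S+)\s+(?P<wcpu>[\d.]+)%\s+(?P<command>.+?)\s*$"
)


def parse_threads(text):
    """Return (command, cpu, wcpu) tuples for every line that looks like a row."""
    threads = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            threads.append((m["command"], int(m["cpu"]), float(m["wcpu"])))
    return threads


def busiest_non_idle(threads, top_n=3):
    """Drop the per-CPU idle threads and sort the rest by WCPU, highest first."""
    busy = [t for t in threads if not t[0].startswith("idle:")]
    return sorted(busy, key=lambda t: t[2], reverse=True)[:top_n]


if __name__ == "__main__":
    for command, cpu, wcpu in busiest_non_idle(parse_threads(SNAPSHOT)):
        print(f"{command:<14} cpu={cpu} wcpu={wcpu:.2f}%")
```

On this snapshot the single "swi1: net" thread comes out on top, ahead of
the bce0 and bce1 interrupt threads, which matches what can be read off
the table by hand.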