Message-ID: <1368493730.55723.YahooMailClassic@web121601.mail.ne1.yahoo.com>
Date: Mon, 13 May 2013 18:08:50 -0700 (PDT)
From: Barney Cordoba
Subject: Re: High CPU interrupt load on intel I350T4 with igb on 8.3
To: Hooman Fazaeli, Adrian Chadd
Cc: freebsd-net@freebsd.org, Clément Hermann (nodens), Eugene Grosbein

You have to admit there's a problem before you can fix it. If Eugene is
going to blame the bottleneck and no one is going to tell him he's wrong,
then there is no discussion.

The solution in this case is to use 1 queue, which was my suggestion
many days ago.

The defaults are broken. The driver should default to 1 queue, and be
tuned to the system environment. With 2 NICs in the box, the defaults
are defective.

1 queue should always work. Other settings require tuning and an
understanding of how things work.

I've had to support i350 so I've been playing with the driver a bit. It
works fine with lots of cores. But you have to have more cores than
queues. 2 cards with 4 queues on a 6 physical core system get into a
contention problem at certain loads.

I've also removed the cpu bindings, which is about all I'm free to
disclose.

The driver needs a tuning doc as much as anything else.

BC

--- On Sat, 5/11/13, Adrian Chadd wrote:

> From: Adrian Chadd
> Subject: Re: High CPU interrupt load on intel I350T4 with igb on 8.3
> To: "Hooman Fazaeli"
> Cc: "Barney Cordoba", "Clément Hermann (nodens)", "Eugene Grosbein", freebsd-net@freebsd.org
> Date: Saturday, May 11, 2013, 6:16 PM
>
> Hi,
>
> The motivation behind the locking scheme in igb and friends is for a
> very specific, userland-traffic-origin workload.
>
> Sure, it may or may not work well for forwarding/filtering workloads.
>
> If you want to fix it, let's have a discussion about how to do it,
> followed by some patches to do so.
>
> Adrian
>
> On 11 May 2013 13:12, Hooman Fazaeli wrote:
> > On 5/11/2013 8:26 PM, Barney Cordoba wrote:
> >> Clearly you don't understand the problem. Your logic is that
> >> because other drivers are defective also, therefore it's not a
> >> driver problem? The problem is caused by a multi-threaded driver
> >> that haphazardly launches tasks and that doesn't manage the case
> >> that the rest of the system can't handle the load. It's no
> >> different than a driver that barfs when mbuf clusters are
> >> exhausted. The answer isn't to increase memory or mbufs, even
> >> though that may alleviate the problem.
> >> The answer is to fix the driver, so that it doesn't crash the
> >> system for an event that is wholly predictable. igb 1) has too
> >> many locks and 2) exacerbates the problem by binding to cpus,
> >> which causes it to not only have to wait for the lock to free,
> >> but also for a specific cpu to become free. So it chugs along
> >> happily until it encounters a bottleneck, at which point it
> >> quickly blows up the entire system in a domino effect. It needs
> >> to manage locks more efficiently, and also to detect when the
> >> backup is unmanageable. Ever since FreeBSD 5 the answer has been
> >> "it's fixed in 7, or it's fixed in 9, or it's fixed in 10". There
> >> will always be bottlenecks, and no driver should blow up the
> >> system no matter what intermediate code may present a problem.
> >> It's the driver's responsibility to behave and to drop packets if
> >> necessary. BC
> >
> > And how should the driver behave? You suggest dropping the
> > packets. Even if we accept that dropping packets is a good
> > strategy in all configurations (which I doubt), the driver is
> > definitely not the best place to implement it, since that involves
> > duplication of similar code between drivers. Somewhere like the
> > Ethernet layer is a much better place to watch the packet load and
> > drop packets to prevent them from eating all the cores.
> > Furthermore, ignoring the fact that pf is not optimized for
> > multi-processors, and blaming drivers for not adjusting themselves
> > to this fault of pf, is a bit unfair, I believe.
> >
> > --
> >
> > Best regards.
> > Hooman Fazaeli
> >
> > _______________________________________________
> > freebsd-net@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-net
> > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
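
A footnote on the sizing rule in the reply at the top of this message
("you have to have more cores than queues"): the arithmetic can be
sketched in a few lines of Python. This is only an illustration, not
driver code. It assumes that "2 cards with 4 queues" means 4 queues per
card, and that each queue wants a core of its own; the function names
are hypothetical.

```python
def total_queues(nics: int, queues_per_nic: int) -> int:
    """Total queues across all cards (assumes the same count on each card)."""
    return nics * queues_per_nic


def oversubscribed(nics: int, queues_per_nic: int, physical_cores: int) -> bool:
    """True when the rule 'more cores than queues' is violated."""
    return total_queues(nics, queues_per_nic) >= physical_cores


if __name__ == "__main__":
    # The case from the message: 2 cards, 4 queues each, 6 physical cores.
    print(oversubscribed(2, 4, 6))
    # The suggested default: 1 queue per card on the same box.
    print(oversubscribed(2, 1, 6))
```

Under those assumptions the first case comes out oversubscribed (8
queues against 6 cores) and the 1-queue case does not (2 queues against
6 cores), which matches the contention described above.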