From owner-freebsd-net@FreeBSD.ORG Mon Sep 2 12:47:25 2013
References: <521BBD21.4070304@freebsd.org> <521EE8DA.3060107@freebsd.org>
 <1377952913.44129.YahooMailNeo@web121605.mail.ne1.yahoo.com>
 <1378001733.36695.YahooMailNeo@web121606.mail.ne1.yahoo.com>
 <1378050319.62710.YahooMailNeo@web121601.mail.ne1.yahoo.com>
Message-ID: <1378126037.56348.YahooMailNeo@web121603.mail.ne1.yahoo.com>
Date: Mon, 2 Sep 2013 05:47:17 -0700 (PDT)
From: Barney Cordoba
Subject: Re: Flow ID, LACP, and igb
To: Adrian Chadd
Cc: Andre Oppermann, Alan Somers, "net@freebsd.org", Jack F Vogel,
 "Justin T. Gibbs", Luigi Rizzo, "T.C. Gubatayao"

Are you using a pcie3 bus? Of course this is only an issue for 10g; what pct of
FreeBSD users have a load over 9.5Gb/s? It's completely unnecessary for igb
or em driver, so why is it used? because it's there.

Here's my argument against it. The handful of brains capable of doing driver
development become consumed with BS like LRO and the things that need to be
fixed, like buffer management and basic driver design flaws, never get fixed.
The offload code makes the driver code a virtual mess that can only be
maintained by Jack and 1 other guy in the entire world. And it takes 10 times
longer to make a simple change or to add support for a new NIC.

In a week I ripped out the offload crap and the 9000 sysctls, eliminated the
"consumer buffer" problem, reduced locking by 40% and now the igb driver
uses 20% less cpu with a full gig load.

And the code is cleaner and more easily maintained.

BC


________________________________
From: Adrian Chadd
To: Barney Cordoba
Cc: Andre Oppermann; Alan Somers; "net@freebsd.org"; Jack F Vogel; Justin T. Gibbs; Luigi Rizzo; T.C.
Gubatayao
Sent: Sunday, September 1, 2013 4:51 PM
Subject: Re: Flow ID, LACP, and igb


Yo,

LRO is an interesting hack that seems to do a good trick of hiding the
ridiculous locking and unfriendly cache behaviour that we do per-packet.

It helps with LAN test traffic where things are going out in batches from
the TCP layer so the RX layer "sees" these frames in-order and can do LRO.
When you disable it, I don't easily get 10GE LAN TCP performance. That has
to be fixed. Given how fast the CPU cores, bus interconnect and memory
interconnects are, I don't think there should be any reason why we can't
hit 10GE traffic on a LAN with LRO disabled (in both software and hardware.)

Now that I have the PMC sandy bridge stuff working right (but no PEBS, I
have to talk to Intel about that in a bit more detail before I think about
hacking that in) we can get actual live information about this stuff. But
the last time I looked, there's just too much per-packet latency going on.
The root cause looks like it's a toss up between scheduling, locking and
just lots of code running to completion per-frame. As I said, that all has
to die somehow.

2c,


-adrian


On 1 September 2013 08:45, Barney Cordoba wrote:

>
> Comcast sends packets OOO. With any decent number of internet hops you're
> likely to encounter a load balancer or packet shaper that sends packets
> OOO, so you just can't be worried about it. In fact, your designs MUST
> work with OOO packets.
>
> Getting balance on your load balanced lines is certainly a bigger upside
> than the additional CPU used. You can buy a faster processor for your
> "stack" for a lot less than you can buy bandwidth.
>
> Frankly my opinion of LRO is that it's a science project suitable for labs
> only.
> It's a trick to get more bandwidth than your bus capacity; the answer is
> to not run PCIe2 if you need pcie3. You can use it internally if you have
> control of all of the machines. When I modify a driver the first thing
> that I do is rip it out.
>
> BC
>
>
> ________________________________
> From: Luigi Rizzo
> To: Barney Cordoba
> Cc: Andre Oppermann; Alan Somers; "net@freebsd.org"; Jack F Vogel;
> Justin T. Gibbs; T.C. Gubatayao <tgubatayao@barracuda.com>
> Sent: Saturday, August 31, 2013 10:27 PM
> Subject: Re: Flow ID, LACP, and igb
>
>
> On Sun, Sep 1, 2013 at 4:15 AM, Barney Cordoba wrote:
>
> > ...
> >
>
> [your point on testing with realistic assumptions is surely a valid one]
>
>
> >
> > Of course there's nothing really wrong with OOO packets. We had this
> > discussion before; lots of people have round robin dual homing without
> > any ill effects. It's just not an issue.
> >
>
> It depends on where you are.
> It may not be an issue if the reordering is not large enough to
> trigger retransmissions, but even then it is annoying as it causes
> more work in the endpoint -- it prevents LRO from working, and even
> on the host stack it takes more work to sort where an out of order
> segment goes than appending an in-order one to the socket buffer.
>
> cheers
> luigi
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@FreeBSD.ORG Mon Sep 2 13:01:42 2013
In-Reply-To: <1378126037.56348.YahooMailNeo@web121603.mail.ne1.yahoo.com>
References: <521BBD21.4070304@freebsd.org> <521EE8DA.3060107@freebsd.org>
 <1377952913.44129.YahooMailNeo@web121605.mail.ne1.yahoo.com>
 <1378001733.36695.YahooMailNeo@web121606.mail.ne1.yahoo.com>
 <1378050319.62710.YahooMailNeo@web121601.mail.ne1.yahoo.com>
 <1378126037.56348.YahooMailNeo@web121603.mail.ne1.yahoo.com>
From: Olivier Cochard-Labbé
Date: Mon, 2 Sep 2013 15:01:21 +0200
Subject: Re: Flow ID, LACP, and igb
To: Barney Cordoba
Cc: "net@freebsd.org"

On Mon, Sep 2, 2013 at 2:47 PM, Barney Cordoba wrote:
>
> In a week I ripped out the offload crap and the 9000 sysctls, eliminated the
> "consumer buffer" problem, reduced locking by 40% and now the igb driver
> uses 20% less cpu with a full gig load.
>

Wow! where is the patch ?
I would like to test it too.

Thanks,

Olivier