Date: Mon, 2 Sep 2013 05:47:17 -0700 (PDT)
From: Barney Cordoba <barney_cordoba@yahoo.com>
To: Adrian Chadd <adrian@freebsd.org>
Cc: Andre Oppermann <andre@freebsd.org>, Alan Somers <asomers@freebsd.org>, "net@freebsd.org" <net@freebsd.org>, Jack F Vogel <jfv@freebsd.org>, "Justin T. Gibbs" <gibbs@freebsd.org>, Luigi Rizzo <rizzo@iet.unipi.it>, "T.C. Gubatayao" <tgubatayao@barracuda.com>
Subject: Re: Flow ID, LACP, and igb
Message-ID: <1378126037.56348.YahooMailNeo@web121603.mail.ne1.yahoo.com>
In-Reply-To: <CAJ-VmomEKxJ5zz3Gw1b-HizDJ03_Mn=6uZVYR07QFTqwBzNsCg@mail.gmail.com>
References: <D01A0CB2-B1E3-4F4B-97FA-4C821C0E3FD2@FreeBSD.org>
 <521BBD21.4070304@freebsd.org>
 <CAOtMX2jvKGY==t9i-a_8RtMAPH2p1VDj950nMHHouryoz3nbsA@mail.gmail.com>
 <521EE8DA.3060107@freebsd.org>
 <BCC2C62D4FE171479E2F1C2593FE508B0BE24383@BN-SCL-MBX03.Cudanet.local>
 <CAOtMX2h5SGh5eYV50y+QB_s367V9iattGU862wwXcONDV+TG8g@mail.gmail.com>
 <CA+hQ2+hgTaK1ZCOLGVFjSPY8nyNPHK4waSecyRQxR1gQcyjztg@mail.gmail.com>
 <1377952913.44129.YahooMailNeo@web121605.mail.ne1.yahoo.com>
 <BCC2C62D4FE171479E2F1C2593FE508B0BE2440B@BN-SCL-MBX03.Cudanet.local>
 <1378001733.36695.YahooMailNeo@web121606.mail.ne1.yahoo.com>
 <CA+hQ2+j-DDuEX1KCDYioCactjL71p-d4AtusPUfePrswDyUpog@mail.gmail.com>
 <1378050319.62710.YahooMailNeo@web121601.mail.ne1.yahoo.com>
 <CAJ-VmomEKxJ5zz3Gw1b-HizDJ03_Mn=6uZVYR07QFTqwBzNsCg@mail.gmail.com>

Are you using a pcie3 bus? Of course this is only an issue for 10g; what pct of
FreeBSD users have a load over 9.5Gb/s? It's completely unnecessary for igb
or em driver, so why is it used? because it's there.

Here's my argument against it. The handful of brains capable of doing driver development
become consumed with BS like LRO and the things that need to be fixed, like
buffer management and basic driver design flaws, never get fixed. The offload
code makes the driver code a virtual mess that can only be maintained by Jack and
1 other guy in the entire world. And it takes 10 times longer to make a simple change or
to add support for a new NIC.

In a week I ripped out the offload crap and the 9000 sysctls, eliminated the
"consumer buffer" problem, reduced locking by 40% and now the igb driver
uses 20% less cpu with a full gig load.

And the code is cleaner and more easily maintained.

BC


________________________________
From: Adrian Chadd <adrian@freebsd.org>
To: Barney Cordoba <barney_cordoba@yahoo.com>
Cc: Andre Oppermann <andre@freebsd.org>; Alan Somers <asomers@freebsd.org>; "net@freebsd.org" <net@freebsd.org>; Jack F Vogel <jfv@freebsd.org>; Justin T. Gibbs <gibbs@freebsd.org>; Luigi Rizzo <rizzo@iet.unipi.it>; T.C. Gubatayao <tgubatayao@barracuda.com>
Sent: Sunday, September 1, 2013 4:51 PM
Subject: Re: Flow ID, LACP, and igb


Yo,

LRO is an interesting hack that seems to do a good trick of hiding the
ridiculous locking and unfriendly cache behaviour that we do per-packet.

It helps with LAN test traffic where things are going out in batches from
the TCP layer so the RX layer "sees" these frames in-order and can do LRO.
When you disable it, I don't easily get 10GE LAN TCP performance. That has
to be fixed. Given how fast the CPU cores, bus interconnect and memory
interconnects are, I don't think there should be any reason why we can't
hit 10GE traffic on a LAN with LRO disabled (in both software and hardware.)

Now that I have the PMC sandy bridge stuff working right (but no PEBS, I
have to talk to Intel about that in a bit more detail before I think about
hacking that in) we can get actual live information about this stuff. But
the last time I looked, there's just too much per-packet latency going on.
The root cause looks like it's a toss up between scheduling, locking and
just lots of code running to completion per-frame. As I said, that all has
to die somehow.

2c,

-adrian


On 1 September 2013 08:45, Barney Cordoba <barney_cordoba@yahoo.com> wrote:

>
> Comcast sends packets OOO. With any decent number of internet hops you're
> likely to encounter a load
> balancer or packet shaper that sends packets OOO, so you just can't be
> worried about it. In fact, your
> designs MUST work with OOO packets.
>
> Getting balance on your load balanced lines is certainly a bigger upside
> than the additional CPU used.
> You can buy a faster processor for your "stack" for a lot less than you
> can buy bandwidth.
>
> Frankly my opinion of LRO is that it's a science project suitable for labs
> only. It's a trick to get more bandwidth
> than your bus capacity; the answer is to not run PCIe2 if you need pcie3.
> You can use it internally if you have
> control of all of the machines. When I modify a driver the first thing
> that I do is rip it out.
>
> BC
>
>
> ________________________________
> From: Luigi Rizzo <rizzo@iet.unipi.it>
> To: Barney Cordoba <barney_cordoba@yahoo.com>
> Cc: Andre Oppermann <andre@freebsd.org>; Alan Somers <asomers@freebsd.org>;
> "net@freebsd.org" <net@freebsd.org>; Jack F Vogel <jfv@freebsd.org>;
> Justin T. Gibbs <gibbs@freebsd.org>; T.C. Gubatayao <tgubatayao@barracuda.com>
> Sent: Saturday, August 31, 2013 10:27 PM
> Subject: Re: Flow ID, LACP, and igb
>
>
> On Sun, Sep 1, 2013 at 4:15 AM, Barney Cordoba <barney_cordoba@yahoo.com> wrote:
>
> > ...
> >
>
> [your point on testing with realistic assumptions is surely a valid one]
>
>
> >
> > Of course there's nothing really wrong with OOO packets. We had this
> > discussion before; lots of people
> > have round robin dual homing without any ill effects. It's just not an
> > issue.
> >
>
> It depends on where you are.
> It may not be an issue if the reordering is not large enough to
> trigger retransmissions, but even then it is annoying as it causes
> more work in the endpoint -- it prevents LRO from working, and even
> on the host stack it takes more work to sort where an out of order
> segment goes than appending an in-order one to the socket buffer.
>
> cheers
> luigi
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"


Date: Mon, 2 Sep 2013 15:01:21 +0200
From: Olivier Cochard-Labbé <olivier@cochard.me>
To: Barney Cordoba <barney_cordoba@yahoo.com>
Cc: "net@freebsd.org" <net@freebsd.org>
Subject: Re: Flow ID, LACP, and igb
Message-ID: <CA+q+TcoxWLqQCh=MjB9UDkbCia0+dTkCQKnNY8K6c7HH_eqkpw@mail.gmail.com>
In-Reply-To: <1378126037.56348.YahooMailNeo@web121603.mail.ne1.yahoo.com>

On Mon, Sep 2, 2013 at 2:47 PM, Barney Cordoba <barney_cordoba@yahoo.com> wrote:

>
> In a week I ripped out the offload crap and the 9000 sysctls, eliminated the
> "consumer buffer" problem, reduced locking by 40% and now the igb driver
> uses 20% less cpu with a full gig load.
>

Wow! where is the patch ?
I would like to test it too.

Thanks,

Olivier
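
An illustrative aside on the reordering point raised earlier in the thread (that out-of-order arrival "prevents LRO from working"): the following is a minimal Python sketch of the general idea, not FreeBSD driver or stack code. The function name lro_coalesce, the max_bytes limit, and the 1448-byte segment size are all assumptions chosen for illustration. It merges a run of in-order segments from one flow into a single larger unit and gives up on merging as soon as a segment does not start where the previous one ended.

```python
def lro_coalesce(segments, max_bytes=65535):
    """Toy model of receive coalescing for a single TCP flow.

    segments: iterable of (seq, length) tuples in arrival order.
    Returns the list of (seq, length) units handed up the stack.
    Hypothetical helper for illustration only.
    """
    delivered = []
    cur_seq = None   # starting sequence number of the unit being built
    cur_len = 0      # bytes accumulated in that unit

    for seq, length in segments:
        # A segment can be merged only if it starts exactly where the
        # current unit ends and the merged unit stays within the limit.
        in_order = cur_seq is not None and seq == cur_seq + cur_len
        if in_order and cur_len + length <= max_bytes:
            cur_len += length
        else:
            # Out-of-order (or oversized): flush what we have, start anew.
            if cur_seq is not None:
                delivered.append((cur_seq, cur_len))
            cur_seq, cur_len = seq, length

    if cur_seq is not None:
        delivered.append((cur_seq, cur_len))
    return delivered


if __name__ == "__main__":
    in_order = [(i * 1448, 1448) for i in range(8)]
    # Swap each adjacent pair to mimic mild reordering on the path.
    swapped = [in_order[j] for j in (1, 0, 3, 2, 5, 4, 7, 6)]
    print(len(lro_coalesce(in_order)), "unit(s) for in-order arrival")
    print(len(lro_coalesce(swapped)), "unit(s) for pairwise-swapped arrival")
```

In this toy model, eight in-order segments collapse into one unit, while the same eight segments with adjacent pairs swapped are passed up individually, so the per-packet cost discussed in the thread returns in full. A real implementation tracks many flows and more header state; the sketch only shows why arrival order matters.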