From owner-freebsd-net@FreeBSD.ORG Fri Jun 19 14:56:02 2009
Message-ID: <436489.18496.qm@web63904.mail.re1.yahoo.com>
Date: Fri, 19 Jun 2009 07:56:00 -0700 (PDT)
From: Barney Cordoba
To: freebsd-net@FreeBSD.org, Michael
Subject: Re: kern/135222: [igb] low speed routing between two igb interfaces
List-Id: Networking and TCP/IP with FreeBSD

--- On Wed, 6/17/09, Michael wrote:

> From: Michael
> Subject: Re: kern/135222: [igb] low speed routing between two igb interfaces
> To: freebsd-net@FreeBSD.org
> Date: Wednesday, June 17, 2009, 9:40 PM
> The following reply was made to PR kern/135222; it has been noted by GNATS.
>
> From: Michael
> To: Barney Cordoba
> Cc: freebsd-gnats-submit@FreeBSD.org
> Subject: Re: kern/135222: [igb] low speed routing between
>  two igb interfaces
> Date: Thu, 18 Jun 2009 03:32:15 +0200
>
> Barney Cordoba wrote:
> >
> >
> > --- On Wed, 6/17/09, Michael wrote:
> >
> >> From: Michael
> >> Subject: Re: kern/135222: [igb] low speed routing between two igb interfaces
> >> To: "Barney Cordoba"
> >> Cc: freebsd-net@FreeBSD.org
> >> Date: Wednesday, June 17, 2009, 5:28 PM
> >> Barney Cordoba wrote:
> >>>
> >>> --- On Fri, 6/12/09, Michael wrote:
> >>>> From: Michael
> >>>> Subject: Re: kern/135222: [igb] low speed routing between two igb interfaces
> >>>> To: freebsd-net@FreeBSD.org
> >>>> Date: Friday, June 12, 2009, 5:50 AM
> >>>> The following reply was made to PR kern/135222; it has been noted by GNATS.
> >>>>
> >>>> From: Michael <freebsdusb@bindone.de>
> >>>> To: Cc: freebsd-gnats-submit@FreeBSD.org
> >>>> Subject: Re: kern/135222: [igb] low speed routing between
> >>>>  two igb interfaces
> >>>> Date: Fri, 12 Jun 2009 11:45:47 +0200
> >>>>
> >>>>   The original poster reported that the suggested fix works for him:
> >>>>   ---
> >>>>   Hello Michael,
> >>>>
> >>>>   Thank you. It's working.
> >>>>
> >>>>   I consider it necessary to put this into the release errata.
> >>>>
> >>>>
> >>>>   Mishustin Andrew wrote:
> >>>>   >> Number:         135222
> >>>>   >> Category:       kern
> >>>>   >> Synopsis:       [igb] low speed routing between two igb interfaces
> >>>>   >> Confidential:   no
> >>>>   >> Severity:       serious
> >>>>   >> Priority:       medium
> >>>>   >> Responsible:    freebsd-bugs
> >>>>   >> State:          open
> >>>>   >> Quarter:
> >>>>   >> Keywords:
> >>>>   >> Date-Required:
> >>>>   >> Class:          sw-bug
> >>>>   >> Submitter-Id:   current-users
> >>>>   >> Arrival-Date:   Wed Jun 03 18:30:01 UTC 2009
> >>>>   >> Closed-Date:
> >>>>   >> Last-Modified:
> >>>>   >> Originator:     Mishustin Andrew
> >>>>   >> Release:        FreeBSD 7.1-RELEASE amd64, FreeBSD 7.2-RELEASE amd64
> >>>>   >> Organization:
> >>>>   > HNT
> >>>>   >> Environment:
> >>>>   > FreeBSD test.hnt 7.2-RELEASE FreeBSD 7.2-RELEASE #12: Thu Apr 30 18:28:15 MSD 2009
> >>>>   >     admin@test.hnt:/usr/src/sys/amd64/compile/GENERIC amd64
> >>>>   >> Description:
> >>>>   > I made a FreeBSD multiprocesor server to act as simple gateway.
> >>>>   > It use onboard Intel 82575EB Dual-Port Gigabit Ethernet Controller.
> >>>>   > I observe traffic speed near 400 Kbit/s.
> >>>>   > I test both interfaces separately -
> >>>>   > ftp client work at speed near 1 Gbit/s in both directions.
> >>>>   > Then I change NIC to old Intel "em" NIC - gateway work at speed near 1 Gbit/s.
> >>>>   >
> >>>>   > Looks like a bug in igb driver have an effect upon forwarded traffic.
> >>>>   >
> >>>>   > If you try
> >>>>   > hw.igb.enable_aim=0
> >>>>   > The speed is near 1 Mbit/s
> >>>>   >
> >>>>   > hw.igb.rxd, hw.igb.txd, "ifconfig -tso" has no effect.
> >>>>   >
> >>>>   > Nothing in messages.log
> >>>>   >
> >>>>   > netstat -m
> >>>>   > 516/1674/2190 mbufs in use (current/cache/total)
> >>>>   > 515/927/1442/66560 mbuf clusters in use (current/cache/total/max)
> >>>>   > 515/893 mbuf+clusters out of packet secondary zone in use (current/cache)
> >>>>   > 0/44/44/33280 4k (page size) jumbo clusters in use (current/cache/total/max)
> >>>>   > 0/0/0/16640 9k jumbo clusters in use (current/cache/total/max)
> >>>>   > 0/0/0/8320 16k jumbo clusters in use (current/cache/total/max)
> >>>>   > 1159K/2448K/3607K bytes allocated to network (current/cache/total)
> >>>>   > 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> >>>>   > 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> >>>>   > 0/0/0 sfbufs in use (current/peak/max)
> >>>>   > 0 requests for sfbufs denied
> >>>>   > 0 requests for sfbufs delayed
> >>>>   > 0 requests for I/O initiated by sendfile
> >>>>   > 0 calls to protocol drain routines
> >>>>   >
> >>>>   > I use only IPv4 traffic.
> >>>>   >
> >>>>   >> How-To-Repeat:
> >>>>   > On machine with two igb interfaces
> >>>>   > use rc.conf like this:
> >>>>   >
> >>>>   > hostname="test.test"
> >>>>   > gateway_enable="YES"
> >>>>   > ifconfig_igb0="inet 10.10.10.1/24"
> >>>>   > ifconfig_igb1="inet 10.10.11.1/24"
> >>>>   >
> >>>>   > And try create heavy traffic between two networks.
> >>>>   >> Fix:
> >>>>   >
> >>>>   >
> >>>>   >> Release-Note:
> >>>>   >> Audit-Trail:
> >>>>   >> Unformatted:
> >>>>   > _______________________________________________
> >>>>   > freebsd-bugs@freebsd.org
> >>>
> >>> This is not a bug. Unless you consider poorly written drivers to be
> >>> bugs. You need to provide your tuning parameters for the card as well
> >>> otherwise there's nothing to learn.
> >>> The issue is that the driver doesn't address the purpose of the
> >>> controller; which is to utilize multiprocessor systems more
> >>> effectively. The effect is that lock contention actually makes things
> >>> worse than if you just use a single task as em does.
> >>> Until the multiqueue drivers are re-written to manage locks properly
> >>> you are best advised to save your money and stick with em.
> >>> You should get similar performance using 1 queue as with em. You could
> >>> also force legacy configuration by forcing igb_setup_msix to return 0.
> >>> Sadly, this is the best performance you will get from the stock driver.
> >>> Barney
> >>>
> >>> Barney
> >>>
> >>
> >> I tried using 1 queue and it didn't make things any better (actually
> >> I'm not sure if that worked at all). Whether it is considered a bug or
> >> not doesn't really matter; what actually matters for users (who cannot
> >> always choose which network controller will be on-board) is that they
> >> get at least decent performance when doing IP forwarding (and not the
> >> 5-50kb/s I've seen). You can get this out of the controller when
> >> disabling lro through the sysctl. That's why I've been asking to put
> >> this into the release errata section and/or at least the igb man page,
> >> because the sysctl isn't documented anywhere. Also the fact that tuning
> >> the sysctl only affects the behaviour when it's set on boot might be
> >> considered problematic.
> >>
> >> So at the very least, I think the following should be done:
> >> 1. Document the sysctl in man igb(4)
> >> 2. Put a known issues paragraph to man igb(4) which explains the issue
> >>    and what to put in sysctl.conf to stop this from happening
> >> 3. Add an entry to the release errata page about this issue (like I
> >>    suggested in one of my earlier emails) and stating something like
> >>    "see man igb(4) for details"
> >>
> >> This is not about using the controller to its full potential, but to
> >> save Joe Admin from spending days on figuring out why the machine is
> >> forwarding packets slower than his BSD 2.x machine did in the 90s.
> >>
> >> cheers
> >> Michael
> >
> > None of the offload crap should be enabled by default.
> >
> > The real point is that "Joe Admin" shouldn't be using controllers that
> > have bad drivers at all. If you have to use whatever hardware you have
> > lying around, and don't have enough flexibility to lay out $100 for a
> > 2 port controller that works to use with your $2000 server, then you
> > need to get your priorities in order. People go out and buy redundant
> > power supplies, high GHz quad core processors and gobs of memory and
> > then they use whatever crappy onboard controller they get no matter how
> > poorly it's supported. It's mindless.
> >
> > Barney
> >
>
> How should anybody know that the controller is poorly supported if there
> is nothing in the documentation, release notes, man pages or anywhere
> else about this?
>
> The fact of the matter is that "the offload crap" _is_ enabled by
> default. The release is out, it claims to support the controller. There
> _is_ a workaround and I'm asked if somebody could document this so users
> will have a chance. I'm also not convinced that it is a crappy
> controller per se, but just poorly supported.
> We used those a lot before without any issues; unfortunately now we had
> to use IP forwarding in a machine that has that controller (it has 6
> interfaces in total, four em ports and two igb ports, all of them are in
> use and I don't feel like hooking up the soldering iron).
>
> So bottom line:
> I said, there is a problem with the driver, there is a workaround and it
> should be documented.
>
> You say, the driver is bad and nobody should use it and if they do it's
> their own damn fault. We won't do anything about it and refuse to tell
> anybody, because we are the only ones who should know. We don't care if
> people can actually use our software and still claim the hardware is
> actually supported.
>
> Your attitude is really counterproductive (actually googling around I
> see you made similar statements in the past about stupid people not
> willing to spend xxx$ on whatever piece of hardware, so maybe you're
> just trolling).
>
> Michael

Tuning the card to be brain-dead isn't really a workaround. I'm sorry that you're not able to understand, but you can't educate the woodchucks, so carry on and feel free to do whatever you wish.

BC