From: Barney Cordoba
Date: Wed, 5 Dec 2012 04:58:17 -0800 (PST)
Subject: Re: Latency issues with buf_ring
To: Andre Oppermann, Adrian Chadd
Cc: freebsd-net@freebsd.org, John Baldwin
Message-ID: <1354712297.65896.YahooMailClassic@web121606.mail.ne1.yahoo.com>
List-Id: Networking and TCP/IP with FreeBSD


--- On Tue, 12/4/12, Adrian Chadd wrote:

> From: Adrian Chadd
> Subject: Re: Latency issues with buf_ring
> To: "Andre Oppermann"
> Cc: "Barney Cordoba", "John Baldwin", freebsd-net@freebsd.org
> Date: Tuesday, December 4, 2012, 4:31 PM
> On 4 December 2012 12:02, Andre
> Oppermann wrote:
>
> > Our IF_* stack/driver boundary handoff isn't up to the task anymore.
>
> Right. well, the current hand off is really "here's a packet, go do
> stuff!" and the legacy if_start() method is just plain broken for SMP,
> preemption and direct dispatch.
>
> Things are also very special in the net80211 world, with the stack
> layer having to get its grubby fingers into things.
>
> I'm sure that the other examples of layered protocols (eg doing MPLS,
> or even just straight PPPoE style tunneling) has the same issues.
> Anything with sequence numbers and encryption being done by some other
> layer is going to have the same issue, unless it's all enforced via
> some other queue and a single thread handling the network stack
> "stuff".
>
> I bet direct-dispatch netgraph will have similar issues too, if it
> ever comes into existence. :-)
>
> > Also the interactions are either poorly defined or understood in many
> > places. I've had a few chats with yongari@ and am experimenting with
> > a modernized interface in my branch.
> >
> > The reason I stumbled across it was because I'm extending the hardware
> > offload feature set and found out that the stack and the drivers (and
> > the drivers among themself) are not really in sync with regards to
> > behavior.
> >
> > For most if not all ethernet drivers from 100Mbit/s the TX DMA rings
> > are so large that buffering at the IFQ level doesn't make sense anymore
> > and only adds latency. So it could simply directly put everything into
> > the TX DMA and not even try to soft-queue. If the TX DMA ring is full
> > ENOBUFS is returned instead of filling yet another queue. However there
> > are ALTQ interactions and other mechanisms which have to be considered
> > too making it a bit more
> > involved.
>
> net80211 has slightly different problems. We have requirements for
> per-node, per-TID/per-AC state (not just for QOS, but separate
> sequence numbers, different state machine handling for things like
> aggregation and (later) U-APSD handling, etc) so we do need to direct
> frames into different queues and then correctly serialise that mess.
>
> > I'm coming up with a draft and some benchmark results for an updated
> > stack/driver boundary in the next weeks before xmas.
>
> Ok. Please don't rush into it though; I'd like time to think about it
> after NY (as I may actually _have_ a holiday this xmas!) and I'd like
> to try and rope in people from non-ethernet-packet-pushing backgrounds
> to comment.
> They may have much stricter and/or stranger requirements when it comes
> to how the network layer passes, serialises and pushes packets to
> other layers.
>
> Thanks,
>
>
> Adrian

Something I'd like to see is a general modularization of function,
which will make all of the other stuff much easier. A big issue with
multipurpose OSes is that they tend to be bloated with stuff that almost
nobody uses. 99.9% of people are running either bridge/filters or straight
TCP/IP, and the design goals for a single-NIC web server are different
from those for a router or firewall.

By modularization, I mean making the "pieces" threadable. The requirements
for threading vary by application, but the ability to control it can
make a world of difference in performance. Having a dedicated transmit
thread may make no sense on a web server, on a dual core system or
with a single queue adapter, but other times it might.
Instead of having one big honking routine that does everything,
modularizing it not only cleans up the code, but also makes the system
more flexible without making it a mess.

The design for the 99% should not be hindered by the need to support
stuff like ALTQ. The hooks for ALTQ should be possible, but the locking
and queuing only required for such outliers should be separable.

I'd also like to see a unification of all of the projects. Is it really
necessary to have 34 checks for different "ideas" in if_ethersubr.c?

As a developer I know that you always want to work on the next new thing,
but sometimes you need to stop, think, and clean up your code. The cleaner
code opens up new possibilities, and results in a better overall product.

BC
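
P.S. To make the idea quoted above a bit more concrete (put packets
straight into the TX DMA ring and return ENOBUFS when it is full, rather
than filling yet another soft queue), here is a minimal sketch in C. It
is only an illustration under stated assumptions: struct tx_ring,
tx_ring_enqueue(), tx_ring_complete() and TX_RING_SIZE are hypothetical
names, not taken from any FreeBSD driver or from Andre's branch, and the
sketch ignores locking, ALTQ and real descriptor handling.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical ring size; real TX DMA rings are much larger. */
#define TX_RING_SIZE 4

/* Hypothetical stand-in for a driver's TX descriptor ring. */
struct tx_ring {
	void	*slots[TX_RING_SIZE];
	size_t	 head;	/* next slot to fill */
	size_t	 tail;	/* next slot the "hardware" will complete */
	size_t	 count;	/* slots currently in use */
};

/*
 * Place a packet directly on the ring. There is no soft queue behind
 * it: when the ring is full the caller gets ENOBUFS and decides what
 * to do (drop, back off, retry).
 */
static int
tx_ring_enqueue(struct tx_ring *r, void *pkt)
{
	assert(r != NULL);
	if (r->count == TX_RING_SIZE)
		return (ENOBUFS);
	r->slots[r->head] = pkt;
	r->head = (r->head + 1) % TX_RING_SIZE;
	r->count++;
	return (0);
}

/*
 * Simulate transmit completion: free the oldest slot and hand the
 * packet back. Returns NULL when the ring is empty.
 */
static void *
tx_ring_complete(struct tx_ring *r)
{
	void *pkt;

	assert(r != NULL);
	if (r->count == 0)
		return (NULL);
	pkt = r->slots[r->tail];
	r->tail = (r->tail + 1) % TX_RING_SIZE;
	r->count--;
	return (pkt);
}
```

The point of keeping enqueue and completion as two small routines is
that either one could be driven from its own thread or called inline,
depending on the box. A zero-initialized struct tx_ring is an empty
ring; once TX_RING_SIZE packets are outstanding, the next
tx_ring_enqueue() returns ENOBUFS until tx_ring_complete() frees a slot.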