Date:      Thu, 6 Dec 2012 10:02:03 -0800 (PST)
From:      Barney Cordoba <barney_cordoba@yahoo.com>
To:        Robert Watson <rwatson@FreeBSD.org>
Cc:        freebsd-net@freebsd.org
Subject:   Re: Latency issues with buf_ring
Message-ID:  <1354816923.71234.YahooMailClassic@web121601.mail.ne1.yahoo.com>
In-Reply-To: <alpine.BSF.2.00.1212060936010.78351@fledge.watson.org>

--- On Thu, 12/6/12, Robert Watson <rwatson@FreeBSD.org> wrote:

> From: Robert Watson <rwatson@FreeBSD.org>
> Subject: Re: Latency issues with buf_ring
> To: "Andre Oppermann" <oppermann@networx.ch>
> Cc: "Barney Cordoba" <barney_cordoba@yahoo.com>, "Adrian Chadd"
>     <adrian@freebsd.org>, "John Baldwin" <jhb@freebsd.org>,
>     freebsd-net@freebsd.org
> Date: Thursday, December 6, 2012, 4:39 AM
>
> On Tue, 4 Dec 2012, Andre Oppermann wrote:
>
> > For most if not all ethernet drivers from 100Mbit/s the TX DMA
> > rings are so large that buffering at the IFQ level doesn't make
> > sense anymore and only adds latency.  So it could simply directly
> > put everything into the TX DMA and not even try to soft-queue.  If
> > the TX DMA ring is full ENOBUFS is returned instead of filling yet
> > another queue.  However there are ALTQ interactions and other
> > mechanisms which have to be considered too making it a bit more
> > involved.
>
> I asserted for many years that software-side queueing would be
> subsumed by increasingly large DMA descriptor rings for the majority
> of devices and configurations.  However, this turns out not to have
> happened in a number of scenarios, and so I've revised my
> conclusions there.  I think we will continue to need to support
> transmit-side buffering, ideally in the form of a set of "libraries"
> that device drivers can use to avoid code replication and integrate
> queue management features fairly transparently.
>
> I'm a bit worried by the level of copy-and-paste between 10gbps
> device drivers right now -- for 10/100/1000 drivers, the network
> stack contains the majority of the code, and the responsibility of
> the device driver is to advertise hardware features and manage
> interactions with rings, interrupts, etc.  On the 10gbps side, we
> see lots of code replication, especially in queue management, and it
> suggests to me (as discussed for several years in a row at BSDCan
> and elsewhere) that it's time to do a bit of revisiting of ifnet,
> pull more code back into the central stack and out of device
> drivers, etc.  That doesn't necessarily mean changing notions of
> ownership of event models, rather, centralising code in libraries
> rather than all over the place.  This is something to do with some
> care, of course.
>
> Robert

More troubling than that is the notion that the same code that's
suitable for 10/100/1000Mb/s should be used in a 10Gb/s environment.
10Gb/s requires a completely different way of thinking.

BC


