From: Barney Cordoba
Date: Thu, 6 Dec 2012 10:02:03 -0800 (PST)
Subject: Re: Latency issues with buf_ring
To: Robert Watson
Cc: freebsd-net@freebsd.org
Message-ID: <1354816923.71234.YahooMailClassic@web121601.mail.ne1.yahoo.com>

--- On Thu, 12/6/12, Robert Watson wrote:

> From: Robert Watson
> Subject: Re: Latency issues with buf_ring
> To: "Andre Oppermann"
> Cc: "Barney Cordoba", "Adrian Chadd", "John Baldwin", freebsd-net@freebsd.org
> Date: Thursday, December 6, 2012, 4:39 AM
> On Tue, 4 Dec 2012, Andre Oppermann wrote:
>
> > For most if not all ethernet drivers from 100Mbit/s the TX DMA rings
> > are so large that buffering at the IFQ level doesn't make sense
> > anymore and only adds latency. So it could simply directly put
> > everything into the TX DMA and not even try to soft-queue. If the
> > TX DMA ring is full, ENOBUFS is returned instead of filling yet
> > another queue. However, there are ALTQ interactions and other
> > mechanisms which have to be considered too, making it a bit more
> > involved.
>
> I asserted for many years that software-side queueing would be
> subsumed by increasingly large DMA descriptor rings for the majority
> of devices and configurations. However, this turns out not to have
> happened in a number of scenarios, and so I've revised my conclusions
> there. I think we will continue to need to support transmit-side
> buffering, ideally in the form of a set of "libraries" that device
> drivers can use to avoid code replication and integrate queue
> management features fairly transparently.
>
> I'm a bit worried by the level of copy-and-paste between 10gbps
> device drivers right now -- for 10/100/1000 drivers, the network
> stack contains the majority of the code, and the responsibility of
> the device driver is to advertise hardware features and manage
> interactions with rings, interrupts, etc. On the 10gbps side, we see
> lots of code replication, especially in queue management, and it
> suggests to me (as discussed for several years in a row at BSDCan and
> elsewhere) that it's time to do a bit of revisiting of ifnet, pull
> more code back into the central stack and out of device drivers, etc.
> That doesn't necessarily mean changing notions of ownership of event
> models, rather, centralising code in libraries rather than all over
> the place. This is something to do with some care, of course.
>
> Robert


More troubling than that is the notion that the same code that's suitable
for 10/100Mb/s should be used in a 10Gb/s environment. 10Gb/s requires a
completely different way of thinking.

BC
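
For readers following the thread, the "no soft queue" idea in Andre's
quoted paragraph can be sketched roughly as below. This is only an
illustration under stated assumptions, not code from any FreeBSD driver:
struct tx_ring, tx_enqueue(), tx_complete() and TX_RING_SIZE are all
hypothetical names, the ring is a plain userland array rather than a real
DMA descriptor ring, and the ALTQ interactions Andre mentions are ignored
entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TX_RING_SIZE 4          /* hypothetical; real rings are far larger */

/* Hypothetical stand-in for a driver's TX descriptor ring. */
struct tx_ring {
	const void *slot[TX_RING_SIZE];
	size_t head;            /* next slot the "hardware" completes */
	size_t tail;            /* next slot software fills */
	size_t used;            /* descriptors currently in flight */
};

/* Place a packet directly on the ring; there is no software queue. */
static int
tx_enqueue(struct tx_ring *r, const void *pkt)
{
	assert(r != NULL && pkt != NULL);
	if (r->used == TX_RING_SIZE)
		return (ENOBUFS);       /* ring full: report it, do not buffer */
	r->slot[r->tail] = pkt;
	r->tail = (r->tail + 1) % TX_RING_SIZE;
	r->used++;
	return (0);
}

/* Simulate the hardware finishing one transmit, freeing a descriptor. */
static const void *
tx_complete(struct tx_ring *r)
{
	const void *pkt;

	assert(r != NULL);
	if (r->used == 0)
		return (NULL);          /* nothing in flight */
	pkt = r->slot[r->head];
	r->head = (r->head + 1) % TX_RING_SIZE;
	r->used--;
	return (pkt);
}
```

With a zero-initialised ring, TX_RING_SIZE calls to tx_enqueue() succeed
and the next one returns ENOBUFS until tx_complete() frees a slot. The
caller, not the driver, then decides whether to drop or retry, which is
the trade-off the thread is arguing about.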