From owner-freebsd-net@FreeBSD.ORG Tue Dec 4 21:31:26 2012
From: Adrian Chadd <adrian.chadd@gmail.com>
Date: Tue, 4 Dec 2012 13:31:23 -0800 (PST)
Subject: Re: Latency issues with buf_ring
To: Andre Oppermann
Cc: Barney Cordoba, John Baldwin, freebsd-net@freebsd.org
In-Reply-To: <50BE56C8.1030804@networx.ch>
References: <1353259441.19423.YahooMailClassic@web121605.mail.ne1.yahoo.com> <201212041108.17645.jhb@freebsd.org> <50BE56C8.1030804@networx.ch>
On 4 December 2012 12:02, Andre Oppermann wrote:

> Our IF_* stack/driver boundary handoff isn't up to the task anymore.

Right. Well, the current hand-off is really "here's a packet, go do
stuff!", and the legacy if_start() method is just plain broken for SMP,
preemption and direct dispatch.

Things are also very special in the net80211 world, with the stack layer
having to get its grubby fingers into things. I'm sure the other examples
of layered protocols (eg doing MPLS, or even just straight PPPoE-style
tunneling) have the same issues. Anything with sequence numbers and
encryption being done by some other layer is going to have the same
issue, unless it's all enforced via some other queue and a single thread
handling the network stack "stuff".

I bet direct-dispatch netgraph will have similar issues too, if it ever
comes into existence. :-)

> Also the interactions are either poorly defined or understood in many
> places. I've had a few chats with yongari@ and am experimenting with
> a modernized interface in my branch.
>
> The reason I stumbled across it was because I'm extending the hardware
> offload feature set and found out that the stack and the drivers (and
> the drivers among themselves) are not really in sync with regards to
> behavior.
>
> For most if not all ethernet drivers from 100Mbit/s up, the TX DMA
> rings are so large that buffering at the IFQ level doesn't make sense
> anymore and only adds latency. So it could simply put everything
> directly into the TX DMA ring and not even try to soft-queue. If the
> TX DMA ring is full, ENOBUFS is returned instead of filling yet
> another queue. However, there are ALTQ interactions and other
> mechanisms which have to be considered too, making it a bit more
> involved.

net80211 has slightly different problems.
We have requirements for per-node, per-TID/per-AC state (not just for
QoS, but separate sequence numbers, different state machine handling for
things like aggregation and (later) U-APSD handling, etc), so we do need
to direct frames into different queues and then correctly serialise that
mess.

> I'm coming up with a draft and some benchmark results for an updated
> stack/driver boundary in the next weeks before xmas.

Ok. Please don't rush into it though; I'd like time to think about it
after NY (as I may actually _have_ a holiday this xmas!) and I'd like to
try and rope in people from non-ethernet-packet-pushing backgrounds to
comment. They may have much stricter and/or stranger requirements when
it comes to how the network layer passes, serialises and pushes packets
to other layers.

Thanks,

Adrian