From: Adrian Chadd
To: Andre Oppermann
Cc: "freebsd-net@freebsd.org", Luigi Rizzo, Navdeep Parhar, Randall Stewart
Date: Wed, 30 Oct 2013 10:48:31 -0700
Subject: Re: MQ Patch.

On 30 October 2013 04:44, Andre Oppermann wrote:
>> We can't assume the hardware has deep queues _and_ we can't just hand
>> packets to the DMA engine.
>>
>> Why?
>>
>> Because once you've pushed it into the transmit ring, you can't
>> guarantee / impose any ordering on things. You can't guarantee that
>> you can abort a frame that has been queued because it now breaks the
>> queue rules.
>>
>> That's why we don't want to just have a light wrapper around hardware
>> transmit queues. We give up way too much useful control.
>
> The stack can't possibly know about all these differences in current
> and future technologies and requirements. That's why this decision
> should be pushed into the L3/L2 mapping/encapsulation and driver layer.

That's why you split it.

You allow the upper layers (things like altq) to track things like
per-IP, per-traffic-class traffic and tag things appropriately. You then
let some software queue implement the queue discipline and only drain
frames to the hardware at a rate that's fast enough to keep up with the
hardware, and no faster.

Why? Because if you have new traffic come along from a new client, it
may be higher priority than the traffic queued to the hardware. But it
may be at the same QoS level as what's currently queued to the hardware,
or map to the same physical queue, so once those frames are in the ring
the new traffic can't get ahead of them.

So yes, we do need that split for a lot of cases.

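To make the split concrete, here is a minimal sketch in C, not FreeBSD code. Everything in it is hypothetical and chosen for illustration: the names (struct swq, struct hw_ring, swq_enqueue, swq_drain, hw_complete), the two priority levels, and the queue depths. It assumes a strict-priority discipline in the software queue and a deliberately shallow transmit ring, and it only hands frames to the ring while the ring has free slots.

```c
#include <assert.h>
#include <stddef.h>

#define NPRIO    2   /* assumption: 0 = high priority, 1 = low priority */
#define SWQ_LEN  8   /* assumption: software queue depth per priority */
#define HW_SLOTS 2   /* assumption: a deliberately shallow hardware TX ring */

struct frame { int id; int prio; };

/* One FIFO per priority level; the queue discipline lives here. */
struct swq {
    struct frame q[NPRIO][SWQ_LEN];
    size_t head[NPRIO], count[NPRIO];
};

/* Stand-in for a DMA transmit ring: once a frame is in here it can no
 * longer be reordered or aborted. */
struct hw_ring {
    struct frame slot[HW_SLOTS];
    size_t used;
};

/* Returns 0 on success, -1 if that priority's queue is full. */
static int swq_enqueue(struct swq *s, struct frame f)
{
    assert(f.prio >= 0 && f.prio < NPRIO);
    if (s->count[f.prio] == SWQ_LEN)
        return -1;
    size_t tail = (s->head[f.prio] + s->count[f.prio]) % SWQ_LEN;
    s->q[f.prio][tail] = f;
    s->count[f.prio]++;
    return 0;
}

/* Strict priority: take from the lowest-numbered non-empty queue.
 * Returns 0 and fills *out on success, -1 if every queue is empty. */
static int swq_dequeue(struct swq *s, struct frame *out)
{
    for (int p = 0; p < NPRIO; p++) {
        if (s->count[p] == 0)
            continue;
        *out = s->q[p][s->head[p]];
        s->head[p] = (s->head[p] + 1) % SWQ_LEN;
        s->count[p]--;
        return 0;
    }
    return -1;
}

/* Hand frames to the ring only while it has free slots, so everything
 * else stays in software where it can still be reordered.
 * Returns the number of frames moved. */
static size_t swq_drain(struct swq *s, struct hw_ring *r)
{
    size_t moved = 0;
    struct frame f;
    while (r->used < HW_SLOTS && swq_dequeue(s, &f) == 0) {
        r->slot[r->used++] = f;
        moved++;
    }
    return moved;
}

/* Pretend the hardware finished transmitting everything in the ring. */
static void hw_complete(struct hw_ring *r)
{
    r->used = 0;
}
```

With four low-priority frames queued, the first swq_drain() moves only two of them into the ring. If a high-priority frame then arrives, it goes out ahead of the two low-priority frames still held in software as soon as the ring has room again. It cannot get ahead of the two frames already in the ring, which is the argument for keeping the ring shallow.
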
There will be bare-metal cases that need very low latency, but if we
implement the correct queue API here it'll just collapse down to either
NULL, or just the existing software queue in front of the DMA rings to
avoid locking overhead.

Thanks,

-adrian