In-Reply-To: <53E91578.3060209@FreeBSD.org>
Date: Mon, 11 Aug 2014 17:42:27 -0700
Subject: Re: Support for zero copy sockets
From: Adrian Chadd
To: Navdeep Parhar
Cc: Alan Cox, Victor Balada Diaz, Sushanth Rai, "freebsd-hackers@freebsd.org"

On 11 August 2014 12:11, Navdeep Parhar wrote:
> There is zero copy receive (aka Direct Data Placement -- DDP) in the TOE
> driver that accompanies cxgbe(4). I have a tx zero copy implementation
> for it as well (this is not in -current right now). But all this code
> is chip specific and applies only to TCP connections that are handled
> by the TOE driver. It doesn't rely on COW or page flipping.
>
> The reason I'm mentioning all of this here is that if anyone is thinking
> of working on proper zero copy awareness (and APIs) at the socket layer
> then count me in as an interested party.

I'm not going to get into it just now, as I have enough on my FreeBSD
plate already.

However, the thing that has always irked me about the hardware-based
solutions is that they're great for a subset of problems - typically
small sets of sockets. The really interesting problem for me is how to
make it work for, say, 500,000 or more concurrent TCP sessions.

I can see a method of doing zero-copy writes to the network stack - look
at what the AIO code does in the physical IO path for doing writes. It
wires down the memory and stuffs it into the buffer. The thing I haven't
yet sorted out is what to do about mappings in case kernel code wants to
peek at the socket data payload for whatever reason.

(And yes, reads are still a problem.)

-a
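
As a rough illustration of the "wire down the memory, then hand it to the
stack" idea mentioned above, here is a minimal user-space sketch. It is an
analogue only, and everything in it is an assumption made for illustration:
the helper name wired_send_demo() is hypothetical, mlock(2) stands in for
whatever wiring the kernel AIO path does internally, and a socketpair(2)
stands in for a TCP socket. It is not zero copy, since write(2) still copies
the data into the socket buffer; it only shows the pin-before-handoff
ordering.

```c
#define _DEFAULT_SOURCE		/* glibc: expose mlock/posix_memalign under strict -std= */
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the FreeBSD tree.
 *
 * Pins a page-aligned user buffer with mlock(2), writes it into one end of
 * a socketpair, and reads it back from the other end.  Returns the number
 * of bytes that arrived intact, or -1 on error.  *wired is set to 1 if
 * mlock() succeeded and 0 if it was refused (for example by
 * RLIMIT_MEMLOCK); the transfer is attempted either way.
 *
 * Keep len small (a page or so) so the single write() fits in the socket
 * buffer and cannot block.
 */
static ssize_t
wired_send_demo(size_t len, int *wired)
{
	int sv[2] = { -1, -1 };
	char *buf = NULL, *rcv = NULL;
	size_t got = 0;
	ssize_t n, result = -1;
	long pagesz = sysconf(_SC_PAGESIZE);

	if (pagesz <= 0 ||
	    posix_memalign((void **)&buf, (size_t)pagesz, len) != 0)
		return (-1);
	rcv = malloc(len);
	if (rcv == NULL)
		goto out;
	memset(buf, 0xA5, len);

	/* "Wire" the pages so they stay resident while the I/O is in flight. */
	*wired = (mlock(buf, len) == 0);

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
		goto out;

	/* Hand the pinned buffer to the socket layer (this still copies). */
	n = write(sv[0], buf, len);
	if (n < 0 || (size_t)n != len)
		goto out;

	/* Drain the other end and check that the payload survived. */
	while (got < len) {
		n = read(sv[1], rcv + got, len - got);
		if (n <= 0)
			goto out;
		got += (size_t)n;
	}
	if (memcmp(buf, rcv, len) == 0)
		result = (ssize_t)got;
out:
	if (sv[0] >= 0)
		close(sv[0]);
	if (sv[1] >= 0)
		close(sv[1]);
	if (buf != NULL && *wired)
		munlock(buf, len);
	free(rcv);
	free(buf);
	return (result);
}
```

Calling wired_send_demo(4096, &wired) with an int wired in scope moves one
page through the socketpair. The open question raised in the message, what
to do about mappings when kernel code wants to peek at the payload, is not
addressed by this sketch at all.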