Date: Mon, 11 Aug 2014 17:42:27 -0700
From: Adrian Chadd <adrian@freebsd.org>
To: Navdeep Parhar <np@freebsd.org>
Cc: Alan Cox <alc@freebsd.org>, Victor Balada Diaz <victor@bsdes.net>, Sushanth Rai <sushanth_rai@yahoo.com>, "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject: Re: Support for zero copy sockets
Message-ID: <CAJ-VmomEU7MAB1m_%2BQv9PMD6Yv9PbezDG2ncS1h1cQ1n_yQn=A@mail.gmail.com>
In-Reply-To: <53E91578.3060209@FreeBSD.org>
References: <1407171616.44440.YahooMailBasic@web181702.mail.ne1.yahoo.com> <20140811082610.GF7828@equilibrium.bsdes.net> <CAJ-VmonTYPz7qJ3WG52ADp69FdYQkKQ6D_DOM1piEybGwgOmWA@mail.gmail.com> <CAJUyCcOmxwBqbtWUgVLw1z%2BbNzrd9jw76GYeKWwDgv17=4g-kw@mail.gmail.com> <53E91578.3060209@FreeBSD.org>

On 11 August 2014 12:11, Navdeep Parhar <np@freebsd.org> wrote:
> There is zero copy receive (aka Direct Data Placement -- DDP) in the TOE
> driver that accompanies cxgbe(4). I have a tx zero copy implementation
> for it as well (this is not in -current right now). But all this code
> is chip specific and applies only to TCP connections that are handled
> by the TOE driver. It doesn't rely on COW or page flipping.
>
> The reason I'm mentioning all of this here is that if anyone is thinking
> of working on proper zero copy awareness (and APIs) at the socket layer
> then count me in as an interested party.

I'm not going to get into it just for now, as I have enough on my
FreeBSD plate to do already.

However, the thing that always irked me about the hardware based
solutions is that they're great for a subset of problems - typically
small sets of sockets. The real interesting problem for me is how to
make it work for say, 500,000 or more concurrent TCP sessions.

I can see a method of doing zero-copy writes to the network stack -
look at what the AIO code does in the physical IO path for doing
writes. It wires down the memory and stuffs it into the buffer. The
thing I haven't yet sorted out is what to do about mappings in case
kernel code wants to peek at the socket data payload for whatever
reason.

(And yes, reads are still a problem.)

-a