Date:      Mon, 11 Aug 2014 17:42:27 -0700
From:      Adrian Chadd <adrian@freebsd.org>
To:        Navdeep Parhar <np@freebsd.org>
Cc:        Alan Cox <alc@freebsd.org>, Victor Balada Diaz <victor@bsdes.net>, Sushanth Rai <sushanth_rai@yahoo.com>, "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject:   Re: Support for zero copy sockets
Message-ID:  <CAJ-VmomEU7MAB1m_%2BQv9PMD6Yv9PbezDG2ncS1h1cQ1n_yQn=A@mail.gmail.com>
In-Reply-To: <53E91578.3060209@FreeBSD.org>
References:  <1407171616.44440.YahooMailBasic@web181702.mail.ne1.yahoo.com> <20140811082610.GF7828@equilibrium.bsdes.net> <CAJ-VmonTYPz7qJ3WG52ADp69FdYQkKQ6D_DOM1piEybGwgOmWA@mail.gmail.com> <CAJUyCcOmxwBqbtWUgVLw1z%2BbNzrd9jw76GYeKWwDgv17=4g-kw@mail.gmail.com> <53E91578.3060209@FreeBSD.org>

On 11 August 2014 12:11, Navdeep Parhar <np@freebsd.org> wrote:
> There is zero copy receive (aka Direct Data Placement -- DDP) in the TOE
> driver that accompanies cxgbe(4).  I have a tx zero copy implementation
> for it as well (this is not in -current right now).  But all this code
> is chip specific and applies only to TCP connections that are handled
> by the TOE driver.  It doesn't rely on COW or page flipping.
>
> The reason I'm mentioning all of this here is that if anyone is thinking
> of working on proper zero copy awareness (and APIs) at the socket layer
> then count me in as an interested party.

I'm not going to get into it right now, as I already have enough on
my FreeBSD plate.

However, the thing that has always irked me about the hardware-based
solutions is that they're great for a subset of problems - typically
small sets of sockets. The really interesting problem for me is how to
make it work for, say, 500,000 or more concurrent TCP sessions.

I can see a method of doing zero-copy writes to the network stack -
look at what the AIO code does in the physical IO path for doing
writes. It wires down the memory and stuffs it into the buffer.
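
To make the "wire it down, then hand it over" idea concrete, here is a
rough userland analogy in C. It is only a sketch under stated
assumptions: wired_send() is a hypothetical helper, not a FreeBSD API,
and mlock(2) stands in for the kernel wiring the pages. send(2) still
copies the data here; an in-kernel zero-copy path would presumably
attach the wired pages to the socket buffer instead of copying.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a FreeBSD API): copy msg into a page-aligned
 * buffer, try to wire it with mlock(2), then write it to the socket.
 * Userland analogy only; send(2) still copies the payload.
 */
ssize_t
wired_send(int fd, const char *msg, size_t len)
{
	long pgsz = sysconf(_SC_PAGESIZE);
	size_t pg, bufsz;
	void *buf;
	ssize_t n;
	int wired;

	if (pgsz <= 0 || len == 0)
		return (-1);
	pg = (size_t)pgsz;

	/* Round up to whole pages, since wiring works on page granularity. */
	bufsz = ((len + pg - 1) / pg) * pg;
	if (posix_memalign(&buf, pg, bufsz) != 0)
		return (-1);
	memcpy(buf, msg, len);

	/* Wiring may fail under RLIMIT_MEMLOCK; carry on unwired if so. */
	wired = (mlock(buf, bufsz) == 0);

	n = send(fd, buf, len, 0);

	/* Unwire only after the write has been handed off. */
	if (wired)
		munlock(buf, bufsz);
	free(buf);
	return (n);
}
```

In a real zero-copy path the unwire step could not happen when the
write call returns; the pages would have to stay wired until the data
has been ACKed and dropped from the socket buffer.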

The thing I haven't yet sorted out is what to do about mappings in
case kernel code wants to peek at the socket data payload for whatever
reason.

(And yes, reads are still a problem.)



-a


