Date:      Mon, 19 Jun 2000 16:25:22 -0600
From:      "Kenneth D. Merry" <ken@kdm.org>
To:        Mike Smith <msmith@FreeBSD.ORG>
Cc:        arch@FreeBSD.ORG
Subject:   Re: kblob discussion.
Message-ID:  <20000619162522.A81338@panzer.kdm.org>
In-Reply-To: <200006192149.OAA09723@mass.osd.bsdi.com>; from msmith@FreeBSD.ORG on Mon, Jun 19, 2000 at 02:49:46PM -0700
References:  <20000619151517.A80732@panzer.kdm.org> <200006192149.OAA09723@mass.osd.bsdi.com>

On Mon, Jun 19, 2000 at 14:49:46 -0700, Mike Smith wrote:
> > On Mon, Jun 19, 2000 at 14:03:07 -0700, Mike Smith wrote:
> > > 
> > > I think this entire discussion has gone just a little bit too far. 
> > 
> > I don't think so.  You and Alfred seem to be convinced of the merits of
> > doing this in a specific (versus generic) manner, but Jonathan and I
> > obviously haven't been convinced.
> 
> I didn't say "you think this conversation has gone too far" - your 
> position is pretty obvious.  I was simply stating mine, ok?

Okay.

> > > The basic kblob architecture is sound, and does what most people are 
> > > looking for in this context.  IMO, it should probably be called 
> > > socketblob, or something else equally boring, but that's neither here nor 
> > > there.
> > 
> > What are people planning on doing with this API?  Is this intended as a web
> > server speedup of some sort?
> 
> Yes.  There's a niche for webservers that serve a relatively small 
> quantity of static content, very rapidly.  Think "small images".

Okay.

> > You could have the same effect as kblob with a generic API.
> 
> The kblob interface *is* a "generic API".  It is in no way "specific" to 
> this application.

We're using two different definitions of generic here.  What you mean by
"generic" is that more than one application can use the API.  That is
certainly true.

What I mean by "generic" is that the API is useful for more than sending a
small set of static content over and over again.

> > > Receiving into a kernel-side buffer is somewhat pointless, really.  You 
> > > want stuff to go into userspace.  This is an output accelerator, not a 
> > > new I/O method.
> > 
> > It isn't pointless at all.
> > 
> > Again, you want a specific API, I'd rather see a generic API that could do
> > what kblob can, 
> 
> You haven't described any such generic API. Zero-copy user-space network 
> I/O doesn't even begin to consider the issues that kblob addresses.

Do I have to come up with a complete competing implementation in order for
my comments to be considered?

In any case, I have done just that, but as I mentioned in my mail to
Alfred, it won't be released, since it was specific to a particular software
and hardware environment.  If you want a description of it, I'll be glad to
provide one.

The concept, however, isn't specific to a particular platform, and in fact
similar things have been proposed and implemented.  Sun's fbufs proposal,
IO-Lite, and Jonathan's "zbuf" API that he posted to committers are all
closer to the type of API I think would work.

All three of those interfaces are probably superior to mine in terms of the
elegance of the interface.  Why not just use those as examples of a
zero-copy API?

The only difference performance-wise between any of those APIs and kblob
would be mapping the data from the user's virtual address space to the
kernel's virtual address space.  Even that isn't strictly necessary; see my
comments below.

> > What I'm not sure of, though, is why we're proposing something that's quite
> > pointedly a narrow interface, when we could get much the same performance
> > out of a generic API, along with wider usability.
> 
> Actually, in the target application, I wouldn't expect anything like the 
> same sort of performance.  With all the VM operations involved in the 
> zero-copy API, it's well-suited to applications that generate dynamic 
> content, but still rather more expensive for static content.

The only VM operation I can see on the sending side of any one of the APIs
above would be mapping it into the kernel's KVA.  No COW mapping would be
involved.

If you've got an adapter that does checksum offloading, you don't even need
to do that.  You can just go straight to physical addresses for the
payload, since the checksumming routines are the only routines in the
kernel that generally need to touch the payload on the sending side.
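
To illustrate why the checksum is the step that forces the payload to be
mapped, here is a minimal sketch of a software Internet checksum (the
one's-complement sum described in RFC 1071).  The function name in_cksum_sw
is hypothetical and is not the kernel's own routine; the point is only that
a software checksum has to read every payload byte through a virtual
address, which an adapter with checksum offloading makes unnecessary.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical illustration, not the kernel's checksum routine.
 * Computes the 16-bit one's-complement checksum over a buffer,
 * treating the data as big-endian 16-bit words.  Intended for
 * packet-sized buffers (the 32-bit accumulator is not guarded
 * against overflow on very large inputs).
 */
static uint16_t
in_cksum_sw(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t sum = 0;

    /* Every byte of the payload is read here, so it must be mapped. */
    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    /* An odd trailing byte is padded with a zero low byte. */
    if (len == 1)
        sum += (uint32_t)p[0] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```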

The only place where you would want to access the payload in that case is
for bpf, and you can get around that by either mapping things into KVA
for bpf long enough to copy the data out, or by having bpf skip over the
physically addressed segments of the mbuf chain.
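
The second option, having the packet tap skip physically addressed
segments, could look roughly like the sketch below.  This is a user-space
toy model under stated assumptions: struct seg, the SEG_* kinds, and
tap_copy_skip_physical are all hypothetical names invented for
illustration, and none of them is the real mbuf or bpf interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one segment of an mbuf chain. */
enum seg_kind { SEG_VIRTUAL, SEG_PHYSICAL };

struct seg {
    enum seg_kind kind;
    const void   *vaddr;  /* valid only when kind == SEG_VIRTUAL */
    uint64_t      paddr;  /* valid only when kind == SEG_PHYSICAL */
    size_t        len;
    struct seg   *next;
};

/*
 * Copy the bytes a tap can see into dst, passing over segments that
 * only have a physical address.  Returns the number of bytes copied
 * (at most dstlen); *skipped receives the total length of the
 * physically addressed segments that were not copied.
 */
static size_t
tap_copy_skip_physical(const struct seg *chain, void *dst, size_t dstlen,
    size_t *skipped)
{
    unsigned char *out = dst;
    size_t copied = 0;

    *skipped = 0;
    for (const struct seg *s = chain; s != NULL; s = s->next) {
        if (s->kind == SEG_PHYSICAL) {
            /* No virtual mapping exists, so the tap cannot read it. */
            *skipped += s->len;
            continue;
        }
        size_t n = s->len;
        if (n > dstlen - copied)
            n = dstlen - copied;   /* truncate to the space left */
        memcpy(out + copied, s->vaddr, n);
        copied += n;
    }
    return copied;
}
```

In this model a tap would still see the protocol headers, which sit in
ordinary virtually addressed segments, and could report how much payload
it passed over.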

> > > One point that Alfred may have overlooked is that apart from the resource 
> > > limit issues, kblob could be implemented entirely as a loadable syscall.
> > > (Then you could just compile the limit in, for slightly reduced 
> > > flexibility.)  This would address the "don't commit to a restricted API" 
> > > issues while still getting the code out and used where it's needed.
> > 
> > Again, can you elaborate on the uses of kblob?  Is there code that uses it?
> 
> Any service that, in response to a query, needs to send from a small 
> collection of static data down a network socket.  The common use is static
> content serving for http.

I think it would be interesting to see data on the performance penalty of
the user->kernel virtual address mapping.

That is, how big a penalty is it?  Are there any hard numbers?  One way to
do this would be to do the user->kernel mapping every time someone does a
kblobsend(), and compare that to the stock kblob code.
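
A minimal sketch of such a comparison harness follows.  It is a user-space
stand-in only: send_stock and send_with_map are hypothetical placeholders
that merely count calls, and in a real measurement they would be replaced
by the stock kblobsend() path and a variant that does the user->kernel
mapping on every call.  The harness uses the standard C clock() function,
so it measures CPU time rather than wall-clock time.

```c
#include <time.h>

/* Hypothetical signature for "one send operation" under test. */
typedef void (*send_fn)(void *ctx);

/* CPU seconds consumed by calling fn(ctx) iters times. */
static double
cpu_seconds(send_fn fn, void *ctx, unsigned long iters)
{
    clock_t start = clock();

    for (unsigned long i = 0; i < iters; i++)
        fn(ctx);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Bookkeeping for the placeholder send paths below. */
struct counters {
    unsigned long maps;   /* user->kernel mappings performed */
    unsigned long sends;  /* send operations performed */
};

/* Placeholder for the stock path: send without any per-call mapping. */
static void
send_stock(void *ctx)
{
    struct counters *c = ctx;

    c->sends++;
}

/* Placeholder for the variant: map on every call, then send. */
static void
send_with_map(void *ctx)
{
    struct counters *c = ctx;

    c->maps++;
    c->sends++;
}

/* Relative overhead of variant over baseline; 0.25 means 25% slower. */
static double
relative_overhead(double baseline, double variant)
{
    return baseline > 0.0 ? (variant - baseline) / baseline : 0.0;
}
```

Running both paths for the same iteration count and passing the two
timings to relative_overhead() would give the kind of hard number asked
for above.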

This seems to be the only major sticking point that I see in this whole
discussion.  If y'all can show that there is a major performance penalty to
doing the mapping, then the need for something like kblob becomes more
evident.

If the mapping overhead just gets lost in the noise, then the need for
kblob isn't as clear.

Ken
-- 
Kenneth Merry
ken@kdm.org





