From owner-freebsd-arch  Mon Jun 19 15:34: 0 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP
	id 5E05737B7D8; Mon, 19 Jun 2000 15:33:48 -0700 (PDT)
	(envelope-from bright@fw.wintelcom.net)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id e5JMXPt21191;
	Mon, 19 Jun 2000 15:33:25 -0700 (PDT)
Date: Mon, 19 Jun 2000 15:33:25 -0700
From: Alfred Perlstein <bright@wintelcom.net>
To: Jonathan Lemon <jlemon@flugsvamp.com>
Cc: Mike Smith <msmith@FreeBSD.ORG>, arch@FreeBSD.ORG
Subject: Re: kblob discussion.
Message-ID: <20000619153325.D17420@fw.wintelcom.net>
References: <20000619164329.F37084@prism.flugsvamp.com> <200006192156.OAA09767@mass.osd.bsdi.com> <20000619172041.G37084@prism.flugsvamp.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2i
In-Reply-To: <20000619172041.G37084@prism.flugsvamp.com>; from jlemon@flugsvamp.com on Mon, Jun 19, 2000 at 05:20:41PM -0500
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

* Jonathan Lemon <jlemon@flugsvamp.com> [000619 15:17] wrote:
> On Mon, Jun 19, 2000 at 02:56:47PM -0700, Mike Smith wrote:
> > 
> > Kblob is not a new output method.  It's a bolt-on to the 'write' 
> > interface, as opposed to, say, zero-copy I/O.
> 
> Yes, it is a "bolt-on" thing, not a well designed item.

Yes it is; it serves the purpose for which it was created.

> And as far as "zero-copy" goes, I would consider that a fairly overloaded
> term.  Here, I'm applying it to anything that saves copies, not just 
> a magical method of getting data between user and kernel spaces.

I'm sorry, I'm unable to code "anything that saves copies".

> > > And for a lot of cases
> > > you don't need stuff to actually go to user space.  In many situations,
> > > you will receive data, perform operations on a very small part of it,
> > > and then send the data elsewhere, whether to disk or network.
> > 
> > However, for a user-space application to do this, you want it to come out 
> > to user-space, right?
> 
> No.   You cut out what I wrote - you may only need a very small part
> of the data to actually go to user space.

So then use my soon-to-be-committed accept filters to accomplish this.

> > Regardless, this is irrelevant in this context.
> 
> Not irrelevant; you're not looking at the big picture here.

There is no big picture; there are a lot of connections and the data
I want to send down them.

> > > Also, if I want to send data from disk -> network, with the kblob
> > > API, I essentially have the following sequence of calls:
> > >
> > > 	read(fd, address, length)
> > > 	kblob(address, length)
> > > 	kblobsend(...)
> > > 
> > > which is to say, the data is copied from
> > > 	disk -> userspace   (assuming direct I/O)
> > >  	userspace -> kernel,
> > > 	kernel -> network.
> > > 
> > > Why shouldn't I be able to bypass the second copy entirely and do:
> > > 	disk -> kernel     (data now in kernel 'blob')
> > > 	kernel -> network
> > 
> > Because this isn't what you use kblob for.  That's what you use sendfile 
> > for.
> 
> But sendfile doesn't leave the data cached in kernel space for reuse,
> so it isn't suitable for this.

Yes it does.

> > The entire use of the kblob API is:
> > 
> > 	user -> kernel
> > 
> > 	kernel -> network
> > 	kernel -> network
> > 	kernel -> network
> > 	kernel -> network
> > 	kernel -> network
> > 	kernel -> network
> > 	kernel -> network
> > etc.,
> 
> And you have to get the data into user space first, when the data
> may be coming from disk.  Thus you incur one copy into userspace 
> first, when a better API might avoid it altogether.  I *know* what
> kblob is designed for, and I'm telling you that the API is limiting,
> and will not be suitable for what I have.

Use sendfile and mlock.
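
A rough sketch of what I mean, not tested code.  The helper names
(send_once, send_file_repeatedly) are made up for illustration, mapping
the file and mlock()ing it is just one way to read "sendfile and mlock",
and the non-FreeBSD branch assumes Linux's sendfile() so the sketch can
be tried elsewhere.  It assumes a blocking stream socket:

```c
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/uio.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#ifndef __FreeBSD__
#include <sys/sendfile.h>	/* assumption: Linux-style sendfile() */
#endif

/* Hypothetical helper: push len bytes of fd down sock with sendfile(). */
static int send_once(int fd, int sock, size_t len)
{
	off_t off = 0;

	while ((size_t)off < len) {
#ifdef __FreeBSD__
		off_t sent = 0;
		int r = sendfile(fd, sock, off, len - (size_t)off,
		    NULL, &sent, 0);
		off += sent;	/* sent is valid even on a partial send */
		if (r != 0 && errno != EINTR && errno != EAGAIN)
			return -1;
		if (r == 0 && sent == 0)
			return -1;	/* file shorter than expected */
#else
		ssize_t n = sendfile(sock, fd, &off, len - (size_t)off);
		if (n < 0 && errno != EINTR && errno != EAGAIN)
			return -1;
		if (n == 0)
			return -1;	/* file shorter than expected */
#endif
	}
	return 0;
}

/*
 * Hypothetical helper: open path, try to pin its pages, then send the
 * whole file `times` times without copying it through a user buffer.
 */
static int send_file_repeatedly(const char *path, int sock, int times)
{
	struct stat st;
	int rc = 0;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	if (fstat(fd, &st) != 0 || st.st_size == 0) {
		close(fd);
		return -1;
	}
	size_t len = (size_t)st.st_size;

	/* Best effort: mlock() may fail under RLIMIT_MEMLOCK; we go on. */
	void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (map != MAP_FAILED)
		(void)mlock(map, len);

	for (int i = 0; i < times && rc == 0; i++)
		rc = send_once(fd, sock, len);

	if (map != MAP_FAILED)
		munmap(map, len);	/* also drops the lock */
	close(fd);
	return rc;
}
```

Calling send_file_repeatedly("index.html", s, 3) on a connected socket
s would send the file three times; a real server would keep the mapping
around across connections instead of redoing it per call.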

> > > Again, this is not addressed in the kblob API.  Don't throw sendfile()
> > > up as an example, because I want the data to be kept in the kernel
> > > after being read from disk.
> > 
> > It is.  You remember that thing called the buffer cache?
> 
> Not applicable here.  If it was, why the heck aren't you using sendfile
> in the first place and relying on the kernel to keep the file cached?
> Hmm?  If that was the way things worked, why isn't sendfile() good enough?

Because of all the damn overhead of vm operations that you won't
be satisfied about until I bog down kblob with them.  It's not
going to happen.

> > > > > I'd be more in favor of doing it right the first time, rather than
> > > > > continually revising and extending the interface.
> > > > 
> > > > As it stands, kblob is "right".  It's lean and to-the-point.  I'd rather 
> > > > not see it undergo second-system syndrome before it's even made it to 
> > > > first base.  If you plan to implement an efficient zero-copy network I/O
> > > > interface, then it should be done from scratch, not as a wart on the side 
> > > > of what is really a fairly specific feature.
> > > 
> > > I disagree on your definition of "right".  I would argue that if we're
> > > going to implement an I/O accelerator of any form, it should not be so
> > > specific purpose that it precludes any other possible applications.
> > > 
> > >   - How do I receive data _into_ a kblob (either from a network or a disk?)
> > 
> > You don't.  It's not relevant to this application.
> > 
> > >   - Once data is in a kblob, is there any possibility of editing it?
> > >     Do I have to throw it all away and reload if I want to make a minor
> > >     change to the data?
> > 
> > It's static content.
> > 
> > >   - Can I retrieve data from a kblob?
> > 
> > No.  The kblob is a send-only acceleration buffer.
> > 
> > >   - Can kblobs be transferred to disk?
> > 
> > No.  Since the data in a kblob came from userspace, and can't be modified 
> > in the kernel, this would be pointless.
> 
> Exactly.  You're looking at kblob as a "static send-only buffer which
> comes from user space".  I'm looking at kblob as an "in-kernel object
> cache".  The two views are _NOT_ that far apart, and with a little API
> help, the current code can be made much more flexible.  Why are you 
> opposed to that?

I'm not opposed to it at all, feel free to work on the code as much
as you want as long as you don't kill the fastpath it offers.

> > Again, folks; kblob is an optimisation for a very common performance 
> > case, not a network engineer's wet dream.  It's meant to address a 
> > real-world problem in an efficient fashion.
> 
> It is a narrow optimization for a specific problem.

Sending data fast is a pretty large problem to solve.

> > If you have a Grand Plan for something else, that's great - I'd recommend 
> > you keep working on it.  In the meantime, there are a lot of people that 
> > can do a lot of useful things with kblob as-is, and knocking it down just 
> > because it's not _your_ idea of the Perfect In-Kernel Network Container 
> > Object is kinda silly.
> 
> Well, with a little more work, we can get something that is more
> applicable to a wider audience, rather than focusing on your single
> specific performance problem.  Putting on blinders to the wider 
> problem simply because your current code doesn't address it strikes
> me as kind of foolish.

The only problem I see is that there's just no pleasing some people.
:)

I'm not writing this code for your application, Jonathan; I'm
writing it for myself and for the other people that need this
functionality.

I've never been a pain about 'MAINTAINER', so I don't see adding to
the interface in the future as much of a problem.   Perhaps you
can give me a TODO list that I can include at the top of kern_blob.c?

thanks,
-Alfred
