Date:      Mon, 9 Mar 1998 02:11:58 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        marcs@znep.com
Cc:        mike@smith.net.au, hackers@FreeBSD.ORG
Subject:   Re: kernel wishlist for web server performance
Message-ID:  <199803090211.TAA13723@usr08.primenet.com>
In-Reply-To: <Pine.BSF.3.95.980307225453.2799O-100000@alive.znep.com> from "Marc Slemko" at Mar 7, 98 11:12:52 pm

> I don't think so.  Trying to do anything else is an ugly hack.  See the HP
> paper I mentioned for some of the details on why.
> 
> Let me put it this way: how else do you propose to do copy avoidance to
> avoid an extra copy going into the mbufs?  The data must go from the
> buffer cache to the network without any copy other than to the network
> card itself.  Why is your other method of doing this any less of a hack? 

Put the header in a page by itself and butt it up against the data.
We did this for the Pathworks for VMS (NetWare) product.  The
copy to the card buffer is gathered.

Basically, you are saving a direct reference to the mbuf by using
an indirect reference to the data you want to send in the buffer
cache (or in the VM, more correctly).
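
To make the gather concrete, here is a minimal userland sketch of the
same idea using mmap(2) and writev(2): the header lives in its own
buffer, the file data is referenced in place rather than copied, and
both are handed over as a single gathered write.  The function and its
error handling are illustrative, not code from the product above.

    #include <sys/types.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <sys/uio.h>
    #include <fcntl.h>
    #include <unistd.h>

    static int
    send_file_gathered(int sock, const char *path, char *hdr, size_t hdrlen)
    {
            struct stat sb;
            struct iovec iov[2];
            void *data;
            int fd;

            if ((fd = open(path, O_RDONLY)) == -1)
                    return (-1);
            if (fstat(fd, &sb) == -1) {
                    close(fd);
                    return (-1);
            }
            /* Map the file: the pages are referenced out of the VM,
               not copied into a private user buffer. */
            data = mmap(NULL, (size_t)sb.st_size, PROT_READ, MAP_SHARED,
                fd, 0);
            if (data == MAP_FAILED) {
                    close(fd);
                    return (-1);
            }
            iov[0].iov_base = hdr;         /* header, in its own buffer */
            iov[0].iov_len = hdrlen;
            iov[1].iov_base = data;        /* file data, by reference */
            iov[1].iov_len = (size_t)sb.st_size;
            (void)writev(sock, iov, 2);    /* one gathered copy */
            munmap(data, (size_t)sb.st_size);
            close(fd);
            return (0);
    }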

I don't see why you couldn't do this in the mmap/write case, frankly,
without needing to use a "TransmitFile" type operation.  In either
case, you would need to modify the networking so that buffers could
be accessed by indirect reference, with usage counts adjusted
appropriately on copy-complete/DMA-complete (depending on how the
data gets gathered to the network card).
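
A hedged sketch of what that indirect reference might look like: a
buffer descriptor that points at pages in the VM/buffer cache and
carries a use count which the driver drops from its transmit-complete
path.  The names (extbuf, eb_free, extbuf_release) are invented for
illustration; they are not the actual mbuf external-storage interface.

    #include <sys/types.h>

    struct extbuf {
            caddr_t eb_data;        /* points into the VM/buffer cache */
            u_int   eb_len;
            u_int   eb_refcnt;      /* usage count on the underlying pages */
            void  (*eb_free)(struct extbuf *);  /* unlock/release routine */
    };

    /*
     * Called from the copy-complete/DMA-complete path; only when the
     * last reference drops is it safe to unlock the pages.  A real
     * implementation would have to protect the decrement against the
     * network interrupt (e.g. at splimp() in this era).
     */
    static void
    extbuf_release(struct extbuf *eb)
    {
            if (--eb->eb_refcnt == 0)
                    (*eb->eb_free)(eb);
    }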

The *one* thing you will save is per-write user to kernel address
translation.  To get this, you will add the complication of needing
to (potentially) take a page fault on the backing vp for the file to
get more pages to send, if the whole thing isn't in core.  You could
do this in the write(2) case by having a faulting uio routine (uiofault
instead of uiomove?).  You would also need a callback function to unlock
the pages after the DMA/copy -- you couldn't simply release things.
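
A pseudocode sketch of that hypothetical uiofault, under heavy
assumptions: vm_page_grab_and_wire() and vm_page_release() are
invented placeholders for going through the vnode's VM object and
pager.  The point is only the shape -- fault and wire on the way in,
unlock from a completion callback on the way out.

    struct txdone {
            vm_page_t td_pages[MAXPHYS / PAGE_SIZE];
            int       td_npages;
    };

    static int
    uiofault(struct vnode *vp, off_t off, size_t len, struct txdone *td)
    {
            off_t cur;

            td->td_npages = 0;
            for (cur = off; cur < off + (off_t)len; cur += PAGE_SIZE) {
                    /* Page in from vp if not resident (the potential
                       fault), then wire so the page can't be reclaimed
                       while the card still references it. */
                    td->td_pages[td->td_npages++] =
                        vm_page_grab_and_wire(vp, cur);     /* invented */
            }
            return (0);
    }

    /* The callback: runs at DMA/copy complete; only now is it safe
       to let the pages go. */
    static void
    uiofault_done(struct txdone *td)
    {
            while (td->td_npages > 0)
                    vm_page_release(td->td_pages[--td->td_npages]);
    }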


> Yes, although I have heard somewhat convincing arguments that on hardware
> with good enough support for context switching between threads, you don't
> actually gain that much (if anything) from AIO over just creating more
> threads.  I'm not really convinced though, and it is highly dependent on
> how the OS implements that stuff.

You would need quantum-thread-group-affinity to beat AIO.  Specifically,
if I made a blocking operation on a kernel thread after using 1/3 of my
quantum, I'd want the remaining two-thirds to go to another kernel thread
within my process (thread group), in order to save the page mapping
change overhead part of the context switch.
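
In scheduler pseudocode, the hand-off looks something like the sketch
below.  The types and helpers (runnable(), choose_from_runq(), the
thread list fields) are invented for illustration:

    static struct thread *
    pick_next(struct thread *blocking, int ticks_left)
    {
            struct thread *sib;

            if (ticks_left > 0) {
                    /* Prefer a runnable sibling sharing our address
                       space: no page mapping change on the switch. */
                    TAILQ_FOREACH(sib, &blocking->td_proc->p_threads,
                        td_plist)
                            if (sib != blocking && runnable(sib)) {
                                    sib->td_ticks = ticks_left;
                                    return (sib);
                            }
            }
            /* Otherwise fall back to the global run queue; always
               preferring siblings is what starves other processes. */
            return (choose_from_runq());
    }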

This leads to all sorts of starvation issues, etc., which can be dealt
with, but not easily.  You can get similar performance from using user
space threads in combination with AIO, without having to implement the
affinity.
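
From the application side, that combination looks roughly like the
sketch below: the user thread issues a POSIX AIO request and yields to
its user-space scheduler instead of blocking the whole process.  The
aio_*(2) calls are standard; uthread_yield() stands in for whatever
yield primitive the threads package provides.

    #include <aio.h>
    #include <errno.h>
    #include <string.h>

    extern void uthread_yield(void);    /* illustrative user-level yield */

    static ssize_t
    read_async(int fd, void *buf, size_t len, off_t off)
    {
            struct aiocb cb;

            memset(&cb, 0, sizeof(cb));
            cb.aio_fildes = fd;
            cb.aio_buf = buf;
            cb.aio_nbytes = len;
            cb.aio_offset = off;
            if (aio_read(&cb) == -1)
                    return (-1);
            /* The kernel does the I/O; we run other user threads. */
            while (aio_error(&cb) == EINPROGRESS)
                    uthread_yield();
            return (aio_return(&cb));
    }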

For SMP, there are additional L1 cache issues.  You want thread CPU
affinity so you don't blow out your L1 cache accessing thread local
storage, and you want process CPU affinity so you don't blow it out
accessing heap data.
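
The userland knob for that process-CPU affinity, sketched with an API
that postdates this thread (cpuset_setaffinity(2), FreeBSD 7.0); it is
shown only to make the idea concrete:

    #include <sys/param.h>
    #include <sys/cpuset.h>

    static int
    pin_to_cpu(int cpu)
    {
            cpuset_t mask;

            CPU_ZERO(&mask);
            CPU_SET(cpu, &mask);
            /* id -1: apply to the calling process and its threads. */
            return (cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_PID,
                -1, sizeof(mask), &mask));
    }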


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.



