Date:      Wed, 28 May 1997 10:15:50 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        cmsedore@mailbox.syr.edu (Christopher Sedore)
Cc:        black@zen.cypher.net, rssh@cki.ipri.kiev.ua, FreeBSD-Hackers@FreeBSD.ORG
Subject:   Re: async socket stuff
Message-ID:  <199705281715.KAA01908@phaeton.artisoft.com>
In-Reply-To: <Pine.SOL.3.95.970528095615.11635A-100000@gamera.syr.edu> from "Christopher Sedore" at May 28, 97 10:32:32 am

> > 2) it is not a complete language interpreter, it is simply a compact
> > system for dynamically adding system calls to the kernel by aggregating
> > existing system calls.  contrast this with your approach which requires 
> > adding special purpose code to the kernel for EVERY "common" sequence of 
> > syscalls.
> 
> I understood this, but its case is not made by the docs I read.  It looks
> like a nice thing, but I'd like to see more discussion and an
> implementation before I support it generally.

Actually, you shouldn't need a system call for this.

Plus it sounds like streams!  8-).

> My point was that the only thread packages available now (corrections
> welcome) just use non-blocking I/O and select() to simulate threads in
> userland. 

There is a kernel threading package which was contributed, but never
integrated, actually.  Personally, I dislike kernel threading because
of the quantum ownership problems it introduces, and the N:M mapping
problems for user space to kernel thread mapping for N>M.  Not to
mention CPU affinity and cache issues.


My own reasoning for an async call gate is that user space threading
is all of one general model: trade a blocking call for a non-blocking
call plus a thread context switch.  This has been done over and over;
the best reference I have is the paper "User Space Threading and
Register Windows on SPARC", a paper out of the University of Washington,
and the basis of the SunOS 4.x LWP user space threading library (liblwp).
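
The trade described above (a blocking call becomes a non-blocking call
plus a thread context switch) can be sketched in user space roughly as
follows.  This is a minimal illustration, not the liblwp implementation:
conv_read(), yield_fn, and the toy_sched/toy_yield stand-in for a thread
scheduler are all hypothetical names invented for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical scheduler hook: run some other user thread for a while. */
typedef void (*yield_fn)(void *arg);

/* Put a descriptor into non-blocking mode; 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Converted read(): the descriptor must already be non-blocking.
 * Instead of sleeping in the kernel, a read that would block hands
 * the CPU to another user thread and is retried afterwards.
 */
ssize_t conv_read(int fd, void *buf, size_t len, yield_fn yield, void *arg)
{
    assert(yield != NULL);
    for (;;) {
        ssize_t n = read(fd, buf, len);

        if (n >= 0)
            return n;               /* data or EOF */
        if (errno == EINTR)
            continue;               /* interrupted, just retry */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;              /* real error */
        yield(arg);                 /* the "context switch" half */
    }
}

/*
 * Toy stand-in for a scheduler (assumption for the example only):
 * counts switches and, on the first one, behaves like another
 * thread that makes two bytes available on a pipe.
 */
struct toy_sched {
    int wfd;        /* write end of the pipe */
    int switches;   /* number of times we yielded */
};

void toy_yield(void *arg)
{
    struct toy_sched *s = arg;

    if (s->switches++ == 0) {
        ssize_t n = write(s->wfd, "hi", 2);
        (void)n;
    }
}
```

With a pipe whose read end has been passed through set_nonblocking(),
calling conv_read() on the empty pipe yields once to toy_yield() and
then returns the two bytes it wrote.  A real package would wrap every
blocking call this way, which is exactly where the limits of the
underlying primitives start to matter.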

The problem with call conversion is that it's generally implemented on
aioread/aiowrite/aiowait/aiocancel.  As the original poster found while
trying to work around it, the only operations which can be converted
that way are read and write (plus a reschedulable one-shot timer built
on aiowait, which assumes that non-timeout code takes 0 time to
execute).

A general async call gate mechanism works around this problem, and
given flags on the sysent entries, is extremely low code overhead
to implement.
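
As a rough user-space model of what "flags on the sysent entries" could
buy, consider the sketch below.  Everything in it is an assumption made
for illustration: struct sysent_sketch, SYF_MAYBLOCK, async_call() and
async_finish() are hypothetical names, not the FreeBSD sysent layout or
any existing interface.  The only point is that one generic gate can
decide per call, from a table flag, whether to finish immediately or to
park the request in a call context.

```c
#include <assert.h>
#include <stddef.h>

#define SYF_MAYBLOCK 0x01   /* hypothetical flag: this call can sleep */

typedef long (*call_fn)(long arg);

/* Hypothetical, much simplified system call table entry. */
struct sysent_sketch {
    call_fn fn;
    int     flags;
};

enum ctx_state { CTX_IDLE, CTX_PENDING, CTX_DONE };

/* Per-request call context, owned by the caller. */
struct call_ctx {
    enum ctx_state              state;
    const struct sysent_sketch *ent;
    long                        arg;
    long                        result;
};

/* Two toy "system calls": one that never sleeps, one that might. */
long sk_getpid(long arg) { (void)arg; return 42; }
long sk_read(long arg)   { return arg * 2; }

const struct sysent_sketch sk_table[] = {
    { sk_getpid, 0 },
    { sk_read,   SYF_MAYBLOCK },
};
#define SK_NCALLS (sizeof(sk_table) / sizeof(sk_table[0]))

/*
 * The gate: calls that cannot sleep complete at once; calls flagged
 * SYF_MAYBLOCK are only recorded as pending and return immediately.
 * Returns 0 if the request was accepted, -1 for a bad call number.
 */
int async_call(size_t num, long arg, struct call_ctx *ctx)
{
    assert(ctx != NULL);
    if (num >= SK_NCALLS)
        return -1;
    ctx->ent = &sk_table[num];
    ctx->arg = arg;
    if (ctx->ent->flags & SYF_MAYBLOCK) {
        ctx->state = CTX_PENDING;   /* a real gate would queue this */
        return 0;
    }
    ctx->result = ctx->ent->fn(arg);
    ctx->state = CTX_DONE;
    return 0;
}

/* Stand-in for the kernel completing a pending call some time later. */
void async_finish(struct call_ctx *ctx)
{
    if (ctx->state == CTX_PENDING) {
        ctx->result = ctx->ent->fn(ctx->arg);
        ctx->state = CTX_DONE;
    }
}
```

Note that the gate itself contains no per-call code: adding another
convertible call in this model is one more table row with the flag set,
rather than another special purpose aio-style entry point.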


> Time.  I probably could implement transmitfile for FreeBSD in a few weeks
> with everything else I have on my plate.  I could not reengineer FreeBSD's
> I/O system in 10x that timeframe.

A kernel call context LRU would be relatively easy.  It would require
divorcing the kernel stack from the user process.  This is something
that the SMP people need anyway to implement scheduler affinity and
something the realtime people need for kernel preemption.  So it's
a general win in any case.


> > a small syscall aggregator in the kernel would take up perhaps 5-10k and 
> > provide limitless applications.  your solution solves a single current 
> > problem and totally ignores the rest of the world.
> 
> No disagreement here.  I'd be happy to see your solution implemented, and
> I'd use it if it provided benefit to me.

I find the idea fascinating; but I'm worried about misuse.  I'd like to
see restricted VM's for user space kernel code as a more general solution
which would enforce write faulting in a protected mode VM; otherwise
there are all of the bad address issues of copyin/copyout and all of
the race windows of the preverification done in Linux, which does not
verify that the user address space has not changed prior to copying
data out (an admittedly small window, but a window nonetheless).  It
is unfortunate that in that direction lies MACH (madness 8-)).


> Maybe so, but I don't see how it negatively affects them in any serious
> way.  I do see how it can offer significant benefits to people running
> web/ftp/news servers where FreeBSD is trying to gain usage.
> 
> My main point is that it cannot be implemented outside the kernel with
> what I would consider to be reasonable efficiency.  If the async I/O stuff
> eliminates the buffer copies then some portion of my argument becomes
> moot.  Even then I might still argue for its existence.

I think readv/writev might be the answer, without mmap(), but it's
rather scary to contemplate.  The aggregator is *almost* enough to
enable me to provide a "<CR><LF>.<CR><LF>" quoting line discipline,
which is a good thing... but there are still trust issues unless
there is zone enforcement (a la NetWare NLMs or MACH... brrrrr).
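
To make the writev idea concrete, here is a minimal user-space sketch
of "<CR><LF>.<CR><LF>" quoting (doubling a leading '.' on each line so
the end-of-message marker cannot occur in the body) done with a
scatter/gather list instead of a copy.  It is not a kernel line
discipline and not anything from the thread; dot_stuff_iov() is a
hypothetical helper name.  The iovecs point into the caller's buffer,
and the only extra byte referenced is a single shared '.'.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* The one extra byte every inserted iovec points at. */
static const char stuffed_dot = '.';

/*
 * Describe buf[0..len) as an iovec list in which every line that
 * begins with '.' gets one more '.' in front of it.  Lines are
 * terminated by CR LF; offset 0 counts as the start of a line.
 * No payload bytes are copied.  Returns the number of iovecs
 * filled in, or -1 if max entries are not enough.
 */
int dot_stuff_iov(const char *buf, size_t len, struct iovec *iov, int max)
{
    int    n = 0;           /* iovecs used so far */
    size_t start = 0;       /* first byte not yet covered by an iovec */
    int    at_bol = 1;      /* are we at the beginning of a line? */

    assert(buf != NULL && iov != NULL);
    for (size_t i = 0; i < len; i++) {
        if (at_bol && buf[i] == '.') {
            /* Need room for: pending text, the dot, and the tail. */
            if (n + 2 >= max)
                return -1;
            if (i > start) {
                iov[n].iov_base = (void *)(buf + start);
                iov[n].iov_len = i - start;
                n++;
            }
            iov[n].iov_base = (void *)&stuffed_dot;
            iov[n].iov_len = 1;
            n++;
            start = i;      /* the original '.' goes out with the rest */
        }
        at_bol = (i > 0 && buf[i - 1] == '\r' && buf[i] == '\n');
    }
    if (len > start) {
        if (n >= max)
            return -1;
        iov[n].iov_base = (void *)(buf + start);
        iov[n].iov_len = len - start;
        n++;
    }
    return n;
}
```

The resulting list can be handed straight to writev(fd, iov, n).  The
trust problem mentioned above is untouched by this: the sketch runs
entirely in the caller's address space, which is what makes it safe and
also what keeps it from being a real line discipline.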


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


