Date: Wed, 2 Apr 2003 17:57:40 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Terry Lambert <tlambert2@mindspring.com>
Cc: current@freebsd.org
Subject: Re: libthr and 1:1 threading.
Message-ID: <200304030157.h331veVm087635@apollo.backplane.com>
References: <20030402234016.1550D2A8A7@canning.wemm.org> <3E8B8631.67435BC8@mindspring.com>

:How does this break the read() API? The read() API, when called
:on a NBIO fd is *supposed* to return EAGAIN, if the request cannot
:be immediately satisfied, but could be satisfied later. Right now,
:it blocks. This looks like breakage of disk I/O introducing a
:stall, when socket I/O doesn't.
:
:If this breaks read() semantics, then socket I/O needs fixing to
:unbreak them, right?

Oh please. You know very well that every single UNIX out there
operates on disk files as if their data were immediately available,
regardless of whether the process blocks in an uninterruptible
disk wait or not. What you are suggesting is that we make our
file interface incompatible with every other UNIX out there... ours
would return EAGAIN in situations where programs wouldn't expect it.
Additionally, the EAGAIN behavior would be highly non-deterministic,
and it would be fairly difficult for a program to rely on it because
there would be no easy way (short of experimentation or a sysctl) for
it to determine whether the 'feature' is present or not.
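
The behavior described above can be checked with a small C sketch. It assumes a POSIX system with a writable /tmp; the function name nonblocking_file_read() and the scratch-file path are made up for illustration. It opens a regular file with O_NONBLOCK and reads it back. On the common UNIX systems this has been tried on, the read returns the data rather than EAGAIN, which is the convention programs have come to expect.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Illustrative helper (not a standard function): write `msg` to a scratch
 * file, reopen the file with O_NONBLOCK, and read it back into `out`.
 * Returns the number of bytes read, or -1 with errno set.
 */
ssize_t nonblocking_file_read(const char *msg, char *out, size_t outlen)
{
    assert(msg != NULL && out != NULL);

    char path[] = "/tmp/nbio_demo_XXXXXX";   /* assumed scratch location */
    int wfd = mkstemp(path);
    if (wfd < 0)
        return -1;

    size_t len = strlen(msg);
    ssize_t written = write(wfd, msg, len);
    close(wfd);
    if (written != (ssize_t)len) {
        unlink(path);
        return -1;
    }

    /* O_NONBLOCK is accepted on a regular file, but the read below is
     * still expected to return data, not fail with EAGAIN. */
    int fd = open(path, O_RDONLY | O_NONBLOCK);
    unlink(path);                            /* file persists until close() */
    if (fd < 0)
        return -1;

    ssize_t n = read(fd, out, outlen);
    int saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return n;
}
```

A caller would pass a short string and a buffer, then compare the return value against the string length; a return of -1 with errno set to EAGAIN would indicate the non-traditional semantics being argued against.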

Also, the idea that the resulting block I/O operation is then queued
and one returns immediately from way down deep in the filesystem device
driver code, and that this whole mess is then tied into select()/kqueue()/
poll(), is just asking for more non-determinism... now it would
depend on the filesystem AND the OS supporting the feature, and other
UNIX implementations (if they were to adopt the mechanism) would likely
wind up with slightly different semantics, just like O_NONBLOCK on
listen() sockets has wound up being broken on things like HP-UX.

For example, how would one deal with, say, issuing a million of these
special non-blocking read()s, all of which fail? Do we queue a million
I/Os? Do we queue just the last requested I/O? You see the problem?
The API would be unstable and almost certainly implemented differently
on each OS platform.

A better solution would be to implement a new system call, similar to
pread(), which simply checks the buffer cache and returns a short read
or an error if the data is not present. If the call fails you would
then know that reading that data would block in the disk subsystem and
you could back off to a more expensive mechanism like AIO. If you want
to select() on it you would then simply use kqueue with EVFILT_AIO and
AIO. This could be a new system call, pread_cache(), or perhaps we could
even use recvmsg() with a flag. Such an interface would not have to touch
the filesystem code, only the buffer cache and the VM page cache, and
could be implemented in less than a day.
-Matt
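
The calling pattern proposed in the message (try the cache first, fall back to a more expensive path on a miss) can be sketched in C as follows. This is only an illustration under loud assumptions: pread_cache() is the proposed call, not an existing one, so it is simulated here by a userland stub controlled by a made-up flag, pread_cache_simulate_miss. The choice of EWOULDBLOCK as the miss error, and the names slow_read() and read_prefer_cache(), are also assumptions. The slow path uses a plain blocking pread() as a portable stand-in for the AIO (or kqueue with EVFILT_AIO) mechanism the message suggests.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* HYPOTHETICAL test knob: nonzero makes the stub below report a cache miss. */
int pread_cache_simulate_miss = 0;

/*
 * HYPOTHETICAL stand-in for the proposed pread_cache() system call.
 * A real version would consult only the buffer cache / VM page cache;
 * this stub merely simulates hit or miss. The errno value is assumed.
 */
ssize_t pread_cache(int fd, void *buf, size_t nbytes, off_t offset)
{
    if (pread_cache_simulate_miss) {
        errno = EWOULDBLOCK;
        return -1;
    }
    return pread(fd, buf, nbytes, offset);
}

/* Slow path: blocking pread() standing in for an AIO request. */
static ssize_t slow_read(int fd, void *buf, size_t nbytes, off_t offset)
{
    return pread(fd, buf, nbytes, offset);
}

/*
 * Try the cache-only read first; on a miss or a short read, fetch the
 * remainder through the slow path. *used_slow_path reports which path ran.
 */
ssize_t read_prefer_cache(int fd, void *buf, size_t nbytes, off_t offset,
                          int *used_slow_path)
{
    assert(used_slow_path != NULL);

    ssize_t n = pread_cache(fd, buf, nbytes, offset);
    if (n >= 0 && (size_t)n == nbytes) {
        *used_slow_path = 0;             /* fully satisfied from cache */
        return n;
    }
    if (n < 0 && errno != EWOULDBLOCK)
        return -1;                       /* genuine error, not a miss */

    size_t got = (n > 0) ? (size_t)n : 0;
    *used_slow_path = 1;
    ssize_t m = slow_read(fd, (char *)buf + got, nbytes - got,
                          offset + (off_t)got);
    if (m < 0)
        return -1;
    return (ssize_t)got + m;
}
```

The point of the split is that the caller always learns, without blocking, whether the data was resident; only then does it decide to pay for the slower mechanism. Note that a short read at end of file also takes the slow path in this sketch, which simply returns zero additional bytes.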
