Date: Wed, 15 Sep 1999 12:45:29 -0400 (EDT)
From: Christopher Sedore <cmsedore@mailbox.syr.edu>
To: Jayson Nordwick <nordwick@scam.xcf.berkeley.edu>
Cc: freebsd-hackers@FreeBSD.ORG
Subject: Re: High Performance I/O (more)
Message-ID: <Pine.SOL.4.10.9909151235280.1425-100000@rodan.syr.edu>
In-Reply-To: <19990915070800.34512.qmail@scam.xcf.berkeley.edu>

On Wed, 15 Sep 1999, Jayson Nordwick wrote:

> I did research this weekend on high performance I/O. I looked at different
> approaches and to me they all appear the same (I know that I will get some
> flamage for this). The two most prominent models that I saw were IO
> Completion Ports and Synchronous Events (such as the Gaurav paper,
> http://www.cs.rice.edu/~gaurav/papers/usenix99.ps).
>
> I think that both of these models are basically the same. They both have
> an event queue that you pick up events from. The only way that they differ
> is in what they call an event. Completion ports take asynchronous operations
> and queue an event when the operation completes (hence the name). Synchronous
> events do the opposite: they queue an event when an operation is possible
> and then the synchronous (usually non-blocking) operation is performed.
> From this, you can decouple an event queue from what you call an event.
>
> From what I can see either model will give roughly the same performance,
> as they both do roughly the same amount of work. The one benefit that seems
> to exist for the Completion Ports model is that there are fewer context
> switches.
>
> Now, looking at POSIX.1b signals and signal queues and getting some
> information from Stephen Tweedie, I have decided that completion ports
> look doable without anything new.
>
> If you find an available signal, set the handler for it, then block it,
> this signal number now effectively becomes the completion port. You then
> can fcntl() a file descriptor with F_SETSIG and the signal number. Then to
> fetch the blocked signals, use sigwaitinfo(). I guess you could also use
> aio_{read,write}() and set sigevent appropriately. This actually seems
> preferable since you can then use aio_return() to find the return value
> and use aio_cancel() to cancel the request if wanted.
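A minimal sketch of the "blocked signal as completion port" idea quoted above, assuming a system with POSIX realtime signals (Linux, for example). The helper names port_open(), port_post() and port_wait() are hypothetical and appear nowhere in the thread. To keep the sketch self-contained, sigqueue() stands in for the kernel posting a completion; with aio_read() and a sigevent set to SIGEV_SIGNAL, the same signal would be queued when the operation finishes.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper: reserve `signo` as the "port" by blocking it, so
 * queued instances wait to be fetched instead of running a handler.
 * (Single-threaded sketch; threads would use pthread_sigmask().) */
static int port_open(int signo)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signo);
    return sigprocmask(SIG_BLOCK, &set, NULL);
}

/* Hypothetical helper: queue one completion record carrying `token`.
 * This is a stand-in for what an AIO completion would queue. */
static int port_post(int signo, int token)
{
    union sigval v;
    v.sival_int = token;
    return sigqueue(getpid(), signo, v);
}

/* Hypothetical helper: dequeue the next record with sigwaitinfo().
 * Returns 0 and stores the token on success, -1 on error. */
static int port_wait(int signo, int *token)
{
    sigset_t set;
    siginfo_t info;
    sigemptyset(&set);
    sigaddset(&set, signo);
    if (sigwaitinfo(&set, &info) == -1)
        return -1;
    *token = info.si_value.sival_int;
    return 0;
}
```

A caller would run port_open(SIGRTMIN) once, start its operations, and then loop on port_wait(SIGRTMIN, &token). A realtime signal number is assumed because instances of ordinary signals are not queued individually.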
>
> The one drawback that I see to this is that it can only really handle
> aio_{read,write}() and {read,write}()/fcntl(). Any other events such as
> thread/child deaths cannot really be worked into this scheme unless you
> could set the signal they deliver on termination.
>
> If you really wanted to, you could have signals delivered for the ability
> to read/write to a file descriptor and then you would have Gaurav's model.
>
> Basically, unless anybody can see anything wrong with this, get to work
> implementing!

As you note, IOCPs and the event scheme suggested by Gaurav et al. differ
in whether an event is queued after an async operation completes, or when
a descriptor is ready for a particular activity. The differences may not
seem important, but for implementing high-performance I/O systems, they
probably do matter.

The signal approach has some limitations, in that (correct me if I am
wrong) FreeBSD doesn't have realtime signals yet, and POSIX already
specifies how signals and aio operations are to work.

I read with interest a post a week or three back talking about some
efforts in the Linux community to come up with a standardized way of doing
this. My ideas for this are a little different from what I've seen proposed
thus far, more along the lines of creating something that acts as both an
event queue and an IOCP. Ideally this would be a descriptor that could be
shared across processes (or threads), and could be accessed using read().
I don't much care for the suggestion that threads ought to have an event
queue of their own; rather, if you want a per-thread completion
notification, simply create a descriptor for each thread that needs this
function. Whatever is created, it should be sufficiently extensible to
allow for all the events we can imagine now, as well as being flexible for
future enhancement.

(FWIW, I've also been thinking that I might like to be able to submit aio
requests by write()ing said descriptor. Just a thought.)
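
To make the descriptor idea concrete, here is a rough userspace emulation, not a kernel facility: a pipe carries fixed-size records, so any process or thread holding the read end can fetch events with read(). The record layout, the EVQ_* kinds, and the evq_* helper names are all assumptions made for illustration; nothing in this message defines them.

```c
#include <stdint.h>
#include <unistd.h>

/* Hypothetical event kinds: one queue carries both styles of event. */
enum {
    EVQ_IO_DONE  = 1,   /* completion-port style: an async op finished  */
    EVQ_FD_READY = 2    /* readiness style: a descriptor can be serviced */
};

/* Hypothetical record layout read from the queue descriptor. */
struct evq_record {
    int32_t kind;       /* one of the EVQ_* values                    */
    int32_t fd;         /* descriptor the event refers to             */
    int64_t result;     /* byte count or status for EVQ_IO_DONE       */
};

/* Create the queue; fds[0] is the read end, fds[1] the post end. */
static int evq_create(int fds[2])
{
    return pipe(fds);
}

/* Post one record. Writes no larger than PIPE_BUF are atomic, so
 * records from several posters are not interleaved. */
static int evq_post(int wfd, const struct evq_record *r)
{
    ssize_t n = write(wfd, r, sizeof *r);
    return n == (ssize_t)sizeof *r ? 0 : -1;
}

/* Fetch the next record with a plain read(), blocking until one arrives. */
static int evq_read(int rfd, struct evq_record *r)
{
    ssize_t n = read(rfd, r, sizeof *r);
    return n == (ssize_t)sizeof *r ? 0 : -1;
}
```

Because the queue is an ordinary descriptor, it can be inherited across fork() or handed to a thread, and a per-thread queue is just one evq_create() call per thread. A real implementation would have the kernel post the records; here the poster is whatever code calls evq_post().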

I hope we'll see an update on where the Linux efforts are. I'd be
interested in joining the conversation.

-Chris