From owner-freebsd-bugs Thu Mar  9 15:10:29 1995
Return-Path: bugs-owner
Received: (from majordom@localhost) by freefall.cdrom.com (8.6.10/8.6.6)
          id PAA25454 for bugs-outgoing; Thu, 9 Mar 1995 15:10:29 -0800
Received: from cs.weber.edu (cs.weber.edu [137.190.16.16]) by freefall.cdrom.com
          (8.6.10/8.6.6) with SMTP id PAA25446 for ; Thu, 9 Mar 1995 15:10:22 -0800
Received: by cs.weber.edu (4.1/SMI-4.1.1) id AA08011; Thu, 9 Mar 95 16:03:57 MST
From: terry@cs.weber.edu (Terry Lambert)
Message-Id: <9503092303.AA08011@cs.weber.edu>
Subject: Re: QIC-80 problem
To: bakul@netcom.com (Bakul Shah)
Date: Thu, 9 Mar 95 16:03:57 MST
Cc: freebsd-bugs@FreeBSD.org
In-Reply-To: <199503092219.OAA20388@netcom17.netcom.com> from "Bakul Shah" at Mar 9, 95 02:19:28 pm
X-Mailer: ELM [version 2.4dev PL52]
Sender: bugs-owner@FreeBSD.org
Precedence: bulk

> There is only one input and one output to team.  Now
> *ideally* we'd like to say: put input block n in buffer 0,
> block n+1 in buffer 1, and so on, and oh, by the way, call
> me when a buffer is filled or eof is reached.  Similarly for
> the output side: when we are notified that block n is filled, we
> tell the output side to write it out in sequence and tell us
> when done so that we can recycle the buffer -- this is pretty
> much what we do at driver level if the controller supports
> queueing.
>
> But this is *not* what we can do at user level under Unix
> with any number of dups and O_ASYNC in a single process
> [actually O_ASYNC is a misnomer; it should really be called
> O_SIGIO or O_SIGNAL_ME_WHEN_IO_IS_POSSIBLE or something].
> Or have you guys added real asyncio to FreeBSD while I was
> not looking?  If so, that is great news!!

Yes, this was predicated on having aioread/aiowrite/aiowait/aiocancel
primitives available to the program, just like on a Sun machine or on
SVR4.  Sorry if this wasn't clear; you're right that O_ASYNC is a
bogosity that is not useful.

There was recently a discussion on the hackers list by someone who was
implementing real async I/O.

I had the primitives implemented back in the 386BSD 0.1+patchkit 2 days,
when I reimplemented Sun LWP (first on Sun's primitives, so I didn't
have to deal with register windows and stack switching) and later
brought the code over to 386BSD.  That code is totally useless now that
we are onto 4.4 and major changes to the SCSI and other disk I/O have
taken place.

In reality, you want to be able to do this with an alternate gate to
make *any* potentially blocking system call async -- but the work to do
that is BTSOTD (Beyond The Scope Of This Discussion) 8-).  It's
actually the work needed prior to kernel multithreading, just as kernel
preemption is the work needed before SMP.  I didn't expect the
development to be done (at least not on FreeBSD) by, say, next
Tuesday.  8-).
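For concreteness, here is roughly what the single-process, double-buffered
copy loop we are talking about looks like once you have real async
primitives.  This is only a sketch: it is written against the POSIX.1b
aio calls (aio_read/aio_write/aio_suspend) rather than the Sun
aioread/aiowait ones, since the shape is the same either way, and the
names (copy_async, BLKSZ, etc.) are made up for illustration -- it is
not the ft fix, and it assumes a device where the file offset is not
meaningful (tape, pipe), so aio_offset is left at zero.

/*
 * Sketch only: double-buffered copy from infd to outfd in one process.
 * One buffer is being filled by an async read while the other drains
 * to the output; writes are issued strictly in order.  Short reads,
 * partial writes and real error handling are glossed over.
 */
#include <sys/types.h>
#include <aio.h>
#include <errno.h>
#include <string.h>

#define NBUF	2			/* one filling, one draining */
#define BLKSZ	(32 * 1024)

static char buf[NBUF][BLKSZ];

/* Block until the given request completes; return its byte count. */
static ssize_t
await(struct aiocb *cb)
{
	const struct aiocb *list[1] = { cb };

	while (aio_error(cb) == EINPROGRESS)
		(void)aio_suspend(list, 1, NULL);
	return (aio_return(cb));
}

int
copy_async(int infd, int outfd)
{
	struct aiocb rd[NBUF], wr[NBUF];
	int cur = 0, i, busy[NBUF] = { 0, 0 };
	ssize_t n;

	memset(rd, 0, sizeof(rd));
	memset(wr, 0, sizeof(wr));

	/* Prime the pump: start the first read. */
	rd[0].aio_fildes = infd;
	rd[0].aio_buf = buf[0];
	rd[0].aio_nbytes = BLKSZ;
	if (aio_read(&rd[0]) == -1)
		return (-1);

	for (;;) {
		int nxt = 1 - cur;

		/* Wait for the current buffer to fill. */
		if ((n = await(&rd[cur])) <= 0)
			break;			/* EOF or read error */

		/* The other buffer may still be draining; let it finish. */
		if (busy[nxt] && await(&wr[nxt]) == -1)
			return (-1);
		busy[nxt] = 0;

		/* Refill the other buffer while this one is written out. */
		rd[nxt].aio_fildes = infd;
		rd[nxt].aio_buf = buf[nxt];
		rd[nxt].aio_nbytes = BLKSZ;
		if (aio_read(&rd[nxt]) == -1)
			return (-1);

		/* Queue the just-filled buffer for output. */
		wr[cur].aio_fildes = outfd;
		wr[cur].aio_buf = buf[cur];
		wr[cur].aio_nbytes = (size_t)n;
		if (aio_write(&wr[cur]) == -1)
			return (-1);
		busy[cur] = 1;

		cur = nxt;
	}

	/* Drain any write still in flight before returning. */
	for (i = 0; i < NBUF; i++)
		if (busy[i])
			(void)await(&wr[i]);

	return (n < 0 ? -1 : 0);
}

Only one read and one write are ever outstanding, the writes hit the
output fd strictly in order (which is what a tape wants), and there is
no second process and no token pipe for the scheduler to get wrong.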
> > Admittedly, the context switch issue is _relatively_ less of a
> > problem (I won't say that it isn't a problem entirely).  But
> > there is also the issue of the token passing using pipes between
> > the processes, etc..  An implementation using aio in a single
> > process context would avoid a lot of system call overhead as
> > well as context switch overhead in the token passing, and also
> > avoid things like pipe read/write latency, since team does NOT
> > interleave that.
>
> team interleaves pipe IO *if* the incoming data or outgoing data
> is via pipes.  team can not interleave read/write of control
> pipes because they are used for synchronization.  The core
> loop for a team member is something like this: [ ... ]
>
> I grant your basic point that a single process implementation
> will avoid a lot of context switch and syscall overhead.  But
> I just do not see how it can be done (even if we give up
> on portability).

Like I said, I grant your point that the current async I/O in BSD,
using O_ASYNC, is simply insufficient.

> Note that all that context switching + syscall overhead is a
> problem only if the read/write quantum is so small that a
> tape device can do it in a few milliseconds.

I don't understand this statement; the write has to run to completion
in the calling process's context to allow ft to run with a safe margin.
The problem is that the next write *isn't* started sufficiently soon,
either because of data unavailability (which ft supposedly fixes) or
because ft gets no quantum in which to do its fixing.  It's this
second case that wants the low overhead solution.

Even so, such a solution is just a bandaid on the real problem, which
is the ft driver.  It's interesting to consider clever ways to apply
the bandaid, and to note that these are universally applicable
performance wins for the problem they are really intended to solve
(i.e., the bandaid is a side effect).  Which is what got us onto this
tangent in the first place.  8-).

> On my 25Mhz 486 overhead of team is about 1.2ms *per* block
> of data (and *regardless* of the number of team processes).
> As long as your tape read/write takes at least 10 times as
> long, you should be pretty safe on a lightly loaded system.
> I don't know the data rate of a QIC-80 but on a 4mm DAT that
> translates to a blocksize of 250KB/s*12ms or about 3KBytes
> on my machine!

Yeah, the key here is system loading, and the ft program competing for
process quanta against the extra load added by compress.  I think that
adding competing processes (the better to rob ft of needed quanta) by
using team is probably not the way you really want to go anyway.

I think that, other than some minor semantic issues, we are in violent
agreement!  8-).


					Terry Lambert
					terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.
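For what it's worth, the ~3KBytes figure above falls straight out of
the quoted numbers; a trivial check (the 1.2ms per-block overhead, the
"at least 10 times as long" rule, and the 250KB/s DAT rate are all
taken from the text above, not new measurements):

#include <stdio.h>

int
main(void)
{
	double overhead = 0.0012;	/* team overhead per block, seconds */
	double factor = 10.0;		/* tape time >= 10 x the overhead   */
	double rate = 250.0 * 1024.0;	/* 4mm DAT streaming rate, bytes/s  */

	/* smallest block that keeps the drive busy long enough */
	printf("minimum blocksize ~ %.0f bytes\n", overhead * factor * rate);
	return (0);
}

This prints a minimum blocksize of about 3072 bytes, matching the
"about 3KBytes" quoted above.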