Date: Mon, 25 Oct 1999 12:57:24 -0600
From: Nate Williams <nate@mt.sri.com>
To: Warner Losh <imp@village.org>
Cc: nate@mt.sri.com (Nate Williams), arch@freebsd.org
Subject: Re: Racing interrupts
Message-ID: <199910251857.MAA14453@mt.sri.com>
In-Reply-To: <199910251850.MAA42106@harmony.village.org>
References: <199910251827.MAA14189@mt.sri.com> <199910251646.KAA13773@mt.sri.com> <199910240608.AAA34462@harmony.village.org> <199910251822.MAA41899@harmony.village.org> <199910251850.MAA42106@harmony.village.org>

> : Possibly, but the races still exist, and you can still get in a position
> : where the hardware is gone. (I've verified this, having done a lot of
> : the work in the old pccard on suspend/resume.)
>
> OK. So there is a small window there, but nothing that can be counted
> upon. One must therefore assume that the hardware is gone when the
> interrupt comes in...

Agreed. Sean also brought up the fact that it was necessary for Stratus
to do this in the case of 'failed' hardware, since not hanging your
kernel is a big deal when you're advertising yourself as
fault-tolerant. :)

> : It's certainly not impossible, but it does make the drivers that much
> : more complex. And, (not to disagree with Sean), I don't see how you
> : fix all the problems, simply because at some point you must assume the
> : hardware exists, and if it disappears in the middle of an operation
> : without any way of knowing that it's gone, how can you recover from it?
>
> Yes. W/o explicit checks for 'am I gone' it is very hard, and where
> do you make them, and there is still a tiny race between the checking
> for am I gone and the touching of hardware. These races can be made
> so small as to be hard to lose.

The 'am I gone' race is a big one (IMO), and instead of trying to
minimize that race (which I don't think we can minimize much at all), I
think the solution (which is much more complex) is to rewrite the
device drivers so that they never do busy-wait loops, never-ending
timeouts, etc.

Unfortunately, this may require changes to some basic FreeBSD
assumptions (timeouts in particular). Do tsleep/wakeup provide for a
'default' timeout and notification?

> That's one reason I think that having
> some way to terminate the current thread of execution at any
> instruction with a simple callback saying, "I killed your driver
> thread, cope with the loss of hardware" is about as good as we're
> going to get.

This requires changes to all drivers so that they do not expect that a
piece of hardware exists.
And, if the thread is never given the indication that the hardware is
gone (think fast interrupts), it still must deal with the fact that the
hardware *may* be gone.

It would also have a nice side-effect of making FreeBSD much more
tolerant of failing hardware, although I'm not sure we would need to go
to the lengths that a company like Stratus does. They don't have to
support the wide variety of hardware that FreeBSD does.

> : When someone removes the bridge away from you while you're walking
> : across the chasm, how can you be expected to 'recover' from it? ;)
>
> By hanging onto the bridge :-) Or registering a SIGBRIDGE handler and
> hoping that the 'chute deploys in time :-).

That implies that you are informed that the bridge is gone, instead of
finding out about it from striking the ground w/out a net. :(



Nate


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message
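
The "never busy-wait forever" idea from the message can be illustrated
with a small user-space sketch. Everything in it is hypothetical and
made up for illustration: struct fake_dev, read_status(), wait_ready(),
and the assumption that a removed card reads back as all ones. It is
not FreeBSD driver code and uses no kernel interface. It only shows the
shape of a wait loop that gives up after a fixed number of polls and
treats an implausible status value as "the hardware may be gone".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical simulated device; not a FreeBSD interface. */
struct fake_dev {
    int reads_until_ready; /* status reads before the ready bit sets */
    int reads_until_gone;  /* status reads before removal; -1 = never */
    int reads;             /* status reads performed so far */
};

#define ST_READY 0x01u
#define ST_GONE  0xFFu /* assumption: an empty slot reads back all ones */

enum wait_result { WAIT_OK, WAIT_GONE, WAIT_TIMEOUT };

/* Stand-in for a status register read on real hardware. */
static uint8_t read_status(struct fake_dev *d)
{
    d->reads++;
    if (d->reads_until_gone >= 0 && d->reads > d->reads_until_gone)
        return ST_GONE;
    return (d->reads > d->reads_until_ready) ? ST_READY : 0;
}

/*
 * Poll for the ready bit at most max_polls times. The loop always
 * terminates, whether or not the device ever answers.
 */
static enum wait_result wait_ready(struct fake_dev *d, int max_polls)
{
    assert(max_polls > 0);
    for (int i = 0; i < max_polls; i++) {
        uint8_t st = read_status(d);
        /* Check for removal first: all ones also has ST_READY set. */
        if (st == ST_GONE)
            return WAIT_GONE;
        if (st & ST_READY)
            return WAIT_OK;
    }
    return WAIT_TIMEOUT;
}
```

With a device initialized as { 3, -1, 0 }, wait_ready(&dev, 10) returns
WAIT_OK; with { 100, 2, 0 } it returns WAIT_GONE; with { 1000, -1, 0 }
it returns WAIT_TIMEOUT. The window between the status read and any
later access to the hardware is still there, as the thread points out.
The sketch only bounds how long a caller can be stuck.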
