Date: Fri, 04 May 2007 20:26:52 -0600
From: Scott Long <scottl@samsco.org>
To: John Baldwin <jhb@freebsd.org>
Cc: attilio@freebsd.org, freebsd-current@freebsd.org, Harald Schmalzbauer <h.schmalzbauer@omnisec.de>
Subject: Re: PANIC: blockable slep lock (sx) msi @ ....msi.c:374
Message-ID: <463BEB6C.90507@samsco.org>
In-Reply-To: <200705041748.56842.jhb@freebsd.org>
References: <463B7A1D.6020602@omnisec.de> <200705041637.38955.jhb@freebsd.org> <463B9C7E.2080901@samsco.org> <200705041748.56842.jhb@freebsd.org>

John Baldwin wrote:
>> Well, you were just using it as a hack around a WITNESS warning ;-) I
>> think it's OK for memory allocations to fail in this kind of code, so
>> long as the failure is propagated to the caller.
>
> Do you really expect bus_alloc_resource()-type things to fail to attach a
> driver instead of waiting for the system to free up some memory? Most of
> that sort of thing is quite resilient right now, and I'm hesitant to make the
> system start breaking things instead of waiting when memory runs low.
>

That's actually a very good question. Most core newbus and resource
list functions already fail for lack of memory, so any guarantees about
device discovery, probe, and attach are already inconsistent. However,
making guarantees is perfectly fine; I don't have a problem with that.
But along with that, clients need to know what to expect from these
utility/infrastructure subsystems in terms of locking and blocking.
These subsystems also need to be conscious of being as consistent and
easy to work with as possible, in line with the guarantees that are
offered. The panic that you introduced is a perfect example of what
happens when there aren't clear definitions and guarantees, and that's
what I'm most concerned about fixing.

I think it's good practice to have these subsystems do the following:

1. Do not hold private locks over calls to client subsystems.
2. Avoid blocking and sleeping except where specifically designed and
   documented to do so.
3. Report all errors up to the caller, avoiding panics where possible.

If you think of kernel infrastructure as being similar to userland
libraries, where both provide services and utilities to client code,
then these rules make a lot of sense. As Giant gets removed from more
of this code, better care and planning become much more important.

Scott
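
The three rules above can be sketched in userland C, taking the library analogy literally. This is only an illustration under stated assumptions, not code from newbus or the MSI layer: `struct registry`, `registry_add()`, `registry_notify()`, the injectable `alloc_fn`, and the helper callbacks are all hypothetical names, and a pthread mutex stands in for a subsystem's private lock. In the kernel, the comparable choice for rule 2 would be a non-sleeping allocation such as malloc(9) with M_NOWAIT.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* All names here are hypothetical; none are FreeBSD kernel interfaces. */
typedef int (*client_cb)(void *arg);
/* Must return memory that free() can release, or NULL on failure. */
typedef void *(*alloc_fn)(size_t size);

struct entry {
    client_cb     cb;
    void         *arg;
    struct entry *next;
};

struct registry {
    pthread_mutex_t lock;   /* private lock: never held across a callback */
    alloc_fn        alloc;  /* injectable so allocation failure can be tested */
    struct entry   *head;
    size_t          count;
};

int
registry_init(struct registry *r, alloc_fn alloc)
{
    r->alloc = (alloc != NULL) ? alloc : malloc;
    r->head = NULL;
    r->count = 0;
    return (pthread_mutex_init(&r->lock, NULL));
}

/* Rules 2 and 3: allocate before taking the lock; report ENOMEM upward. */
int
registry_add(struct registry *r, client_cb cb, void *arg)
{
    struct entry *e = r->alloc(sizeof(*e));

    if (e == NULL)
        return (ENOMEM);
    e->cb = cb;
    e->arg = arg;
    pthread_mutex_lock(&r->lock);
    e->next = r->head;      /* newest entry goes to the front of the list */
    r->head = e;
    r->count++;
    pthread_mutex_unlock(&r->lock);
    return (0);
}

/* Rule 1: snapshot the entries under the lock, then call clients unlocked. */
int
registry_notify(struct registry *r)
{
    struct entry *e, *snap;
    size_t i, n, want;
    int error = 0, rc;

    pthread_mutex_lock(&r->lock);
    want = r->count;
    pthread_mutex_unlock(&r->lock);
    if (want == 0)
        return (0);
    snap = r->alloc(want * sizeof(*snap));
    if (snap == NULL)
        return (ENOMEM);

    pthread_mutex_lock(&r->lock);
    for (n = 0, e = r->head; e != NULL && n < want; e = e->next)
        snap[n++] = *e;
    pthread_mutex_unlock(&r->lock);

    for (i = 0; i < n; i++) {
        rc = snap[i].cb(snap[i].arg);   /* the lock is not held here */
        if (rc != 0 && error == 0)
            error = rc;                 /* remember the first failure */
    }
    free(snap);
    return (error);
}

/* Example helpers: a failing allocator, a counting client, a failing client. */
void *alloc_fail(size_t size) { (void)size; return (NULL); }
int count_cb(void *arg) { ++*(int *)arg; return (0); }
int fail_cb(void *arg) { (void)arg; return (EIO); }
```

A caller that registers `count_cb` and `fail_cb` and then calls `registry_notify()` gets the client's error code back as the return value, and a registry initialized with `alloc_fail` returns ENOMEM from `registry_add()` instead of blocking or aborting. Because the callbacks run with the lock dropped, a client can call back into the registry without deadlocking. The cost is that entries added after the snapshot are not seen until the next notification.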