Date: Wed, 27 Sep 2000 14:33:18 +1100
From: Greg Lehey <grog@lemis.com>
To: Alfred Perlstein <bright@wintelcom.net>
Cc: Matt Dillon <dillon@earth.backplane.com>, Daniel Eischen <eischen@vigrid.com>, John Polstra <jdp@polstra.com>, arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores
Message-ID: <20000927143318.H7583@sydney.worldwide.lemis.com>
In-Reply-To: <20000925143853.J9141@fw.wintelcom.net>; from bright@wintelcom.net on Mon, Sep 25, 2000 at 02:38:54PM -0700
References: <Pine.SUN.3.91.1000925055843.15658A-100000@pcnet1.pcnet.com> <200009252123.e8PLN5F84806@earth.backplane.com> <20000925143853.J9141@fw.wintelcom.net>
On Monday, 25 September 2000 at 14:38:54 -0700, Alfred Perlstein wrote:
> * Matt Dillon <dillon@earth.backplane.com> [000925 14:23] wrote:
>>>
>>> Mutexes should protect data.  If you want to allow recursive ownership of
>>> data, then keep your own owner and ref count field in the protected data
>>> and use the mutex properly (release it after setting the owner or
>>> incrementing the ref count).  You don't need to hold the mutex, and
>>> now you can use the same mutex for msleep/cv_wait.
>>>
>>> --
>>> Dan Eischen
>>
>> Mutexes protect data *CONSISTENCY*, not data.  There is a big difference.
>> Probably 95% of the kernel assumes data consistency throughout any given
>> routine.  If that routine must call other routines (and most do), then
>> you have a major issue to contend with in regards to how to maintain
>> consistency across the call.
>>
>> There are several ways to deal with it:
>>
>> * The subroutine calls are not allowed to block - lots of examples of
>>   this in the VM and other subsystems.
>>
>> * You use a heavy-weight lock instead of a mutex - an example
>>   of this would be the VFS subsystem (vnode locks).
>>
>> * You engineer the code to allow data to change out from under
>>   it at certain points (such as when something blocks) - probably
>>   the best example is vm_fault in the VM subsystem.
>>
>> Unfortunately, all but the first can lead to serious bugs.  Consider
>> how many bugs have been fixed in the VFS and VM subsystems just in the
>> last year that have been related to data consistency issues and you'll
>> understand.
>>
>> The first issue - not allowing a subroutine call to block, when such a
>> case exists, is the perfect place to put a recursive mutex.
>> If you don't use a recursive mutex at that point then you wind up having
>> to reengineer and rewrite big pieces of the code, or you wind up writing
>> lots of little tag routines to do end-runs around the mutexes or to
>> pass a flag that indicates that the mutex is already held and should
>> not be obtained again, and so forth.
>>
>> Remember, I'm not talking about subsystem A calling subsystem B here,
>> I'm talking about subsystem A calling itself.  That is, a situation
>> where you are not obtaining several different mutexes but are instead
>> obtaining the same mutex several times.
>>
>> Frankly, fewer bugs will be introduced into the code by avoiding the
>> reengineering and using recursive mutexes at appropriate points.
>
> What's pissing me off here (not to pick on you Matt) is that there's
> honestly a lot of code to be worked on where the locking issues are
> pretty simple (especially when you look at how BSD/OS implemented
> it).

Hmm.  I was firmly in the "recursion is sloppiness" camp, but after
reading this thread I'm no longer so convinced.  I need to think about
it.  But showing examples where it makes sense doesn't mean it makes
sense everywhere, and I would at least say "unnecessary recursion is
sloppiness".  I think you're looking at the unnecessary cases.

> We should be coding and discussing existing problems with making the
> kernel MPsafe instead of what we *might* come across along the road.

I certainly think that at the moment we should be thinking about
structure rather than details.

> Whatever we bump into we can always beat to a pulp using lockmgr. :)

Well, can anybody put up good arguments for keeping lockmgr in the
long term?  I'm not saying there aren't any, but I haven't analysed it
enough yet.

> And honestly, I don't like the idea of recursive mutexes, I'd rather
> have a super function that locks a pgrp like
> pg_signal_locked/_unlocked which expects the locks to be held rather
> than a recursive lock.
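
[Editorial sketch, not part of the original message.]  The
"_locked/_unlocked" convention Alfred describes can be illustrated
roughly as follows.  This is a userland approximation using POSIX
threads, not FreeBSD kernel code.  The struct, its fields, and the
helper pg_add_and_signal are hypothetical, and reading "_unlocked" as
"the caller does not hold the lock yet" is an assumption about what was
meant.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical process-group record; NOT the FreeBSD struct pgrp. */
struct pgrp_sketch {
	pthread_mutex_t lock;     /* plain, non-recursive mutex */
	int             members;  /* number of processes in the group */
	int             pending;  /* signals posted so far */
};

static void
pgrp_sketch_init(struct pgrp_sketch *pg, int members)
{
	pthread_mutex_init(&pg->lock, NULL);
	pg->members = members;
	pg->pending = 0;
}

/* Caller must already hold pg->lock. */
static void
pg_signal_locked(struct pgrp_sketch *pg)
{
	pg->pending += pg->members;  /* one signal per member */
}

/* Caller must NOT hold pg->lock: take it, do the work, drop it. */
static void
pg_signal_unlocked(struct pgrp_sketch *pg)
{
	pthread_mutex_lock(&pg->lock);
	pg_signal_locked(pg);
	pthread_mutex_unlock(&pg->lock);
}

/*
 * A routine in the same subsystem that already holds the lock calls the
 * _locked variant directly, so the mutex is never acquired twice.
 */
static void
pg_add_and_signal(struct pgrp_sketch *pg)
{
	pthread_mutex_lock(&pg->lock);
	pg->members++;
	pg_signal_locked(pg);
	pthread_mutex_unlock(&pg->lock);
}
```

The cost of this style is that every caller has to know which variant
applies to it, which is presumably the kind of understanding of the
system the reply below refers to.
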

I think that eliminating recursion requires you to understand the
system much better, which brings both advantages and disadvantages.

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message
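
[Editorial sketch, not part of the original message.]  Dan Eischen's
suggestion quoted at the top of the message (keep an owner and a ref
count in the protected data, and hold the mutex only while updating
them) might look something like this in userland C with POSIX threads.
All names here are hypothetical, and a condition variable stands in for
the kernel's msleep/cv_wait.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical record; the mutex guards only owner and refs. */
struct owned_data {
	pthread_mutex_t mtx;       /* held briefly, never across calls */
	pthread_cond_t  released;  /* signalled when refs drops to zero */
	pthread_t       owner;     /* meaningful only while refs > 0 */
	int             refs;      /* nesting depth of the current owner */
	int             value;     /* payload the owner may modify */
};

static void
od_init(struct owned_data *d)
{
	pthread_mutex_init(&d->mtx, NULL);
	pthread_cond_init(&d->released, NULL);
	d->refs = 0;
	d->value = 0;
}

/* Take (or re-take) ownership; sleeps while another thread owns it. */
static void
od_acquire(struct owned_data *d)
{
	pthread_t self = pthread_self();

	pthread_mutex_lock(&d->mtx);
	while (d->refs > 0 && !pthread_equal(d->owner, self))
		pthread_cond_wait(&d->released, &d->mtx);
	d->owner = self;
	d->refs++;
	pthread_mutex_unlock(&d->mtx);  /* mutex is NOT held while owning */
}

/* Drop one level of ownership; wake a waiter when fully released. */
static void
od_release(struct owned_data *d)
{
	pthread_mutex_lock(&d->mtx);
	assert(d->refs > 0 && pthread_equal(d->owner, pthread_self()));
	if (--d->refs == 0)
		pthread_cond_signal(&d->released);
	pthread_mutex_unlock(&d->mtx);
}
```

The recursion lives in the data rather than in the mutex: the same
thread may call od_acquire repeatedly, while the mutex itself is only
ever taken once at a time and can double as the lock paired with the
condition variable.
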