Date: Wed, 27 Sep 2000 14:33:18 +1100
From: Greg Lehey
To: Alfred Perlstein
Cc: Matt Dillon, Daniel Eischen, John Polstra, arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores
Message-ID: <20000927143318.H7583@sydney.worldwide.lemis.com>
References: <200009252123.e8PLN5F84806@earth.backplane.com> <20000925143853.J9141@fw.wintelcom.net>
In-Reply-To: <20000925143853.J9141@fw.wintelcom.net>; from bright@wintelcom.net on Mon, Sep 25, 2000 at 02:38:54PM -0700
Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-418-838-708
WWW-Home-Page: http://www.lemis.com/~grog
X-PGP-Fingerprint: 6B 7B C3 8C 61 CD 54 AF 13 24 52 F8 6D A4 95 EF

On Monday, 25 September 2000 at 14:38:54 -0700, Alfred Perlstein wrote:
> * Matt Dillon [000925 14:23] wrote:
>>>
>>> Mutexes should protect data.  If you want to allow recursive ownership of
>>> data, then keep your own owner and ref count field in the protected data
>>> and use the mutex properly (release it after setting the owner or
>>> incrementing the ref count).  You don't need to hold the mutex, and
>>> now you can use the same mutex for msleep/cv_wait.
>>>
>>> --
>>> Dan Eischen
>>
>> Mutexes protect data *CONSISTENCY*, not data.  There is a big difference.
>> Probably 95% of the kernel assumes data consistency throughout any given
>> routine.  If that routine must call other routines (and most do), then
>> you have a major issue to contend with in regards to how to maintain
>> consistency across the call.
>>
>> There are several ways to deal with it:
>>
>>     * The subroutine calls are not allowed to block - lots of examples of
>>       this in the VM and other subsystems.
>>
>>     * You use a heavy-weight lock instead of a mutex - an example
>>       of this would be the VFS subsystem (vnode locks).
>>
>>     * You engineer the code to allow data to change out from under
>>       it at certain points (such as when something blocks) - probably
>>       the best example is vm_fault in the VM subsystem.
>>
>> Unfortunately, all but the first can lead to serious bugs.  Consider
>> how many bugs have been fixed in the VFS and VM subsystems just in the
>> last year that have been related to data consistency issues and you'll
>> understand.
>>
>> The first issue - not allowing a subroutine call to block, when such a
>> case exists, is the perfect place to put a recursive mutex.  If you don't
>> use a recursive mutex at that point then you wind up having to
>> reengineer and rewrite big pieces of the code, or you wind up writing
>> lots of little tag routines to do end-runs around the mutexes or to
>> pass a flag that indicates that the mutex is already held and should
>> not be obtained again, and so forth.
>>
>> Remember, I'm not talking about subsystem A calling subsystem B here,
>> I'm talking about subsystem A calling itself.  That is, a situation
>> where you are not obtaining several different mutexes but are instead
>> obtaining the same mutex several times.
>>
>> Frankly, fewer bugs will be introduced into the code by avoiding the
>> reengineering and using recursive mutexes at appropriate points.
>
> What's pissing me off here (not to pick on you Matt) is that there's
> honestly a lot of code to be worked on where the locking issues are
> pretty simple (especially when you look at how BSD/OS implemented
> it).

Hmm.  I was firmly in the "recursion is sloppiness" camp, but after
reading this thread I'm no longer so convinced.  I need to think about
it.  But showing examples where it makes sense doesn't mean it makes
sense everywhere, and I would at least say "unnecessary recursion is
sloppiness".  I think you're looking at the unnecessary cases.

> We should be coding and discussing existing problems with making the
> kernel MPsafe instead of what we *might* come across along the road.

I certainly think that at the moment we should be thinking about
structure rather than details.

> Whatever we bump into we can always beat to a pulp using lockmgr. :)

Well, can anybody put up good arguments for keeping lockmgr in the
long term?  I'm not saying there aren't any, but I haven't analysed it
enough yet.

> And honestly, I don't like the idea of recursive mutexes, I'd rather
> have a super function that locks a pgrp like
> pg_signal_locked/_unlocked which expects the locks to be held rather
> than a recursive lock.

I think that eliminating recursion requires you to understand the
system much better, which brings both advantages and disadvantages.

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers
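
To make the pg_signal_locked/_unlocked idea quoted above a little more
concrete, here is a minimal userland sketch, assuming POSIX threads
rather than the kernel mutex code.  Everything in it is hypothetical:
struct pgrp_demo, its fields, the lock_held debug flag and
pg_add_member_and_signal() are invented for illustration, and reading
"_unlocked" as "the caller does not yet hold the lock" is my
interpretation, not something the thread spells out.

```c
#include <assert.h>
#include <pthread.h>

#define PG_MAXMEMBERS 8

/* Hypothetical stand-in for a process group; not the kernel's struct pgrp. */
struct pgrp_demo {
	pthread_mutex_t lock;           /* protects every field below */
	int lock_held;                  /* debug flag, set while lock is held */
	int nmembers;
	int pending[PG_MAXMEMBERS];     /* last signal posted to each member */
};

/* Caller must already hold pg->lock; this function never takes it. */
void
pg_signal_locked(struct pgrp_demo *pg, int sig)
{
	assert(pg->lock_held);          /* catch callers that forgot the lock */
	for (int i = 0; i < pg->nmembers; i++)
		pg->pending[i] = sig;
}

/* Caller must NOT hold pg->lock; take it, do the work, drop it. */
void
pg_signal_unlocked(struct pgrp_demo *pg, int sig)
{
	pthread_mutex_lock(&pg->lock);
	pg->lock_held = 1;
	pg_signal_locked(pg, sig);
	pg->lock_held = 0;
	pthread_mutex_unlock(&pg->lock);
}

/*
 * A routine in the same subsystem that needs the lock for its own update.
 * It calls the _locked variant directly, so the plain (non-recursive)
 * mutex is acquired exactly once.
 */
void
pg_add_member_and_signal(struct pgrp_demo *pg, int sig)
{
	pthread_mutex_lock(&pg->lock);
	pg->lock_held = 1;
	if (pg->nmembers < PG_MAXMEMBERS)
		pg->nmembers++;
	pg_signal_locked(pg, sig);      /* no second acquisition needed */
	pg->lock_held = 0;
	pthread_mutex_unlock(&pg->lock);
}
```

The price is the one Matt describes: every entry point comes in two
flavours, and each caller has to know which one applies.  What you get
in return is that the locking state is explicit at every call site, and
a caller that gets it wrong trips the assertion instead of silently
recursing.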