Date:      Wed, 27 Sep 2000 14:33:18 +1100
From:      Greg Lehey <grog@lemis.com>
To:        Alfred Perlstein <bright@wintelcom.net>
Cc:        Matt Dillon <dillon@earth.backplane.com>, Daniel Eischen <eischen@vigrid.com>, John Polstra <jdp@polstra.com>, arch@FreeBSD.ORG
Subject:   Re: Mutexes and semaphores
Message-ID:  <20000927143318.H7583@sydney.worldwide.lemis.com>
In-Reply-To: <20000925143853.J9141@fw.wintelcom.net>; from bright@wintelcom.net on Mon, Sep 25, 2000 at 02:38:54PM -0700
References:  <Pine.SUN.3.91.1000925055843.15658A-100000@pcnet1.pcnet.com> <200009252123.e8PLN5F84806@earth.backplane.com> <20000925143853.J9141@fw.wintelcom.net>

On Monday, 25 September 2000 at 14:38:54 -0700, Alfred Perlstein wrote:
> * Matt Dillon <dillon@earth.backplane.com> [000925 14:23] wrote:
>>>
>>> Mutexes should protect data.  If you want to allow recursive ownership of
>>> data, then keep your own owner and ref count field in the protected data
>>> and use the mutex properly (release it after setting the owner or
>>> incrementing the ref count).  You don't need to hold the mutex, and
>>> now you can use the same mutex for msleep/cv_wait.
>>>
>>> --
>>> Dan Eischen
>>
>>     Mutexes protect data *CONSISTENCY*, not data.  There is a big difference.
>>     Probably 95% of the kernel assumes data consistency throughout any given
>>     routine.  If that routine must call other routines (and most do), then
>>     you have a major issue to contend with in regards to how to maintain
>>     consistency across the call.
>>
>>     There are several ways to deal with it:
>>
>> 	* The subroutine calls are not allowed to block - lots of examples of
>> 	  this in the VM and other subsystems.
>>
>> 	* You use a heavy-weight lock instead of a mutex - an example
>> 	  of this would be the VFS subsystem (vnode locks).
>>
>> 	* You engineer the code to allow data to change out from under
>> 	  it at certain points (such as when something blocks) - probably
>> 	  the best example is vm_fault in the VM subsystem.
>>
>>     Unfortunately, all but the first can lead to serious bugs.  Consider
>>     how many bugs have been fixed in the VFS and VM subsystems just in the
>>     last year that have been related to data consistency issues and you'll
>>     understand.
>>
>>     The first issue - not allowing a subroutine call to block, when such a
>>     case exists, is the perfect place to put a recursive mutex.  If you don't
>>     use a recursive mutex at that point then you wind up having to
>>     reengineer and rewrite big pieces of the code, or you wind up writing
>>     lots of little tag routines to do end-runs around the mutexes or to
>>     pass a flag that indicates that the mutex is already held and should
>>     not be obtained again, and so forth.
>>
>>     Remember, I'm not talking about subsystem A calling subsystem B here,
>>     I'm talking about subsystem A calling itself.  That is, a situation
>>     where you are not obtaining several different mutexes but are instead
>>     obtaining the same mutex several times.
>>
>>     Frankly, fewer bugs will be introduced into the code by avoiding the
>>     reengineering and using recursive mutexes at appropriate points.
>
> What's pissing me off here (not to pick on you Matt) is that there's
> honestly a lot of code to be worked on where the locking issues are
> pretty simple (especially when you look at how BSD/OS implemented
> it).

Hmm.  I was firmly in the "recursion is sloppiness" camp, but after
reading this thread I'm no longer so convinced.  I need to think about
it.  But showing examples where it makes sense doesn't mean it makes
sense everywhere, and I would at least say "unnecessary recursion is
sloppiness".  I think you're looking at the unnecessary cases.
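
To make the case Matt describes concrete (subsystem A calling itself
while already holding its own mutex), here is a minimal userland
sketch.  It uses POSIX threads rather than the kernel mutex code, and
struct subsys, subsys_add and subsys_add_twice are hypothetical names
invented for the illustration; nothing here is taken from the tree.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>

/* Hypothetical subsystem state; all names here are illustrative. */
struct subsys {
	pthread_mutex_t	lock;	/* recursive: the owner may re-acquire it */
	int		count;	/* data whose consistency the lock protects */
};

/* Initialise the lock as a recursive mutex.  Returns 0 on success. */
int
subsys_init(struct subsys *s)
{
	pthread_mutexattr_t attr;
	int error;

	s->count = 0;
	error = pthread_mutexattr_init(&attr);
	if (error != 0)
		return (error);
	error = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
	if (error == 0)
		error = pthread_mutex_init(&s->lock, &attr);
	pthread_mutexattr_destroy(&attr);
	return (error);
}

/* Inner routine: also an entry point, so it takes the lock itself. */
void
subsys_add(struct subsys *s, int n)
{
	pthread_mutex_lock(&s->lock);
	s->count += n;
	pthread_mutex_unlock(&s->lock);
}

/* Outer routine: holds the lock across calls back into the subsystem. */
int
subsys_add_twice(struct subsys *s, int n)
{
	int result;

	pthread_mutex_lock(&s->lock);
	subsys_add(s, n);	/* re-acquires s->lock; allowed when recursive */
	subsys_add(s, n);
	result = s->count;	/* no other thread got in between the calls */
	pthread_mutex_unlock(&s->lock);
	return (result);
}
```

With a default (non-recursive) mutex the inner pthread_mutex_lock()
would deadlock or fail, which is exactly the point where you either
accept recursion or restructure the code.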

> We should be coding and discussing existing problems with making the
> kernel MPsafe instead of what we *might* come across along the road.

I certainly think that at the moment we should be thinking about
structure rather than details.

> Whatever we bump into we can always beat to a pulp using lockmgr. :)

Well, can anybody put up good arguments for keeping lockmgr in the
long term?  I'm not saying there aren't any, but I haven't analysed it
enough yet.

> And honestly, I don't like the idea of recursive mutexes, I'd rather
> have a super function that locks a pgrp like
> pg_signal_locked/_unlocked which expects the locks to be held rather
> than a recursive lock.

I think that eliminating recursion requires you to understand the
system much better, which brings both advantages and disadvantages.
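
For comparison, the alternative Alfred describes might look roughly
like the sketch below.  Again this is userland POSIX threads, not
kernel code.  struct pgrp_sketch and pg_signal_pair are hypothetical,
and I am assuming that "_locked" means "the caller already holds the
lock" and "_unlocked" means "the caller does not".

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a process group, not the kernel's pgrp. */
struct pgrp_sketch {
	pthread_mutex_t	lock;		/* plain, non-recursive mutex */
	int		nsignals;	/* signals posted so far */
	int		last_sig;	/* most recent signal number */
};

/* Returns 0 on success. */
int
pgrp_sketch_init(struct pgrp_sketch *pg)
{
	pg->nsignals = 0;
	pg->last_sig = 0;
	return (pthread_mutex_init(&pg->lock, NULL));
}

/* Caller must already hold pg->lock. */
void
pg_signal_locked(struct pgrp_sketch *pg, int sig)
{
	pg->last_sig = sig;
	pg->nsignals++;
}

/* Caller must not hold pg->lock; wraps the locked variant. */
void
pg_signal_unlocked(struct pgrp_sketch *pg, int sig)
{
	pthread_mutex_lock(&pg->lock);
	pg_signal_locked(pg, sig);
	pthread_mutex_unlock(&pg->lock);
}

/* A routine that holds the lock anyway and posts two signals under it. */
int
pg_signal_pair(struct pgrp_sketch *pg, int first, int second)
{
	int n;

	pthread_mutex_lock(&pg->lock);
	pg_signal_locked(pg, first);	/* no second acquisition needed */
	pg_signal_locked(pg, second);
	n = pg->nsignals;
	pthread_mutex_unlock(&pg->lock);
	return (n);
}
```

No recursion is needed, but every caller has to know which variant is
correct in its context.  That is the "understand the system much
better" cost, and also the benefit.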

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers





