From owner-freebsd-arch  Mon Sep 25 14:23:26 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from earth.backplane.com (placeholder-dcat-1076843290.broadbandoffice.net [64.47.83.26])
	by hub.freebsd.org (Postfix) with ESMTP id 43DAE37B42C
	for <arch@FreeBSD.ORG>; Mon, 25 Sep 2000 14:23:22 -0700 (PDT)
Received: (from dillon@localhost)
	by earth.backplane.com (8.11.0/8.9.3) id e8PLN5F84806;
	Mon, 25 Sep 2000 14:23:05 -0700 (PDT)
	(envelope-from dillon)
Date: Mon, 25 Sep 2000 14:23:05 -0700 (PDT)
From: Matt Dillon <dillon@earth.backplane.com>
Message-Id: <200009252123.e8PLN5F84806@earth.backplane.com>
To: Daniel Eischen <eischen@vigrid.com>
Cc: John Polstra <jdp@polstra.com>, arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores
References:  <Pine.SUN.3.91.1000925055843.15658A-100000@pcnet1.pcnet.com>
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

:
:Mutexes should protect data.  If you want to allow recursive ownership of
:data, then keep your own owner and ref count field in the protected data
:and use the mutex properly (release it after setting the owner or 
:incrementing the ref count).  You don't need to hold the mutex, and
:now you can use the same mutex for msleep/cv_wait.
:
:-- 
:Dan Eischen
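
    For concreteness, the scheme described above might look roughly like
    the following userland sketch using POSIX threads.  The structure and
    function names (struct object, object_acquire, object_release) are
    made up for illustration and are not from any existing kernel code.
    The mutex is held only long enough to update the owner and ref count.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical object: the mutex guards only the ownership fields. */
struct object {
    pthread_mutex_t mtx;
    pthread_cond_t  cv;
    int             owned;   /* nonzero while some thread owns the data */
    pthread_t       owner;   /* valid only while owned != 0 */
    int             refs;    /* recursion depth of the owner */
    int             value;   /* the data being protected */
};

static struct object obj = {
    .mtx = PTHREAD_MUTEX_INITIALIZER,
    .cv  = PTHREAD_COND_INITIALIZER,
};

/* Take (possibly recursive) ownership; the mutex is dropped on return. */
static void object_acquire(struct object *o)
{
    pthread_t self = pthread_self();

    pthread_mutex_lock(&o->mtx);
    if (o->owned && pthread_equal(o->owner, self)) {
        o->refs++;                      /* recursive acquire by the owner */
    } else {
        while (o->owned)                /* wait for the current owner */
            pthread_cond_wait(&o->cv, &o->mtx);
        o->owned = 1;
        o->owner = self;
        o->refs = 1;
    }
    pthread_mutex_unlock(&o->mtx);
}

/* Drop one level of ownership and wake a waiter when fully released. */
static void object_release(struct object *o)
{
    pthread_mutex_lock(&o->mtx);
    assert(o->owned && pthread_equal(o->owner, pthread_self()));
    assert(o->refs > 0);
    if (--o->refs == 0) {
        o->owned = 0;
        pthread_cond_signal(&o->cv);
    }
    pthread_mutex_unlock(&o->mtx);
}
```

    Note that the same mutex can be handed to pthread_cond_wait() because
    it is never held across the calls made by the owner.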

    Mutexes protect data *CONSISTENCY*, not data.  There is a big difference.
    Probably 95% of the kernel assumes data consistency throughout any given
    routine.  If that routine must call other routines (and most do), then
    you have a major issue to contend with: how to maintain
    consistency across the call.

    There are several ways to deal with it:

	* The subroutine calls are not allowed to block - lots of examples of
	  this in the VM and other subsystems.

	* You use a heavy-weight lock instead of a mutex - an example
	  of this would be the VFS subsystem (vnode locks).

	* You engineer the code to allow data to change out from under
	  it at certain points (such as when something blocks) - probably
	  the best example is vm_fault in the VM subsystem.
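
    To illustrate the third approach, here is a minimal userland sketch
    in which the code snapshots a generation count, drops the mutex
    across a call that may block, and then revalidates and retries if
    the data changed.  This is NOT how vm_fault is actually written; the
    names (map_mtx, map_generation, blocking_io, lookup_with_retry) are
    invented for the example, and blocking_io() merely simulates another
    thread modifying the data.

```c
#include <assert.h>
#include <stddef.h>
#include <pthread.h>

static pthread_mutex_t map_mtx = PTHREAD_MUTEX_INITIALIZER;
static unsigned        map_generation;  /* bumped on every modification */
static int             map_value;

/*
 * Stand-in for an operation that may block.  When simulate_change is
 * nonzero it modifies the data, as another thread might while we sleep.
 */
static void blocking_io(int simulate_change)
{
    if (simulate_change) {
        pthread_mutex_lock(&map_mtx);
        map_value += 100;
        map_generation++;
        pthread_mutex_unlock(&map_mtx);
    }
}

/*
 * Read map_value across a blocking call.  'changes' is how many times
 * the simulated blocking call should modify the data.  Returns the
 * number of retries that were needed.
 */
static int lookup_with_retry(int *out, int changes)
{
    int retries = 0;

    assert(out != NULL);
    for (;;) {
        unsigned gen;
        int snapshot;

        pthread_mutex_lock(&map_mtx);
        gen = map_generation;
        snapshot = map_value;
        pthread_mutex_unlock(&map_mtx);     /* data may change from here */

        blocking_io(changes > 0);
        if (changes > 0)
            changes--;

        pthread_mutex_lock(&map_mtx);
        if (gen == map_generation) {        /* nothing changed: commit */
            *out = snapshot;
            pthread_mutex_unlock(&map_mtx);
            return retries;
        }
        pthread_mutex_unlock(&map_mtx);     /* stale snapshot: retry */
        retries++;
    }
}
```

    Every point where the mutex is dropped needs a revalidation step
    like this one, which is exactly where the bugs tend to creep in.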

    Unfortunately, all but the first can lead to serious bugs.  Consider
    how many bugs have been fixed in the VFS and VM subsystems just in the
    last year that have been related to data consistency issues and you'll
    understand.

    The first case - not allowing a subroutine call to block - is, where
    it applies, the perfect place to put a recursive mutex.  If you don't
    use a recursive mutex at that point then you wind up having to
    reengineer and rewrite big pieces of the code, or you wind up writing
    lots of little tag routines to do end-runs around the mutexes or to
    pass a flag that indicates that the mutex is already held and should
    not be obtained again, and so forth.
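
    As a rough illustration of the flag-passing workaround, consider this
    userland sketch with a plain (non-recursive) POSIX mutex.  The names
    (cache_mtx, cache_insert, cache_mark_dirty) are hypothetical.  Every
    helper that might be reached with the mutex held grows a 'locked'
    argument, and every caller has to get it right.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical subsystem state guarded by one non-recursive mutex. */
static pthread_mutex_t cache_mtx = PTHREAD_MUTEX_INITIALIZER;
static int cache_entries;
static int cache_dirty;

/*
 * Helper that is called both from outside the subsystem (locked == 0)
 * and from inside it with cache_mtx already held (locked != 0).
 */
static void cache_mark_dirty(int locked)
{
    if (!locked)
        pthread_mutex_lock(&cache_mtx);
    cache_dirty = 1;
    if (!locked)
        pthread_mutex_unlock(&cache_mtx);
}

/* Entry point: takes the mutex, then calls back into the subsystem. */
static int cache_insert(void)
{
    int n;

    pthread_mutex_lock(&cache_mtx);
    cache_entries++;
    cache_mark_dirty(1);    /* passing 0 here would self-deadlock */
    assert(cache_entries > 0);
    n = cache_entries;
    pthread_mutex_unlock(&cache_mtx);
    return n;
}
```

    Passing the wrong flag value either deadlocks or modifies the data
    unprotected, and the compiler will not catch either mistake.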

    Remember, I'm not talking about subsystem A calling subsystem B here,
    I'm talking about subsystem A calling itself.  That is, a situation
    where you are not obtaining several different mutexes but are instead
    obtaining the same mutex several times.
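
    A minimal sketch of that situation, again in userland C using a POSIX
    recursive mutex (PTHREAD_MUTEX_RECURSIVE) as a stand-in for whatever
    the kernel primitive ends up being.  The table_* names are invented
    for the example.  The helper simply takes the mutex itself, whether
    or not its caller already holds it.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for PTHREAD_MUTEX_RECURSIVE on strict builds */
#endif
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t table_mtx;
static int table_count;
static int table_generation;

/* Initialize table_mtx as a recursive mutex; returns 0 on success. */
static int table_init(void)
{
    pthread_mutexattr_t attr;
    int error;

    error = pthread_mutexattr_init(&attr);
    if (error != 0)
        return error;
    error = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (error == 0)
        error = pthread_mutex_init(&table_mtx, &attr);
    pthread_mutexattr_destroy(&attr);
    return error;
}

/* Callable from anywhere, with or without table_mtx already held. */
static void table_bump_generation(void)
{
    pthread_mutex_lock(&table_mtx);
    table_generation++;
    pthread_mutex_unlock(&table_mtx);
}

/* Subsystem A calling itself: the same mutex is obtained twice. */
static int table_add(void)
{
    int n;

    pthread_mutex_lock(&table_mtx);
    table_count++;
    table_bump_generation();    /* re-acquires table_mtx recursively */
    assert(table_count > 0);
    n = table_count;
    pthread_mutex_unlock(&table_mtx);
    return n;
}
```

    No flag arguments and no duplicated "already locked" variants of the
    helper are needed, and the data stays consistent across the call.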

    Frankly, fewer bugs will be introduced into the code by avoiding the
    reengineering and using recursive mutexes at appropriate points.

					-Matt

