From owner-freebsd-hackers@FreeBSD.ORG  Wed Dec 20 12:49:28 2006
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
X-Original-To: freebsd-hackers@freebsd.org
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 0C00116A403
	for <freebsd-hackers@freebsd.org>; Wed, 20 Dec 2006 12:49:28 +0000 (UTC)
	(envelope-from asmrookie@gmail.com)
Received: from ug-out-1314.google.com (ug-out-1314.google.com [66.249.92.174])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 3F0AD43CBC
	for <freebsd-hackers@freebsd.org>; Wed, 20 Dec 2006 12:48:52 +0000 (GMT)
	(envelope-from asmrookie@gmail.com)
Received: by ug-out-1314.google.com with SMTP id o2so1917376uge
	for <freebsd-hackers@freebsd.org>; Wed, 20 Dec 2006 04:48:47 -0800 (PST)
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com;
	h=received:message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references:x-google-sender-auth;
	b=rt2hHfJmmnIklhaeLOsDUyvi+eTQrI2+raQUGEU+CHYRrGtDcySPZRrV+EBxayW12b6hgDLGp2w11lEKPKLJQV9k6h9gKhiV8JL/9coG9QOUDQ19F1MpxHsbx4tptCIpXS4g/X7wldO6oSDaZL0nnU3ZSUXCMznYlWgpqh64ZqA=
Received: by 10.82.113.6 with SMTP id l6mr1457173buc.1166616875268;
	Wed, 20 Dec 2006 04:14:35 -0800 (PST)
Received: by 10.82.178.4 with HTTP; Wed, 20 Dec 2006 04:14:35 -0800 (PST)
Message-ID: <3bbf2fe10612200414j4c1c01ecr7b37e956b70b01fa@mail.gmail.com>
Date: Wed, 20 Dec 2006 13:14:35 +0100
From: "Attilio Rao" <attilio@freebsd.org>
Sender: asmrookie@gmail.com
To: "Duane Whitty" <duane@dwlabs.ca>
In-Reply-To: <20061220041843.GA10511@dwpc.dwlabs.ca>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <20061220041843.GA10511@dwpc.dwlabs.ca>
X-Google-Sender-Auth: cff44e312d2a7a84
Cc: freebsd-hackers@freebsd.org
Subject: Re: Locking fundamentals
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 20 Dec 2006 12:49:28 -0000

2006/12/20, Duane Whitty <duane@dwlabs.ca>:
> Hello again,
>
> It seems to me that understanding locking holds the key to
> understanding fbsd internals.
>
> Could someone review my understanding of fbsd locking fundamentals.
> (No assertions here, just questions)
>
>     lock_mgr
> --------------------
>  mutexes|sx_lock
> -------------------    ^
> atomic | mem barriers  |

Our current locking hierarchy is actually rather different:

III level: lockmgr - sema - sx
II level: mutex (sleep/spin/pool) - rwlock - refcount - cv - msleep
I level: atomic instructions - memory barriers - sleepqueues/turnstiles

(A lower level means that the upper-layer primitives use it as a base,
e.g. sx locks are built using 1 pool mutex and 2 condition variables.)
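
To make the layering concrete, here is a minimal userland sketch of a
shared/exclusive lock built from one mutex and two condition variables.
It uses POSIX threads rather than the kernel primitives, and every
toy_sx_* name is hypothetical; it only illustrates the idea and is not
the actual sx implementation.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userland analogue of a shared/exclusive lock. */
struct toy_sx {
	pthread_mutex_t	mtx;		/* protects the fields below */
	pthread_cond_t	shared_cv;	/* shared waiters sleep here */
	pthread_cond_t	excl_cv;	/* exclusive waiters sleep here */
	int		count;		/* >0 sharers, -1 exclusive, 0 free */
	int		excl_waiters;	/* threads waiting for exclusive */
};

void
toy_sx_init(struct toy_sx *sx)
{
	pthread_mutex_init(&sx->mtx, NULL);
	pthread_cond_init(&sx->shared_cv, NULL);
	pthread_cond_init(&sx->excl_cv, NULL);
	sx->count = 0;
	sx->excl_waiters = 0;
}

void
toy_sx_slock(struct toy_sx *sx)
{
	pthread_mutex_lock(&sx->mtx);
	/* Wait while exclusively held or while a writer is queued. */
	while (sx->count < 0 || sx->excl_waiters > 0)
		pthread_cond_wait(&sx->shared_cv, &sx->mtx);
	sx->count++;
	pthread_mutex_unlock(&sx->mtx);
}

void
toy_sx_xlock(struct toy_sx *sx)
{
	pthread_mutex_lock(&sx->mtx);
	sx->excl_waiters++;
	/* Wait until there is no holder of any kind. */
	while (sx->count != 0)
		pthread_cond_wait(&sx->excl_cv, &sx->mtx);
	sx->excl_waiters--;
	sx->count = -1;
	pthread_mutex_unlock(&sx->mtx);
}

void
toy_sx_unlock(struct toy_sx *sx)
{
	pthread_mutex_lock(&sx->mtx);
	assert(sx->count != 0);		/* must be held by someone */
	if (sx->count < 0)
		sx->count = 0;		/* drop the exclusive hold */
	else
		sx->count--;		/* drop one shared hold */
	if (sx->count == 0) {
		if (sx->excl_waiters > 0)
			pthread_cond_signal(&sx->excl_cv);
		else
			pthread_cond_broadcast(&sx->shared_cv);
	}
	pthread_mutex_unlock(&sx->mtx);
}
```

Note that waiters go to sleep on a condition variable, which is why a
lock built this way falls in the "sleep" family rather than the "block"
one.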

This scheme is far from perfect due to the presence of 'level III
primitives', which should not exist at all.
Currently, there is an ongoing effort to move all the top-layer
primitives down to level II.

On the other hand, level I primitives should never be used directly by
kernel code; they should only serve as the bottom layer for
synchronizing primitives. All you need to care about is in levels II
and III (and, where possible, you should switch to level II).

> Don't lock if you don't need to.
> Lock only what you need to.
> Use the simplest lock that gets the job done.
> Don't drop locks prematurely because acquiring locks is expensive.
> When possible sleep rather than spin.
>
> ??????
> Memory barriers order operations
> Atomic operations complete without being interrupted
>
> Atomic operations and memory barriers are the primitives.
>
> Mutexes are implemented by atomic operations and memory barriers.
> Mutexes are relatively simple and inexpensive but may not recurse.
>
> Shared/exclusive locks are more versatile than mutexes in that they
> may be upgraded or downgraded from or to shared/exclusive and they
> may be acquired recursively.  More expensive than mutexes.
>
> lock_mgr locks are used when reference counting is needed
> ?????

We have a lot of different locks because each one is meant to be
optimized for a particular situation.
You have to divide our synchronizing primitives into 2 large families:
those which can be held across sleeps and those which can't.

The primitives of the former family put threads to sleep through the
sleepqueue interface, which is designed to cater for asynchronous
events in particular (e.g. condition variables). The latter, instead,
let the thread block through the turnstile interface, which is
designed to cater for synchronous events (e.g. mutexes).

Spin mutexes and refcounts escape this scheme because of their
completely different nature (a spin mutex just lets its waiters spin,
while a refcount is serialized through a memory barrier/atomic
instruction).
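
As a rough illustration of "waiters just spin", here is a minimal
userland sketch of a spin lock written with C11 atomics. The toy_spin_*
names are hypothetical and this is not how kernel spin mutexes are
implemented (those also have to deal with interrupts); it only shows
that a waiter never sleeps or blocks, it keeps retrying an atomic
operation.

```c
#include <stdatomic.h>

/* Hypothetical userland spin lock: waiters busy-wait on one flag. */
struct toy_spin {
	atomic_flag	locked;
};

void
toy_spin_init(struct toy_spin *s)
{
	atomic_flag_clear(&s->locked);
}

/* Returns non-zero if the lock was taken, zero if it was already held. */
int
toy_spin_trylock(struct toy_spin *s)
{
	return (!atomic_flag_test_and_set_explicit(&s->locked,
	    memory_order_acquire));
}

void
toy_spin_lock(struct toy_spin *s)
{
	/* Retry the atomic test-and-set until the flag was clear. */
	while (!toy_spin_trylock(s))
		;	/* spin */
}

void
toy_spin_unlock(struct toy_spin *s)
{
	/* Release ordering publishes the writes made under the lock. */
	atomic_flag_clear_explicit(&s->locked, memory_order_release);
}
```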

Going into specifics:

mutex (sleep/pool) -> block
rwlock -> block
cv -> sleep
msleep -> sleep
lockmgr -> sleep because of msleep
sema -> sleep because of cv
sx -> sleep because of cv

Your code should really use blocking primitives, but often this is not
possible due to the nature of what you are doing (possible sleeps), so
you are forced to use sleeping primitives.

Some tips:
- sx and rwlock do about the same thing; the main difference is that
one sleeps (sx) while the other blocks (rw)
- sema should be avoided due to its sub-optimal implementation
- lockmgr matters mainly for some special features it has
(LK_DRAIN and the interlock passing),
but its sub-optimal and messy implementation keeps it from being
popular
- you should never use spin mutexes, except when you are forced to
(fast interrupts)
- you should really use the refcount interface when needed, due to its
optimal implementation
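
To show why a reference count needs nothing heavier than an atomic
instruction plus a barrier, here is a minimal userland sketch using C11
atomics. The toy_ref_* names are hypothetical and this is not the
kernel's refcount code; it only mirrors the general idea of
init/acquire/release, with release reporting whether the last
reference was dropped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userland reference counter. */
struct toy_ref {
	atomic_uint	count;
};

void
toy_ref_init(struct toy_ref *r, unsigned int value)
{
	atomic_init(&r->count, value);
}

void
toy_ref_acquire(struct toy_ref *r)
{
	/* The caller already holds a reference, so relaxed is enough. */
	atomic_fetch_add_explicit(&r->count, 1, memory_order_relaxed);
}

/* Returns true when the caller dropped the last reference. */
bool
toy_ref_release(struct toy_ref *r)
{
	unsigned int old;

	/* Release ordering: our writes to the object happen first. */
	old = atomic_fetch_sub_explicit(&r->count, 1, memory_order_release);
	assert(old > 0);
	if (old == 1) {
		/* Acquire fence: see everyone else's writes before freeing. */
		atomic_thread_fence(memory_order_acquire);
		return (true);
	}
	return (false);
}
```

A typical caller frees the object only when toy_ref_release() returns
true; no lock is taken at any point.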

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein