From owner-freebsd-arch  Sun Sep 24  3:31:38 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from Awfulhak.org (tun.AwfulHak.org [194.242.139.173])
	by hub.freebsd.org (Postfix) with ESMTP
	id 97B8437B422; Sun, 24 Sep 2000 03:31:31 -0700 (PDT)
Received: from hak.lan.Awfulhak.org (root@hak.lan.awfulhak.org [172.16.0.12])
	by Awfulhak.org (8.11.0/8.11.0) with ESMTP id e8OASfC46009;
	Sun, 24 Sep 2000 11:28:41 +0100 (BST)
	(envelope-from brian@hak.lan.Awfulhak.org)
Received: from hak.lan.Awfulhak.org (brian@localhost [127.0.0.1])
	by hak.lan.Awfulhak.org (8.11.0/8.11.0) with ESMTP id e8OAQVx26206;
	Sun, 24 Sep 2000 11:26:31 +0100 (BST)
	(envelope-from brian@hak.lan.Awfulhak.org)
Message-Id: <200009241026.e8OAQVx26206@hak.lan.Awfulhak.org>
X-Mailer: exmh version 2.1.1 10/15/1999
To: Greg Lehey <grog@wantadilla.lemis.com>
Cc: Chuck Paterson <cp@bsdi.com>, Archie Cobbs <archie@whistle.com>,
	Brian Somers <brian@Awfulhak.org>,
	Joerg Micheel <joerg@cs.waikato.ac.nz>,
	Matthew Jacob <mjacob@feral.com>, Frank Mayhar <frank@exit.com>,
	John Baldwin <jhb@pike.osd.bsdi.com>,
	Mark Murray <markm@freebsd.org>, FreeBSD-arch@freebsd.org,
	brian@Awfulhak.org
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files src/sys/sys random.h src/sys/dev/randomdev hash.c hash.h harvest.c randomdev.c yarrow.c yarro) 
In-Reply-To: Message from Greg Lehey <grog@wantadilla.lemis.com> 
   of "Sun, 24 Sep 2000 15:42:16 +0930." <20000924154216.D512@wantadilla.lemis.com> 
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Sun, 24 Sep 2000 11:26:31 +0100
From: Brian Somers <brian@Awfulhak.org>
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> 1.  Because "mutexes" (I really hate this term; I wish I could find a
>     better one) only have an implied count of one, they can also have
>     the concept of an owner, which we use.
> 
> 2.  Because the mutex has an owner, only the owner can release it.
> 
> 3.  The mutex can also be "recursive" (it's really iterative, I
>     suppose): the owner can take it several times.  The only reason
>     for this appears to be sloppy coding, but in the short term I
>     think we're agreed that we can't dispose of that.

I agree - the idea of recursive mutices is evil and should go, but the
idea of an owner should not.  It's nice to be able to write code that 
KASSERTs that it already owns a given mutex.
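
A minimal user-space sketch of the ownership check, not the kernel's
mutex code.  The names toy_mtx, toy_try_enter, toy_exit, toy_owned and
counter_bump are made up for illustration, a thread is modelled as a
plain integer id, and no real atomicity is provided:

```c
#include <assert.h>

#define TOY_NOOWNER (-1)

/* Hypothetical toy mutex: it only records which thread id holds it. */
struct toy_mtx {
	int owner;	/* TOY_NOOWNER when free */
};

void
toy_init(struct toy_mtx *m)
{
	m->owner = TOY_NOOWNER;
}

/* Returns 1 on success, 0 if the mutex is already held by anyone. */
int
toy_try_enter(struct toy_mtx *m, int tid)
{
	if (m->owner != TOY_NOOWNER)
		return (0);
	m->owner = tid;
	return (1);
}

/* Only the owner may release; returns 0 if tid is not the owner. */
int
toy_exit(struct toy_mtx *m, int tid)
{
	if (m->owner != tid)
		return (0);
	m->owner = TOY_NOOWNER;
	return (1);
}

int
toy_owned(const struct toy_mtx *m, int tid)
{
	return (m->owner == tid);
}

/* Requires the caller to hold the lock and asserts it, KASSERT-style. */
int
counter_bump(struct toy_mtx *m, int tid, int *counter)
{
	assert(toy_owned(m, tid));
	return (++*counter);
}
```

Without an owner field, counter_bump() could at best assert that
somebody holds the lock, not that its caller does.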

> Greg
> --
> Finger grog@lemis.com for PGP public key
> See complete headers for address and phone numbers

-- 
Brian <brian@Awfulhak.org>                        <brian@[uk.]FreeBSD.org>
      <http://www.Awfulhak.org>                   <brian@[uk.]OpenBSD.org>
Don't _EVER_ lose your sense of humour !


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Sun Sep 24  4:50:41 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from flood.ping.uio.no (flood.ping.uio.no [129.240.78.31])
	by hub.freebsd.org (Postfix) with ESMTP id EF88B37B422
	for <arch@FreeBSD.ORG>; Sun, 24 Sep 2000 04:50:32 -0700 (PDT)
Received: (from des@localhost)
	by flood.ping.uio.no (8.9.3/8.9.3) id NAA48203;
	Sun, 24 Sep 2000 13:50:27 +0200 (CEST)
	(envelope-from des@ofug.org)
X-URL: http://www.ofug.org/~des/
X-Disclaimer: The views expressed in this message do not necessarily
  coincide with those of any organisation or company with
  which I am or have been affiliated.
To: Barry Pederson <bpederson@geocities.com>
Cc: arch@FreeBSD.ORG
Subject: Re: Snapshots in the Fast Filesystem
References: <200007060342.UAA23667@beastie.mckusick.com> <39CD0C1B.324AA1C5@geocities.com>
From: Dag-Erling Smorgrav <des@ofug.org>
Date: 24 Sep 2000 13:50:26 +0200
In-Reply-To: Barry Pederson's message of "Sat, 23 Sep 2000 15:01:31 -0500"
Message-ID: <xzp7l826mbh.fsf@flood.ping.uio.no>
Lines: 14
User-Agent: Gnus/5.0802 (Gnus v5.8.2) Emacs/20.4
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Barry Pederson <bpederson@geocities.com> writes:
> Kirk gives the example of mounting a snapshot by using a 'vn0c' device -
> I was wondering if the 'c' part of those device names is significant? 

Yes. These files are raw FS images, not labeled slices, so the only
existing partition is 'c'.

> Could you mount additional snapshots using 'vn0a', 'vn0b' and so on?

No. One file, one device.

DES
-- 
Dag-Erling Smorgrav - des@ofug.org




From owner-freebsd-arch  Sun Sep 24  9:38:16 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from feral.com (feral.com [192.67.166.1])
	by hub.freebsd.org (Postfix) with ESMTP
	id 3958A37B424; Sun, 24 Sep 2000 09:38:14 -0700 (PDT)
Received: from bird (bird.feral.com [192.67.166.155])
	by feral.com (8.9.3/8.9.3) with ESMTP id JAA19290;
	Sun, 24 Sep 2000 09:37:30 -0700
Date: Sun, 24 Sep 2000 09:37:27 -0700 (PDT)
From: Matthew Jacob <mjacob@feral.com>
Reply-To: mjacob@feral.com
To: Brian Somers <brian@Awfulhak.org>
Cc: Greg Lehey <grog@wantadilla.lemis.com>,
	Chuck Paterson <cp@bsdi.com>, Archie Cobbs <archie@whistle.com>,
	Joerg Micheel <joerg@cs.waikato.ac.nz>,
	Frank Mayhar <frank@exit.com>, John Baldwin <jhb@pike.osd.bsdi.com>,
	Mark Murray <markm@freebsd.org>, FreeBSD-arch@freebsd.org
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files
 src/sys/sys random.h src/sys/dev/randomdev hash.c hash.h harvest.c randomdev.c
 yarrow.c yarro) 
In-Reply-To: <200009241026.e8OAQVx26206@hak.lan.Awfulhak.org>
Message-ID: <Pine.GSO.4.21.0009240936050.8550-100000@bird.feral.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


> I agree - the idea of recursive mutices is evil and should go, but the
> idea of an owner should not.  It's nice to be able to write code that 
> KASSERTs that it already owns a given mutex.

I'm not sure I agree. Having lived through Solaris hell with recursive mutex
panics, I rather like the BSD/OS approach.

Yes, it possibly allows for sloppy coding. If you get rid of this, though, you
can extend the switchover and pain for SMP by at least a year.

-matt




From owner-freebsd-arch  Sun Sep 24 10:33: 8 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP
	id 8012D37B422; Sun, 24 Sep 2000 10:33:03 -0700 (PDT)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id e8OHX3p27873;
	Sun, 24 Sep 2000 10:33:03 -0700 (PDT)
Date: Sun, 24 Sep 2000 10:33:03 -0700
From: Alfred Perlstein <bright@wintelcom.net>
To: arch@freebsd.org
Cc: cp@freebsd.org, bmilekic@freebsd.org
Subject: need advice, fsetown annoyances and mpsafeness.
Message-ID: <20000924103303.M9141@fw.wintelcom.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.4i
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Take this scenario into account:

1) one sets a socket S for SIGIO delivery on an event to pid N
2) N exits
3) eventually, before 'S' is destroyed, a second process happens to
   get pid N.
4) an event happens on 'S' and the wrong process 'N' is notified.

Well this isn't possible in FreeBSD because we hang a struct sigio
off of the object that is going to be delivering signals as well
as the struct proc/pgrp that is to receive them.

When the proc/pgrp is destroyed funsetownlst() is called on the
list of sigio structs hanging from the proc/pgrp.

What it then does is walk through the sigio structs hung from itself
and using a back-pointer that points to the pointer within the
object (socket/tty) it raises splhigh and NULLs it out, lowers spl,
then frees the sigio.

	s = splhigh();
	*(sigio->sio_myref) = NULL;
	splx(s);

If an object is destroyed it is responsible for freeing the attached
sigio struct in nearly the same way... raising splhigh and delinking
itself from the list of sigios attached to the proc/pgrp.

This is a problem because it's pretty complicated to make mpsafe.

Solutions come to mind:

1) embedding the sigio within the object.
   problems: structure bloat, not really sure if it helps
2) removing the burden of sigio destruction from the proc/pgrp destruction
   routines; instead the proc can just walk the sigios and set a
   flag such that the sigio is not to be delivered.  It is
   then entirely up to the object (socket/tty) to free() the sigio.

   the sigio linked list manipulation can be hinged off the process
   mutex we will need to add to the proc and pgrp structures.

   if a sigio is going to be changed you must acquire the proc/pgrp lock
   of the process/group you are removing the structure from before
   doing the unlinking and change otherwise you race against process
   exit.
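
Option 2 can be sketched in user space roughly as follows.  This is an
illustration under stated assumptions, not kernel code: toy_sigio,
toy_proc and the toy_* functions are hypothetical stand-ins, the list
is hand-rolled instead of using queue(3), and the proc/pgrp mutex that
would protect the list is left out (comments mark where it would be
held):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct sigio. */
struct toy_sigio {
	struct toy_sigio *next;	/* link on the owning process's list */
	int pid;		/* target process */
	int dead;		/* set at process exit: do not deliver */
};

struct toy_proc {
	int pid;
	struct toy_sigio *sigios;
};

/* Object side: allocate a sigio and hang it off the process. */
struct toy_sigio *
toy_fsetown(struct toy_proc *p)
{
	struct toy_sigio *sio = malloc(sizeof(*sio));

	if (sio == NULL)
		return (NULL);
	sio->pid = p->pid;
	sio->dead = 0;
	sio->next = p->sigios;	/* proc lock would be held here */
	p->sigios = sio;
	return (sio);
}

/* Process exit: flag and unlink every sigio, but free nothing. */
void
toy_proc_exit(struct toy_proc *p)
{
	struct toy_sigio *sio;

	while ((sio = p->sigios) != NULL) {	/* proc lock held */
		p->sigios = sio->next;
		sio->next = NULL;
		sio->dead = 1;
	}
}

/* A dead sigio never signals, even if the pid has been reused. */
int
toy_should_deliver(const struct toy_sigio *sio)
{
	return (sio != NULL && !sio->dead);
}

/* Object teardown: the object alone frees the sigio. */
void
toy_funsetown(struct toy_proc *p, struct toy_sigio *sio)
{
	struct toy_sigio **pp;

	if (sio == NULL)
		return;
	if (!sio->dead && p != NULL) {	/* proc lock held while unlinking */
		for (pp = &p->sigios; *pp != NULL; pp = &(*pp)->next) {
			if (*pp == sio) {
				*pp = sio->next;
				break;
			}
		}
	}
	free(sio);
}
```

The point of the split is that only one side ever calls free(), so the
exiting process never has to reach into the socket/tty.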

Option 2 seems a lot clearer to me, and it also seems to address all
the problems here without any hackish solution.

I'm going to be investigating the BSD/OS way of handling this, but
at a glance it seems that they don't account for pid wraparound.

Questions?  Comments?

I'm really looking for either encouragement (hey, '2' looks cool),
alternate locking suggestions, or a redesign of the sigio mechanism
that is more mpsafe.

Is anyone else starting to hate monolithic kernel design? :)

thanks,
-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."




From owner-freebsd-arch  Sun Sep 24 11: 8:32 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from prism.flugsvamp.com (cb58709-a.mdsn1.wi.home.com [24.17.241.9])
	by hub.freebsd.org (Postfix) with ESMTP
	id CFE8737B424; Sun, 24 Sep 2000 11:08:29 -0700 (PDT)
Received: (from jlemon@localhost)
	by prism.flugsvamp.com (8.11.0/8.11.0) id e8OI8fi18906;
	Sun, 24 Sep 2000 13:08:41 -0500 (CDT)
	(envelope-from jlemon)
Date: Sun, 24 Sep 2000 13:08:41 -0500
From: Jonathan Lemon <jlemon@flugsvamp.com>
To: Alfred Perlstein <bright@wintelcom.net>
Cc: arch@FreeBSD.ORG, cp@FreeBSD.ORG, bmilekic@FreeBSD.ORG
Subject: Re: need advice, fsetown annoyances and mpsafeness.
Message-ID: <20000924130841.A2487@prism.flugsvamp.com>
References: <20000924103303.M9141@fw.wintelcom.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0pre2i
In-Reply-To: <20000924103303.M9141@fw.wintelcom.net>
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Sun, Sep 24, 2000 at 10:33:03AM -0700, Alfred Perlstein wrote:
> 2) removing the burden of sigio destruction from the proc/pgrp destruction
>    routines; instead the proc can just walk the sigios and set a
>    flag such that the sigio is not to be delivered.  It is
>    then entirely up to the object (socket/tty) to free() the sigio.
> 
>    the sigio linked list manipulation can be hinged off the process
>    mutex we will need to add to the proc and pgrp structures.
> 
>    if a sigio is going to be changed you must acquire the proc/pgrp lock
>    of the process/group you are removing the structure from before
>    doing the unlinking and change otherwise you race against process
>    exit.
> 
> Option 2 seems a lot clearer to me, and it also seems to address all
> the problems here without any hackish solution.
> 
> I'm going to be investigating the BSD/OS way of handling this, but
> at a glance it seems that they don't account for pid wraparound.
> 
> Questions?  Comments?

kqueue has a similar problem, and resolves it in a similar fashion
to the above.  A knote can be attached to a process, which may exit; in
this case, the process just walks down the list and sets a flag, and the
structure is then destroyed when kevent gets around to examining it.
--
Jonathan




From owner-freebsd-arch  Sun Sep 24 11:33:29 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from wall.polstra.com (rtrwan160.accessone.com [206.213.115.74])
	by hub.freebsd.org (Postfix) with ESMTP id B9A1E37B422
	for <arch@freebsd.org>; Sun, 24 Sep 2000 11:33:25 -0700 (PDT)
Received: from vashon.polstra.com (vashon.polstra.com [206.213.73.13])
	by wall.polstra.com (8.9.3/8.9.3) with ESMTP id LAA10553
	for <arch@freebsd.org>; Sun, 24 Sep 2000 11:33:24 -0700 (PDT)
	(envelope-from jdp@polstra.com)
From: John Polstra <jdp@polstra.com>
Received: (from jdp@localhost)
	by vashon.polstra.com (8.9.3/8.9.1) id LAA00463;
	Sun, 24 Sep 2000 11:33:23 -0700 (PDT)
	(envelope-from jdp@polstra.com)
Date: Sun, 24 Sep 2000 11:33:23 -0700 (PDT)
Message-Id: <200009241833.LAA00463@vashon.polstra.com>
To: arch@freebsd.org
Reply-To: arch@freebsd.org
Subject: Re: Mutexes and semaphores
In-Reply-To: <200009241026.e8OAQVx26206@hak.lan.Awfulhak.org>
References: <200009241026.e8OAQVx26206@hak.lan.Awfulhak.org>
Organization: Polstra & Co., Seattle, WA
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

In article <200009241026.e8OAQVx26206@hak.lan.Awfulhak.org>,
Brian Somers  <brian@Awfulhak.org> wrote:
> > 3.  The mutex can also be "recursive" (it's really iterative, I
> >     suppose): the owner can take it several times.  The only reason
> >     for this appears to be sloppy coding, but in the short term I
> >     think we're agreed that we can't dispose of that.
> 
> I agree - the idea of recursive mutices is evil and should go, but the
> idea of an owner should not.  It's nice to be able to write code that 
> KASSERTs that it already owns a given mutex.

I disagree that recursive mutexes are bad, and I don't think "sloppy
coding" is the right way to look at them.  I would argue that
recursive mutexes allow robust code to be written based solely on
knowledge of the immediately surrounding code, and that is a Good
Thing.

There are plenty of reasonable situations where you have a block of
code (say, a function) and a certain mutex needs to be locked while
it executes.  The function might be called from several different
places.  Maybe all of the call sites already hold the mutex, and
maybe they don't.  Maybe it is hard to say for sure.  Maybe new calls
will be added in the future which will add further uncertainty.  With
recursive mutexes you can make the code robust by locking the mutex
inside the called function.  This robustness is certain and it is
independent of what is going on in the rest of the system.

Just look at the traditional kernel with respect to the spl*() calls.
Imagine if it were illegal to call an spl function which would block
one or more interrupts which were already blocked.  That kind of
restriction would make the code much less robust and much harder to
maintain.

There is a place for both recursive and non-recursive mutexes in a
sound and robust design.
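
A minimal sketch of that argument, assuming a toy single-threaded model
rather than any real kernel or pthread API.  The names rmtx,
rmtx_try_enter, rmtx_exit, account_deposit and account_transfer_in are
invented for the example, and a thread is just an integer id:

```c
#include <assert.h>

#define RM_NOOWNER (-1)

/* Hypothetical toy recursive mutex: owner id plus a nesting depth. */
struct rmtx {
	int owner;	/* RM_NOOWNER when free */
	int depth;	/* how many times the owner has entered */
};

void
rmtx_init(struct rmtx *m)
{
	m->owner = RM_NOOWNER;
	m->depth = 0;
}

/* Returns 1 if tid now holds the mutex, 0 if another thread holds it. */
int
rmtx_try_enter(struct rmtx *m, int tid)
{
	if (m->owner != RM_NOOWNER && m->owner != tid)
		return (0);
	m->owner = tid;
	m->depth++;
	return (1);
}

/* Drops one level; the mutex becomes free when depth reaches zero. */
void
rmtx_exit(struct rmtx *m, int tid)
{
	assert(m->owner == tid && m->depth > 0);
	if (--m->depth == 0)
		m->owner = RM_NOOWNER;
}

/* Locks internally, so it is correct whether or not the caller holds m. */
int
account_deposit(struct rmtx *m, int tid, int *balance, int amount)
{
	if (!rmtx_try_enter(m, tid))
		return (0);
	*balance += amount;
	rmtx_exit(m, tid);
	return (1);
}

/* A caller that already holds m and still calls account_deposit(). */
int
account_transfer_in(struct rmtx *m, int tid, int *balance, int a, int b)
{
	int ok;

	if (!rmtx_try_enter(m, tid))
		return (0);
	ok = account_deposit(m, tid, balance, a) &&
	    account_deposit(m, tid, balance, b);
	rmtx_exit(m, tid);
	return (ok);
}
```

account_deposit() never needs to know which of its call sites already
hold the lock, which is the local-reasoning property argued for above.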

John
-- 
  John Polstra                                               jdp@polstra.com
  John D. Polstra & Co., Inc.                        Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa




From owner-freebsd-arch  Sun Sep 24 11:34:14 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from falla.videotron.net (falla.videotron.net [205.151.222.106])
	by hub.freebsd.org (Postfix) with ESMTP
	id D758A37B422; Sun, 24 Sep 2000 11:34:11 -0700 (PDT)
Received: from modemcable136.203-201-24.mtl.mc.videotron.ca ([24.201.203.136])
 by falla.videotron.net (Sun Internet Mail Server sims.3.5.1999.12.14.10.29.p8)
 with ESMTP id <0G1E00MDVM8XXY@falla.videotron.net>; Sun, 24 Sep 2000 14:34:10 -0400 (EDT)
Date: Sun, 24 Sep 2000 14:37:52 -0400 (EDT)
From: Bosko Milekic <bmilekic@technokratis.com>
Subject: Re: need advice, fsetown annoyances and mpsafeness.
In-reply-to: <20000924103303.M9141@fw.wintelcom.net>
To: Alfred Perlstein <bright@wintelcom.net>
Cc: arch@FreeBSD.ORG, cp@FreeBSD.ORG
Message-id: <Pine.BSF.4.21.0009241420280.14398-100000@jehovah.technokratis.com>
MIME-version: 1.0
Content-type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


On Sun, 24 Sep 2000, Alfred Perlstein wrote:

> What it then does is walk through the sigio structs hung from itself
> and using a back-pointer that points to the pointer within the
> object (socket/tty) it raises splhigh and NULLs it out, lowers spl,
> then frees the sigio.
> 
>   	s = splhigh();
> 	*(sigio->sio_myref) = NULL;
> 	splx(s);

	Why can't this be done with an atomic operation? If you're holding
  the sigio struct, then are you not also ensuring that sigio->sio_myref
  won't change? Setting the pointer within the object to NULL should be
  atomic in itself, AFAIK. I'm wondering what would happen if the object is
  destroyed just before you splhigh() up there (in other words, did you
  leave something out of the example you posted above?)
	Assuming something was left out, then I'm wondering if it would be
  profitable in this case to distinguish between the nature of the object
  and optionally provide a pointer to a mutex in the sigio struct which
  should be acquired in order to do this manipulation.

> thanks,
> -- 
> -Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
> "I have the heart of a child; I keep it in a jar on my desk."

  Cheers,

  Bosko Milekic
  bmilekic@technokratis.com




From owner-freebsd-arch  Sun Sep 24 11:45:38 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from critter.freebsd.dk (flutter.freebsd.dk [212.242.40.147])
	by hub.freebsd.org (Postfix) with ESMTP id 222FB37B424
	for <arch@FreeBSD.ORG>; Sun, 24 Sep 2000 11:45:34 -0700 (PDT)
Received: from critter (localhost [127.0.0.1])
	by critter.freebsd.dk (8.11.0/8.9.3) with ESMTP id e8OIjWN31096
	for <arch@FreeBSD.ORG>; Sun, 24 Sep 2000 20:45:32 +0200 (CEST)
	(envelope-from phk@critter.freebsd.dk)
To: arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores 
In-Reply-To: Your message of "Sun, 24 Sep 2000 11:33:23 PDT."
             <200009241833.LAA00463@vashon.polstra.com> 
Date: Sun, 24 Sep 2000 20:45:32 +0200
Message-ID: <31094.969821132@critter>
From: Poul-Henning Kamp <phk@critter.freebsd.dk>
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

In message <200009241833.LAA00463@vashon.polstra.com>, John Polstra writes:

>I disagree that recursive mutexes are bad, and I don't think "sloppy
>coding" is the right way to look at them.  I would argue that
>recursive mutexes allow robust code to be written based solely on
>knowledge of the immediately surrounding code, and that is a Good
>Thing.

<AOL>

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD coreteam member | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.




From owner-freebsd-arch  Sun Sep 24 12:20:17 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from pcnet1.pcnet.com (pcnet1.pcnet.com [204.213.232.3])
	by hub.freebsd.org (Postfix) with ESMTP id 0047837B424
	for <arch@FreeBSD.ORG>; Sun, 24 Sep 2000 12:20:11 -0700 (PDT)
Received: (from eischen@localhost)
	by pcnet1.pcnet.com (8.8.7/PCNet) id PAA08469;
	Sun, 24 Sep 2000 15:19:55 -0400 (EDT)
Date: Sun, 24 Sep 2000 15:19:55 -0400 (EDT)
From: Daniel Eischen <eischen@vigrid.com>
To: arch@FreeBSD.ORG
Cc: arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores
In-Reply-To: <200009241833.LAA00463@vashon.polstra.com>
Message-ID: <Pine.SUN.3.91.1000924151251.7740A-100000@pcnet1.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Sun, 24 Sep 2000, John Polstra wrote:
> In article <200009241026.e8OAQVx26206@hak.lan.Awfulhak.org>,
> Brian Somers  <brian@Awfulhak.org> wrote:
> > > 3.  The mutex can also be "recursive" (it's really iterative, I
> > >     suppose): the owner can take it several times.  The only reason
> > >     for this appears to be sloppy coding, but in the short term I
> > >     think we're agreed that we can't dispose of that.
> > 
> > I agree - the idea of recursive mutices is evil and should go, but the
> > idea of an owner should not.  It's nice to be able to write code that 
> > KASSERTs that it already owns a given mutex.
> 
> I disagree that recursive mutexes are bad, and I don't think "sloppy
> coding" is the right way to look at them.  I would argue that
> recursive mutexes allow robust code to be written based solely on
> knowledge of the immediately surrounding code, and that is a Good
> Thing.
> 
> There are plenty of reasonable situations where you have a block of
> code (say, a function) and a certain mutex needs to be locked while
> it executes.  The function might be called from several different
> places.  Maybe all of the call sites already hold the mutex, and
> maybe they don't.  Maybe it is hard to say for sure.  Maybe new calls
> will be added in the future which will add further uncertainty.  With
> recursive mutexes you can make the code robust by locking the mutex
> inside the called function.  This robustness is certain and it is
> independent of what is going on in the rest of the system.

But you can't then use a recursive mutex in conjunction with msleep
(cv_wait), which forces you to use yet another mutex.  This is fine,
but it adds confusion for the programmer.  Another thing is that
our support for recursive mutexes makes the calling conventions
overly complex (with the silly flag arguments to mtx_enter()).

If we are going to support recursive mutexes, I think it would be
better to add separate calls/macros/data types to support them,
so that the mtx mutexes can be simplified.  Calls to mtx_enter
with the recursive mutex type wouldn't even compile.
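
A rough sketch of what separate types could look like.  Everything here
is hypothetical: plain_mtx, rec_mtx and their enter/exit functions are
invented stand-ins, not the existing mtx interface, and no real locking
is performed:

```c
#include <assert.h>

/* Hypothetical plain (non-recursive) mutex: no flags, no depth. */
struct plain_mtx {
	int held;
};

/* Hypothetical recursive mutex: a separate type with its own calls. */
struct rec_mtx {
	int owner;	/* -1 when free */
	int depth;
};

/* Plain enter: refuses (returns 0) any second acquisition. */
int
plain_enter(struct plain_mtx *m)
{
	if (m->held)
		return (0);
	m->held = 1;
	return (1);
}

void
plain_exit(struct plain_mtx *m)
{
	assert(m->held);
	m->held = 0;
}

/* Recursive enter: the owner may nest; other threads are refused. */
int
rec_enter(struct rec_mtx *m, int tid)
{
	if (m->depth > 0 && m->owner != tid)
		return (0);
	m->owner = tid;
	m->depth++;
	return (1);
}

void
rec_exit(struct rec_mtx *m, int tid)
{
	assert(m->depth > 0 && m->owner == tid);
	if (--m->depth == 0)
		m->owner = -1;
}

/*
 * plain_enter(&some_rec_mtx) would pass a struct rec_mtx * where a
 * struct plain_mtx * is required, which the compiler diagnoses as an
 * incompatible pointer type.
 */
```

With the two families split like this, neither enter call needs a flag
argument to say which behaviour is wanted.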

My $0.02 for what it's worth...

-- 
Dan Eischen




From owner-freebsd-arch  Sun Sep 24 12:50:16 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from berserker.bsdi.com (berserker.twistedbit.com [199.79.183.1])
	by hub.freebsd.org (Postfix) with ESMTP
	id 821F637B422; Sun, 24 Sep 2000 12:50:08 -0700 (PDT)
Received: from berserker.bsdi.com (cp@LOCALHOST [127.0.0.1])
	by berserker.bsdi.com (8.9.3/8.9.3) with ESMTP id NAA25438;
	Sun, 24 Sep 2000 13:48:45 -0600 (MDT)
Message-Id: <200009241948.NAA25438@berserker.bsdi.com>
To: Greg Lehey <grog@wantadilla.lemis.com>
Cc: Archie Cobbs <archie@whistle.com>,
	Brian Somers <brian@awfulhak.org>,
	Joerg Micheel <joerg@cs.waikato.ac.nz>,
	Matthew Jacob <mjacob@feral.com>, Frank Mayhar <frank@exit.com>,
	John Baldwin <jhb@pike.osd.bsdi.com>,
	Mark Murray <markm@freebsd.org>, FreeBSD-arch@freebsd.org
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files src/sys/sys random.h src/sys/dev/randomdev hash.c hash.h harvest.c randomdev.c yarrow.c yarro) 
In-reply-to: Your message of "Sun, 24 Sep 2000 15:42:16 +0930."
             <20000924154216.D512@wantadilla.lemis.com> 
From: Chuck Paterson <cp@bsdi.com>
Date: Sun, 24 Sep 2000 13:48:45 -0600
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


First a general comment. The main reason not to hold a mutex across
an async event is not that it won't work, but that it means
we lose the ability to detect deadlocks.  If process A holds
mutex bar during a wait for an async event, such as msleep(), then it
becomes a requirement that the process which is going to wake up
process A doesn't block on mutex bar, or have any dependencies, even
many steps removed, on something that requires mutex bar.


Greg Lehey wrote on: Sun, 24 Sep 2000 15:42:16 +0930
}On Saturday, 23 September 2000 at 21:02:49 -0600, Chuck Paterson wrote:
}>
}>> Once you have the spin lock primitive, you can easily build
}>> semaphores, sleep queues, etc. A semaphore is just a counter plus
}>> a sleep queue -- all protected by the spin lock.
}>>
}>> A MUTEX is just a semaphore whose initial count is 1.
}>>
}>> ??
}>
}> In general this might be true, but in specific it isn't. 
}
}As you know, I used to say exactly the same thing as Archie, but I've
}realized that this implied count of 1 causes a couple of important
}differences.  I'm still working on a clearer definition, but what I've
}seen so far is:
}
}1.  Because "mutexes" (I really hate this term; I wish I could find a
}    better one) only have an implied count of one, they can also have
}    the concept of an owner, which we use.
}
}2.  Because the mutex has an owner, only the owner can release it.
}
}3.  The mutex can also be "recursive" (it's really iterative, I
}    suppose): the owner can take it several times.  The only reason
}    for this appears to be sloppy coding, but in the short term I
}    think we're agreed that we can't dispose of that.
}

I have to disagree with item 3. Take the simple situation of function
a() needing lock foo and function b() needing lock foo. If b() is
sometimes called from a() and sometimes not, then the recursiveness
of foo is saving state. Otherwise the same state would have to be
passed in explicitly and tested in b() in either case; all that
is really done is providing an automatic way of passing this
state in, and saving a few cycles because we don't have to
set up a variable and pass it in.
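
For comparison, a sketch of the explicit version that a non-recursive
lock would force.  The nlock type and the have_foo parameter are
invented for illustration; the lock is a toy flag, not a real mutex:

```c
#include <assert.h>

/* Hypothetical non-recursive lock: entering twice is a bug. */
struct nlock {
	int held;
};

void
nlock_enter(struct nlock *l)
{
	assert(!l->held);	/* a second enter would deadlock for real */
	l->held = 1;
}

void
nlock_exit(struct nlock *l)
{
	assert(l->held);
	l->held = 0;
}

/* b() needs foo; have_foo says whether the caller already took it. */
int
b(struct nlock *foo, int have_foo, int *shared)
{
	int v;

	if (!have_foo)
		nlock_enter(foo);
	v = ++*shared;
	if (!have_foo)
		nlock_exit(foo);
	return (v);
}

/* a() takes foo itself and must remember to say so when calling b(). */
int
a(struct nlock *foo, int *shared)
{
	int v;

	nlock_enter(foo);
	*shared += 10;
	v = b(foo, 1, shared);
	nlock_exit(foo);
	return (v);
}
```

The have_foo argument carries exactly the state that a recursive foo
would otherwise keep in its own count.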

}One thing that I don't think is important is the duration of
}ownership.  We currently use mutexes for short periods of time, which
}is why we have the spin version.
}
}At Tandem, we only used semaphores, but they always had a count of 1,
}so they were effectively very close to our mutexes.  They didn't allow
}recursion, which is the Right Thing in a system designed from the
}ground up, but they also didn't have owners.  One of the most frequent
}complicated problems we had were system hangs (deadlocks), and we
}frequently couldn't figure out who had done what and why.  Having
}owners is a great debug aid.
}
}> The sleep versions of mutexes have no spin lock. Spin locks are more
}> expensive than the mutices currently in FreeBSD and BSD/OS.  In
}> order to acquire a spin lock, interrupts must be blocked, which
}> isn't the case for mutices which are not contested.
}

}If we can expect that the mutex will, on average, be freed in less
}time than it would take to schedule a new process, spin locks can be a
}better alternative.  Otherwise we wouldn't need them at all.
}

I think the previous paragraph is an oversimplification. In
general, the following is closer to the metric for your suggestion:

POC	percentage of acquisitions which have a conflict
CCS	average cost of context switch
AHT	average hold time
SLS	how much is saved acquiring a sleep lock instead of a spin lock

if ((CCS - (AHT / 2)) * POC > SLS)	use spin lock
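
As a worked sketch of this rule, reading it as
(CCS - AHT / 2) * POC > SLS with POC taken as a fraction between 0 and
1.  The function name prefer_spin_lock is hypothetical and the inputs
only need to share one time unit (for example cycles):

```c
#include <assert.h>

/*
 * Returns 1 if a spin lock looks cheaper under the rule above, else 0.
 * poc: fraction of acquisitions that conflict, in [0, 1]
 * ccs: average cost of a context switch
 * aht: average hold time
 * sls: saving from acquiring a sleep lock instead of a spin lock
 */
int
prefer_spin_lock(double poc, double ccs, double aht, double sls)
{
	return ((ccs - aht / 2.0) * poc > sls);
}
```

With made-up numbers, a 10% conflict rate, a 2000-cycle context switch
and a 200-cycle hold time give (2000 - 100) * 0.10 = 190, which exceeds
a saving of 50 cycles, so the spin lock wins; at a 1% conflict rate the
left side drops to 19 and the sleep lock wins.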


In the future, when we have smarter code for the case where there is
a conflict, the percentage of the time we pay the CCS will drop.

The place where spin locks are required is where a context
switch is not permissible.

}Anyway, this doesn't directly relate to semaphores.  We have the basic
}issue of atomicity, which in general can be handled without spin
}locks, and that would apply to semaphores just as much as to mutexes.
}
}Greg
}--
}Finger grog@lemis.com for PGP public key
}See complete headers for address and phone numbers

Chuck




From owner-freebsd-arch  Sun Sep 24 12:53:25 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP
	id DB4A637B422; Sun, 24 Sep 2000 12:53:13 -0700 (PDT)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id e8OJrCI01203;
	Sun, 24 Sep 2000 12:53:12 -0700 (PDT)
Date: Sun, 24 Sep 2000 12:53:12 -0700
From: Alfred Perlstein <bright@wintelcom.net>
To: Bosko Milekic <bmilekic@technokratis.com>
Cc: arch@FreeBSD.ORG, cp@FreeBSD.ORG
Subject: Re: need advice, fsetown annoyances and mpsafeness.
Message-ID: <20000924125311.Q9141@fw.wintelcom.net>
References: <20000924103303.M9141@fw.wintelcom.net> <Pine.BSF.4.21.0009241420280.14398-100000@jehovah.technokratis.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.4i
In-Reply-To: <Pine.BSF.4.21.0009241420280.14398-100000@jehovah.technokratis.com>; from bmilekic@technokratis.com on Sun, Sep 24, 2000 at 02:37:52PM -0400
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

* Bosko Milekic <bmilekic@technokratis.com> [000924 11:34] wrote:
> 
> 
> On Sun, 24 Sep 2000, Alfred Perlstein wrote:
> 
> > What it then does is walk through the sigio structs hung from itself
> > and using a back-pointer that points to the pointer within the
> > object (socket/tty) it raises splhigh and NULLs it out, lowers spl,
> > then frees the sigio.
> > 
> >   	s = splhigh();
> > 	*(sigio->sio_myref) = NULL;
> > 	splx(s);
> 
> 	Why can't this be done with an atomic operation? If you're holding
>   the sigio struct, then are you not also ensuring that sigio->sio_myref
>   won't change? Setting the pointer within the object to NULL should be
>   atomic in itself, AFAIK. I'm wondering what would happen if the object is
>   destroyed just before you splhigh() up there (in other words, did you
>   leave something out of the example you posted above?)
> 	Assuming something was left out, then I'm wondering if it would be
>   profitable in this case to distinguish between the nature of the object
>   and optionally provide a pointer to a mutex in the sigio struct which
>   should be acquired in order to do this manipulation.

It's really a lot more evil than you think.

The race is in the object (socket/tty) checking the pointer and
then dereferencing it.

A broken solution is to lock the sigio struct or provide a backreference
to the socket/tty lock.  After banging my head against my desk for some
time I came across this solution:

(assuming pfind/pgfind return the proc/pgrp locked)

/*
 * Called by the owner of a sigio struct, such as a tty/socket, to remove
 * a struct sigio from itself.  Called at object destruction or at the
 * time that sigio/sigurg is no longer wanted/needed.
 * It will lock and unlock the proc/pgrp target of the sigio.
 */
void
funsetown_obj(sigio)
	struct sigio *sigio;
{
	pid_t	pid;

	if (sigio == NULL)
		return;

	/*
	 * ok this is somewhat tricky, we examine what the sigio is attached
	 * to, whatever it is proc/pgrp we need to use the search functions
	 * to ensure atomicity.  If the lookup fails (ESRCH) that's ok, that
	 * means we lost the race, just free it.
	 * if we get back a pointer we then need to make sure that the pgid
	 * hasn't been zeroed out because we lost the race between looking
	 * at the sigio and locking the proc/pgrp
	 * (most likely pid/pgid wraparound)
	 */
	pid = sigio->sio_pgid;

	if (pid < 0) {
		struct pgrp *pgrp;

		/* a negative sio_pgid names a process group; look up -pid */
		if ((pgrp = pgfind(-pid)) != NULL) {
			/* funsetown_proc would have set this to zero */
			if (sigio->sio_pgid != 0)
				SLIST_REMOVE(&sigio->sio_pgrp->pg_sigiolst, sigio,
					sigio, sio_pgsigio);
			/* unlock what pgfind locked, not what the sigio names */
			PGRP_UNLOCK(pgrp);
		}
	} else if (pid > 0) {
		struct proc *p;

		if ((p = pfind(pid)) != NULL) {
			/* funsetown_proc would have set this to zero */
			if (sigio->sio_pgid != 0)
				SLIST_REMOVE(&sigio->sio_proc->p_sigiolst, sigio,
					sigio, sio_pgsigio);
			/* unlock what pfind locked, not what the sigio names */
			PROC_UNLOCK(p);
		}
	}

out:
	crfree(sigio->sio_ucred);
	FREE(sigio, M_SIGIO);
}

/*
 * NULL out a sigio struct attached to a process/pgrp
 * must be called with the object (struct proc/pgrp) locked
 * this is to be called from the perspective of the process/pgrp
 *
 * called from the proc/pgid at teardown
 * proc/pgid must be locked
 */
void
funsetown_proc(sigio)
	struct sigio *sigio;
{
	if (sigio == NULL)
		return;
	if (sigio->sio_pgid < 0) {
		SLIST_REMOVE(&sigio->sio_pgrp->pg_sigiolst, sigio,
			     sigio, sio_pgsigio);
	} else /* if (sigio->sio_pgid > 0) */ {
		SLIST_REMOVE(&sigio->sio_proc->p_sigiolst, sigio,
			     sigio, sio_pgsigio);
	}
	sigio->sio_pgid = 0;
}

/*
 * Detach a list of sigio structures from their proc/pgrp.
 *
 * called from the proc/pgid at teardown
 * proc/pgid must be locked
 */
void
funsetownlst(sigiolst)
	struct sigiolst *sigiolst;
{
	struct sigio *sigio;

	while ((sigio = SLIST_FIRST(sigiolst)) != NULL)
		funsetown_proc(sigio);
}


Questions?  Comments?

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Sun Sep 24 13:33:54 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
	by hub.freebsd.org (Postfix) with ESMTP id DD4BE37B424
	for <arch@freebsd.org>; Sun, 24 Sep 2000 13:33:51 -0700 (PDT)
Received: from fledge.watson.org (robert@fledge.pr.watson.org [192.0.2.3])
	by fledge.watson.org (8.9.3/8.9.3) with SMTP id QAA46561;
	Sun, 24 Sep 2000 16:33:33 -0400 (EDT)
	(envelope-from robert@fledge.watson.org)
Date: Sun, 24 Sep 2000 16:33:33 -0400 (EDT)
From: Robert Watson <rwatson@freebsd.org>
X-Sender: robert@fledge.watson.org
To: Barry Pederson <bpederson@geocities.com>
Cc: arch@freebsd.org
Subject: Re: Snapshots in the Fast Filesystem
In-Reply-To: <39CD0C1B.324AA1C5@geocities.com>
Message-ID: <Pine.NEB.3.96L.1000924162624.46412A-100000@fledge.watson.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


On Sat, 23 Sep 2000, Barry Pederson wrote:

> Is there (or will there be) some way to get a list of snapshots that
> have been created on a filesystem?  Kirk suggests following a convention
> for naming snapshot files, but if that doesn't happen for some reason,
> it would be good to have some foolproof way of determining what snaps
> exist.  Otherwise, I suppose you could search a filesystem for files
> that -appear- to be almost as large as the filesystem itself, but that
> seems kind of a kludge - and I don't know if I'd want to trust a script
> to interpret those results correctly.

I won't address the other issues discussed in your email, although I do
have some thoughts on them, but will address this one.  Snapshot files
have the SF_SNAPSHOT file flag set on them -- I believe this is not
cleared by ufs_getattr() and hence is probably exposed via stat().  I'm
not sure our ls -ol output understands the snapshot flag, but a custom
modification to ls, or a manual tool for stat()ing and identifying files
with the flag set sounds like it should work.  That said, I haven't tried
this :-).
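
A rough sketch of such a manual tool follows.  It is untested against
real snapshots and rests on two assumptions: that st_flags carries the
flag through stat() as guessed above, and that the fallback value of
SF_SNAPSHOT matches the one in FreeBSD's <sys/stat.h>.  The function
names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Assumption: fallback value copied from FreeBSD's <sys/stat.h>, used
 * only where the system headers do not provide SF_SNAPSHOT.
 */
#ifndef SF_SNAPSHOT
#define	SF_SNAPSHOT	0x00200000
#endif

/* Return 1 if a st_flags word has the snapshot flag set, else 0. */
int
flags_is_snapshot(unsigned long flags)
{
	return ((flags & SF_SNAPSHOT) != 0);
}

/*
 * Return 1 if path is a snapshot, 0 if it is not, and -1 if it cannot
 * be examined.  st_flags is a BSD extension, so the stat() call is only
 * compiled on FreeBSD; elsewhere this always returns -1.
 */
int
path_is_snapshot(const char *path)
{
	assert(path != NULL);
#ifdef __FreeBSD__
	{
		struct stat sb;

		if (stat(path, &sb) == -1)
			return (-1);
		return (flags_is_snapshot(sb.st_flags));
	}
#else
	return (-1);
#endif
}
```

Walking a file system and calling path_is_snapshot() on each regular
file would give the list of snapshots without relying on a naming
convention.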

Given that snapshots should only be created by privileged users, hopefully
you won't have the opportunity to lose one.  I've been creating my
snapshots under /.snapshot on the file system, matching my /.attribute
file for extended attributes.  In future versions of snapshots, it might
be spiffy to expose mounted snapshots of directories under a .snapshot
directory in each subdirectory, in the style of NetApp.  You can certainly
imagine the current implementation permitting it, given sufficient boredom
on the part of Kirk.

  Robert N M Watson 

robert@fledge.watson.org              http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services




From owner-freebsd-arch  Sun Sep 24 14: 1:58 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from Awfulhak.org (tun.AwfulHak.org [194.242.139.173])
	by hub.freebsd.org (Postfix) with ESMTP
	id E416B37B424; Sun, 24 Sep 2000 14:01:43 -0700 (PDT)
Received: from hak.lan.Awfulhak.org (root@hak.lan.awfulhak.org [172.16.0.12])
	by Awfulhak.org (8.11.0/8.11.0) with ESMTP id e8OKuvC15873;
	Sun, 24 Sep 2000 21:56:57 +0100 (BST)
	(envelope-from brian@hak.lan.Awfulhak.org)
Received: from hak.lan.Awfulhak.org (brian@localhost [127.0.0.1])
	by hak.lan.Awfulhak.org (8.11.0/8.11.0) with ESMTP id e8OKrJx29096;
	Sun, 24 Sep 2000 21:53:19 +0100 (BST)
	(envelope-from brian@hak.lan.Awfulhak.org)
Message-Id: <200009242053.e8OKrJx29096@hak.lan.Awfulhak.org>
X-Mailer: exmh version 2.1.1 10/15/1999
To: mjacob@feral.com
Cc: Brian Somers <brian@Awfulhak.org>,
	Greg Lehey <grog@wantadilla.lemis.com>, Chuck Paterson <cp@bsdi.com>,
	Archie Cobbs <archie@whistle.com>,
	Joerg Micheel <joerg@cs.waikato.ac.nz>,
	Frank Mayhar <frank@exit.com>, John Baldwin <jhb@pike.osd.bsdi.com>,
	Mark Murray <markm@FreeBSD.org>, FreeBSD-arch@FreeBSD.org,
	brian@Awfulhak.org
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files src/sys/sys random.h src/sys/dev/randomdev hash.c hash.h harvest.c randomdev.c yarrow.c yarro) 
In-Reply-To: Message from Matthew Jacob <mjacob@feral.com> 
   of "Sun, 24 Sep 2000 09:37:27 PDT." <Pine.GSO.4.21.0009240936050.8550-100000@bird.feral.com> 
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Sun, 24 Sep 2000 21:53:19 +0100
From: Brian Somers <brian@Awfulhak.org>
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> 
> > I agree - the idea of recursive mutices is evil and should go, but the 
> > idea of an owner should not.  It's nice to be able to write code that 
> > KASSERTs that it already owns a given mutex.
> 
> I'm not sure I agree. Having lived through Solaris hell with recursive mutex
> panics, I rather like the BSD/OS approach.
> 
> Yes, possibly allows for sloppy coding. If you get rid of this, though, you
> can extend the switchover and pain for SMP at least a year.

Maybe a whinge rather than an ASSERT in the mutex code would be more 
appropriate.  I've had recursive mutex panics in Solaris, and it 
meant I was doing something wrong.  A panic was a bit harsh, but it 
still led me to note that I was misusing the kstat stuff and made me 
fix my code - something I wouldn't have done if it wasn't pointed out 
for me.
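
To make the "whinge" idea concrete, here is a userland sketch using a
POSIX error-checking mutex, which tracks its owner and returns EDEADLK
when the owner tries to lock it again.  This is not the kernel mutex
code under discussion; the wmutex names are hypothetical and only the
pthread calls are real.

```c
#ifndef _XOPEN_SOURCE
#define	_XOPEN_SOURCE	700	/* for PTHREAD_MUTEX_ERRORCHECK */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical wrapper: a mutex that complains about recursion. */
struct wmutex {
	pthread_mutex_t	wm_lock;	/* error-checking, so it knows its owner */
	unsigned int	wm_whinges;	/* recursion attempts seen so far */
};

int
wmutex_init(struct wmutex *wm)
{
	pthread_mutexattr_t attr;
	int error;

	assert(wm != NULL);
	wm->wm_whinges = 0;
	if ((error = pthread_mutexattr_init(&attr)) != 0)
		return (error);
	error = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (error == 0)
		error = pthread_mutex_init(&wm->wm_lock, &attr);
	pthread_mutexattr_destroy(&attr);
	return (error);
}

/*
 * Returns 0 once the lock is acquired.  If the caller already owns it,
 * log a complaint and return EDEADLK instead of panicking.
 */
int
wmutex_lock(struct wmutex *wm, const char *where)
{
	int error;

	error = pthread_mutex_lock(&wm->wm_lock);
	if (error == EDEADLK) {
		/* We are the owner here, so touching the counter is safe. */
		wm->wm_whinges++;
		fprintf(stderr, "wmutex: recursive lock attempt at %s\n",
		    where);
	}
	return (error);
}

int
wmutex_unlock(struct wmutex *wm)
{
	return (pthread_mutex_unlock(&wm->wm_lock));
}
```

The caller still has to cope with the EDEADLK return, but the message
points at the offending call site, which is the part that prompts the
fix.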

> -matt

-- 
Brian <brian@Awfulhak.org>                        <brian@[uk.]FreeBSD.org>
      <http://www.Awfulhak.org>                   <brian@[uk.]OpenBSD.org>
Don't _EVER_ lose your sense of humour !




From owner-freebsd-arch  Sun Sep 24 14:18:44 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from peach.ocn.ne.jp (peach.ocn.ne.jp [210.145.254.87])
	by hub.freebsd.org (Postfix) with ESMTP
	id 7EF4F37B424; Sun, 24 Sep 2000 14:18:41 -0700 (PDT)
Received: from newsguy.com (p04-dn01kiryunisiki.gunma.ocn.ne.jp [211.0.245.5])
	by peach.ocn.ne.jp (8.9.1a/OCN/) with ESMTP id GAA24755;
	Mon, 25 Sep 2000 06:18:39 +0900 (JST)
Message-ID: <39CE6F78.DF545ED@newsguy.com>
Date: Mon, 25 Sep 2000 06:17:44 +0900
From: "Daniel C. Sobral" <dcs@newsguy.com>
X-Mailer: Mozilla 4.7 [en] (Win98; I)
X-Accept-Language: en,pt-BR
MIME-Version: 1.0
To: Robert Watson <rwatson@FreeBSD.ORG>
Cc: Barry Pederson <bpederson@geocities.com>, arch@FreeBSD.ORG
Subject: Re: Snapshots in the Fast Filesystem
References: <Pine.NEB.3.96L.1000924162624.46412A-100000@fledge.watson.org>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Robert Watson wrote:
> 
> I won't address the other issues discussed in your email, although I do
> have some thoughts on them, but will address this one.  Snapshot files
> have the SF_SNAPSHOT file flag set on them -- I believe this is not
> cleared by ufs_getattr() and hence is probably exposed via stat().  I'm
> not sure our ls -ol output understands the snapshot flag, but a custom
> modification to ls, or a manual tool for stat()ing and identifying files
> with the flag set sounds like it should work.  That said, I haven't tried
> this :-).

In addition to ls, find could make good use of understanding said flag.

-- 
Daniel C. Sobral			(8-DCS)
dcs@newsguy.com
dcs@freebsd.org
capo@the.secret.bsdconspiracy.net

	"I demand that my picture show a handsome face, even if it doesn't look
like me."




From owner-freebsd-arch  Sun Sep 24 17:44:48 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from feral.com (feral.com [192.67.166.1])
	by hub.freebsd.org (Postfix) with ESMTP
	id 7372F37B424; Sun, 24 Sep 2000 17:44:46 -0700 (PDT)
Received: from bird (bird.feral.com [192.67.166.155])
	by feral.com (8.9.3/8.9.3) with ESMTP id RAA01181;
	Sun, 24 Sep 2000 17:44:17 -0700
Date: Sun, 24 Sep 2000 17:44:17 -0700 (PDT)
From: Matthew Jacob <mjacob@feral.com>
Reply-To: mjacob@feral.com
To: Brian Somers <brian@Awfulhak.org>
Cc: Greg Lehey <grog@wantadilla.lemis.com>,
	Chuck Paterson <cp@bsdi.com>, Archie Cobbs <archie@whistle.com>,
	Joerg Micheel <joerg@cs.waikato.ac.nz>,
	Frank Mayhar <frank@exit.com>, John Baldwin <jhb@pike.osd.bsdi.com>,
	Mark Murray <markm@FreeBSD.ORG>, FreeBSD-arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files
 src/sys/sys random.h src/sys/dev/randomdev hash.c hash.h harvest.c randomdev.c
 yarrow.c yarro) 
In-Reply-To: <200009242053.e8OKrJx29096@hak.lan.Awfulhak.org>
Message-ID: <Pine.GSO.4.21.0009241742260.506-100000@bird.feral.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> 
> Maybe a whinge rather than an ASSERT in the mutex code would be more 
> appropriate.  I've had recursive mutex panics in Solaris, and it 
> meant I was doing something wrong.  A panic was a bit harsh, but it 
> still led me to note that I was misusing the kstat stuff and made me 
> fix my code - something I wouldn't have done if it wasn't pointed out 
> for me.

Sure. And when the network stack and CAM and the VFS layer are re-thought
out to know how to deal with reentrancy, then I'll be happy to have
non-recursive locks.

You're missing the point. If you're on Solaris, you are making a mistake in
your coding if you're recursing. If you're on FreeBSD, then too many things
have still to be redesigned to make that claim.

-matt




From owner-freebsd-arch  Sun Sep 24 20:15:53 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from smtp04.primenet.com (smtp04.primenet.com [206.165.6.134])
	by hub.freebsd.org (Postfix) with ESMTP
	id 0A2E637B422; Sun, 24 Sep 2000 20:15:47 -0700 (PDT)
Received: (from daemon@localhost)
	by smtp04.primenet.com (8.9.3/8.9.3) id UAA10301;
	Sun, 24 Sep 2000 20:13:10 -0700 (MST)
Received: from usr05.primenet.com(206.165.6.205)
 via SMTP by smtp04.primenet.com, id smtpdAAAXaayeu; Sun Sep 24 20:13:08 2000
Received: (from tlambert@localhost)
	by usr05.primenet.com (8.8.5/8.8.5) id UAA04888;
	Sun, 24 Sep 2000 20:15:30 -0700 (MST)
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <200009250315.UAA04888@usr05.primenet.com>
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files src/sys/sys random.h src/sys/dev/randomdev hash.c hash.h harvest
To: cp@bsdi.com (Chuck Paterson)
Date: Mon, 25 Sep 2000 03:15:30 +0000 (GMT)
Cc: grog@wantadilla.lemis.com (Greg Lehey),
	archie@whistle.com (Archie Cobbs), brian@awfulhak.org (Brian Somers),
	joerg@cs.waikato.ac.nz (Joerg Micheel),
	mjacob@feral.com (Matthew Jacob), frank@exit.com (Frank Mayhar),
	jhb@pike.osd.bsdi.com (John Baldwin),
	markm@FreeBSD.ORG (Mark Murray), FreeBSD-arch@FreeBSD.ORG
In-Reply-To: <200009241948.NAA25438@berserker.bsdi.com> from "Chuck Paterson" at Sep 24, 2000 01:48:45 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> First a general comment. The main reason to not hold a mutex across
> an async event is not because it won't work, but because it means
> that we lose the ability to detect deadlocks.  If process A holds
> mutex bar during a wait for async event, such as msleep(), then it
> becomes a requirement that the process which is going to wake up
> process A doesn't block on mutex bar, or have any dependencies, even
> many steps removed, on something that requires mutex bar.

Yes.

The appropriate tool for doing this type of thing is a condition
variable.  The condition is tested under mutex protection.  If
false, the thread blocks on the variable and atomically releases
the mutex.  When the condition is satisfied, the variable is
changed (again, under the protection of the mutex), and one or
more threads waiting on the condition are signalled.  The thread(s)
signalled will attempt to reacquire the mutex, and, when successful,
examine the variable, and take appropriate action, which might be to
go back to sleep, if the condition is no longer satisfied, due to a
lost race.
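
In userland POSIX terms, the pattern described above can be sketched as
follows.  The flagwait names are invented for illustration; only the
pthread calls are real, and this shows the idiom rather than any
proposed kernel interface.

```c
#include <assert.h>
#include <pthread.h>

struct flagwait {
	pthread_mutex_t	fw_lock;	/* protects fw_ready */
	pthread_cond_t	fw_cv;		/* signalled when fw_ready grows */
	int		fw_ready;	/* the condition being waited for */
};

void
flagwait_init(struct flagwait *fw)
{
	assert(fw != NULL);
	pthread_mutex_init(&fw->fw_lock, NULL);
	pthread_cond_init(&fw->fw_cv, NULL);
	fw->fw_ready = 0;
}

/* Block until an event is available, then consume it. */
void
flagwait_wait(struct flagwait *fw)
{
	pthread_mutex_lock(&fw->fw_lock);
	/*
	 * pthread_cond_wait() releases fw_lock atomically while asleep
	 * and reacquires it before returning.  The loop re-tests the
	 * condition because another waiter may have won the race.
	 */
	while (fw->fw_ready == 0)
		pthread_cond_wait(&fw->fw_cv, &fw->fw_lock);
	fw->fw_ready--;
	pthread_mutex_unlock(&fw->fw_lock);
}

/* Change the condition under the mutex and wake one waiter. */
void
flagwait_post(struct flagwait *fw)
{
	pthread_mutex_lock(&fw->fw_lock);
	fw->fw_ready++;
	pthread_cond_signal(&fw->fw_cv);
	pthread_mutex_unlock(&fw->fw_lock);
}

/* Thread body for trying it out: posts a single event. */
void *
flagwait_poster(void *arg)
{
	flagwait_post(arg);
	return (NULL);
}
```

Note that the waiter never sleeps while holding the mutex, so the
deadlock-detection concern in the quoted paragraph does not arise.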

					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.




From owner-freebsd-arch  Sun Sep 24 20:18:49 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from smtp03.primenet.com (smtp03.primenet.com [206.165.6.133])
	by hub.freebsd.org (Postfix) with ESMTP id 4875937B43F
	for <arch@FreeBSD.ORG>; Sun, 24 Sep 2000 20:18:30 -0700 (PDT)
Received: (from daemon@localhost)
	by smtp03.primenet.com (8.9.3/8.9.3) id UAA00568
	for <arch@FreeBSD.ORG>; Sun, 24 Sep 2000 20:17:03 -0700 (MST)
Received: from usr05.primenet.com(206.165.6.205)
 via SMTP by smtp03.primenet.com, id smtpdAAAN1aicb; Sun Sep 24 20:16:51 2000
Received: (from tlambert@localhost)
	by usr05.primenet.com (8.8.5/8.8.5) id UAA04938
	for arch@FreeBSD.ORG; Sun, 24 Sep 2000 20:18:12 -0700 (MST)
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <200009250318.UAA04938@usr05.primenet.com>
Subject: Re: Mutexes and semaphores
To: arch@FreeBSD.ORG
Date: Mon, 25 Sep 2000 03:18:12 +0000 (GMT)
In-Reply-To: <200009241833.LAA00463@vashon.polstra.com> from "John Polstra" at Sep 24, 2000 11:33:23 AM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> There are plenty of reasonable situations where you have a block of
> code (say, a function) and a certain mutex needs to be locked while
> it executes.  The function might be called from several different
> places.  Maybe all of the call sites already hold the mutex, and
> maybe they don't.  Maybe it is hard to say for sure.  Maybe new calls
> will be added in the future which will add further uncertainty.  With
> recursive mutexes you can make the code robust by locking the mutex
> inside the called function.  This robustness is certain and it is
> independent of what is going on in the rest of the system.

This is evil.  You are using a mutex to protect code, when you
should be using it to protect data.  If you want to protect code,
you should use a semaphore, instead.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.




From owner-freebsd-arch  Sun Sep 24 20:31: 0 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from smtp02.primenet.com (smtp02.primenet.com [206.165.6.132])
	by hub.freebsd.org (Postfix) with ESMTP
	id 0207937B424; Sun, 24 Sep 2000 20:30:57 -0700 (PDT)
Received: (from daemon@localhost)
	by smtp02.primenet.com (8.9.3/8.9.3) id UAA16383;
	Sun, 24 Sep 2000 20:27:46 -0700 (MST)
Received: from usr05.primenet.com(206.165.6.205)
 via SMTP by smtp02.primenet.com, id smtpdAAA.TaGOF; Sun Sep 24 20:27:31 2000
Received: (from tlambert@localhost)
	by usr05.primenet.com (8.8.5/8.8.5) id UAA05163;
	Sun, 24 Sep 2000 20:30:08 -0700 (MST)
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <200009250330.UAA05163@usr05.primenet.com>
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files
To: mjacob@feral.com
Date: Mon, 25 Sep 2000 03:30:08 +0000 (GMT)
Cc: brian@Awfulhak.org (Brian Somers),
	grog@wantadilla.lemis.com (Greg Lehey), cp@bsdi.com (Chuck Paterson),
	archie@whistle.com (Archie Cobbs),
	joerg@cs.waikato.ac.nz (Joerg Micheel),
	frank@exit.com (Frank Mayhar), jhb@pike.osd.bsdi.com (John Baldwin),
	markm@FreeBSD.ORG (Mark Murray), FreeBSD-arch@FreeBSD.ORG
In-Reply-To: <Pine.GSO.4.21.0009241742260.506-100000@bird.feral.com> from "Matthew Jacob" at Sep 24, 2000 05:44:17 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> Sure. And when the network stack and CAM and the VFS layer are re-thought
> out to know how to deal with reentrancy, then I'll be happy to have
> non-recursive locks.

This is easy: mark them non-reentrant.  You can either acquire a
mutex on descent into them and release it on exit/sleep, or (and
this is better), have a per-module mutex that's acquired on the
descent/wakeup and released on the ascent, if the flag is present.
This will let the modules be corrected on a per FS and per CAM
driver basis, while maintaining legacy compatibility.  We do not
need another ethnic cleansing of the drivers, such as what we went
through when CAM went in, or when the X.25 and ISODE stuff was
murdered.
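
A minimal userland sketch of the per-module variant, assuming a
hypothetical MOD_NONREENTRANT flag and a hypothetical struct module
(neither exists in the tree), might look like this.  Dropping the lock
around a sleep and retaking it on wakeup is left out for brevity.

```c
#include <assert.h>
#include <pthread.h>

#define	MOD_NONREENTRANT	0x01	/* hypothetical "legacy code" flag */

struct module {
	int		 m_flags;
	pthread_mutex_t	 m_lock;	/* taken only if MOD_NONREENTRANT */
	int		(*m_entry)(void *);
};

/*
 * Call into a module.  Flagged modules are serialized by their own
 * lock on the way down and released on the way back up; converted
 * modules run without it.
 */
int
module_call(struct module *m, void *arg)
{
	int legacy, rv;

	assert(m != NULL && m->m_entry != NULL);
	legacy = (m->m_flags & MOD_NONREENTRANT) != 0;
	if (legacy)
		pthread_mutex_lock(&m->m_lock);
	rv = m->m_entry(arg);
	if (legacy)
		pthread_mutex_unlock(&m->m_lock);
	return (rv);
}

/* Stand-in for a legacy entry point: bumps a counter, returns it. */
int
legacy_entry(void *arg)
{
	int *counter = arg;

	return (++*counter);
}
```

Converting a module then amounts to auditing it and clearing the flag,
which also leaves a visible record of which code has been checked.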


> You're missing the point. If you're on Solaris, you are making a mistake in
> your coding if you're recursing. If you're on FreeBSD, then too many things
> have still to be redesigned to make that claim.

I think he understands that, I just think he's unwilling to live
with a kludge which will have no incentive to be de-kludged, since
it would still appear to work.

It's much better to be able to _know_ what code is OK and what
code isn't, instead of pretending that it's all OK, when it's not.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.




From owner-freebsd-arch  Sun Sep 24 20:39:31 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from feral.com (feral.com [192.67.166.1])
	by hub.freebsd.org (Postfix) with ESMTP
	id 61DF437B424; Sun, 24 Sep 2000 20:39:28 -0700 (PDT)
Received: from bird (bird.feral.com [192.67.166.155])
	by feral.com (8.9.3/8.9.3) with ESMTP id UAA01491;
	Sun, 24 Sep 2000 20:38:56 -0700
Date: Sun, 24 Sep 2000 20:38:56 -0700 (PDT)
From: Matthew Jacob <mjacob@feral.com>
Reply-To: mjacob@feral.com
To: Terry Lambert <tlambert@primenet.com>
Cc: Brian Somers <brian@Awfulhak.org>,
	Greg Lehey <grog@wantadilla.lemis.com>, Chuck Paterson <cp@bsdi.com>,
	Archie Cobbs <archie@whistle.com>,
	Joerg Micheel <joerg@cs.waikato.ac.nz>,
	Frank Mayhar <frank@exit.com>, John Baldwin <jhb@pike.osd.bsdi.com>,
	Mark Murray <markm@FreeBSD.ORG>, FreeBSD-arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files
In-Reply-To: <200009250330.UAA05163@usr05.primenet.com>
Message-ID: <Pine.GSO.4.21.0009242036250.506-100000@bird.feral.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


> 
> This is easy: mark them non-reentrant.  You can either acquire a
> mutex on descent into them and release it on exit/sleep, or (and
> this is better), have a per-module mutex that's acquired on the
> descent/wakeup and released on the ascent, if the flag is present.
> This will let the modules be corrected on a per FS and per CAM
> driver basis, while maintaining legacy compatibility.  We do not
> need another ethnic cleansing of the drivers, such as what we went
> through when CAM went in, or when the X.25 and ISODE stuff was
> murdered.

Hmm, but I sure don't want the pain of the 'unsafe_driver' mutex that Sun went
thru. Still, your point has a lot of merit.

> 
> 
> > You're missing the point. If you're on Solaris, you are making a mistake in
> > your coding if you're recursing. If you're on FreeBSD, then too many things
> > have still to be redesigned to make that claim.
> 
> I think he understands that, I just think he's unwilling to live
> with a kludge which will have no incentive to be de-kludged, since
> it would still appear to work.

Whatever... :-)

> 
> It's much better to be able to _know_ what code is OK and what
> code isn't, instead of pretending that it's all OK, when it's not.

Aw, that's not what I was getting at.

I think getting the current set going should be allowed to proceed as is. If
there is a roadmap for strengthening the semantics, great. Just don't make the
bar too high at first.

-matt




From owner-freebsd-arch  Sun Sep 24 21:12:43 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from smtp04.primenet.com (smtp04.primenet.com [206.165.6.134])
	by hub.freebsd.org (Postfix) with ESMTP
	id 89BC937B422; Sun, 24 Sep 2000 21:12:33 -0700 (PDT)
Received: (from daemon@localhost)
	by smtp04.primenet.com (8.9.3/8.9.3) id UAA06514;
	Sun, 24 Sep 2000 20:00:28 -0700 (MST)
Received: from usr05.primenet.com(206.165.6.205)
 via SMTP by smtp04.primenet.com, id smtpdAAAwnaWqm; Sun Sep 24 20:00:07 2000
Received: (from tlambert@localhost)
	by usr05.primenet.com (8.8.5/8.8.5) id UAA04620;
	Sun, 24 Sep 2000 20:02:13 -0700 (MST)
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <200009250302.UAA04620@usr05.primenet.com>
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files src/sys/sys random.h src/sys/dev/randomdev hash.c hash.h harvest
To: grog@wantadilla.lemis.com (Greg Lehey)
Date: Mon, 25 Sep 2000 03:02:13 +0000 (GMT)
Cc: cp@bsdi.com (Chuck Paterson), archie@whistle.com (Archie Cobbs),
	brian@awfulhak.org (Brian Somers),
	joerg@cs.waikato.ac.nz (Joerg Micheel),
	mjacob@feral.com (Matthew Jacob), frank@exit.com (Frank Mayhar),
	jhb@pike.osd.bsdi.com (John Baldwin),
	markm@FreeBSD.ORG (Mark Murray), FreeBSD-arch@FreeBSD.ORG
In-Reply-To: <20000924154216.D512@wantadilla.lemis.com> from "Greg Lehey" at Sep 24, 2000 03:42:16 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> >> A MUTEX is just a semaphore whose initial count is 1.
> >>
> >> ??
> >
> > In general this might be true, but in specific it isn't. 
> 
> As you know, I used to say exactly the same thing as Archie, but I've
> realized that this implied count of 1 causes a couple of important
> differences.  I'm still working on a clearer definition, but what I've
> seen so far is:
> 
> 1.  Because "mutexes" (I really hate this term; I wish I could find a
>     better one) only have an implied count of one, they can also have
>     the concept of an owner, which we use.
> 
> 2.  Because the mutex has an owner, only the owner can release it.
> 
> 3.  The mutex can also be "recursive" (it's really iterative, I
>     suppose): the owner can take it several times.  The only reason
>     for this appears to be sloppy coding, but in the short term I
>     think we're agreed that we can't dispose of that.
> 
> One thing that I don't think is important is the duration of
> ownership.  We currently use mutexes for short periods of time, which
> is why we have the spin version.

Actually, that's crucial, since it defines the conflict domain;
you can acquire a heavy lock after contending with a spin lock
for the right to acquire the heavy lock.  In most cases, where
the heavy lock is held a short time, you won't have any contention,
and thus can quickly grant the resource.

In the case of a long held resource, the contention domain is such
that the resource is probably contended, and has waiters outstanding;
this means that the release case (and thus the acquisition case)
must be much heavier weight.

I recommend:

<http://www.cl.cam.ac.uk/Research/SRG/netos/pegasus/reports/TR361.txt>
<http://www.wam.umd.edu/whats_new/workshop3.0/common-tools/numerical_comp_guide/ncg_glossary.doc.html>

This second is a rather good glossary, which is duplicated in many
places on the net.


> At Tandem, we only used semaphores, but they always had a count of 1,
> so they were effectively very close to our mutexes.  They didn't allow
> recursion, which is the Right Thing in a system designed from the
> ground up, but they also didn't have owners.  One of the most frequent
> complicated problems we had were system hangs (deadlocks), and we
> frequently couldn't figure out who had done what and why.  Having
> owners is a great debug aid.

I think that we need to be very clear on one thing: you can recurse
on a semaphore, but a true mutex will not permit recursion; it is a
light weight object, and has very little content.  It lacks a recurse
count, and many other attributes of semaphores.

Microsoft actually got this right in Windows, surprisingly.

When you attempt to get a mutex you already hold, you are shooting
yourself in the foot; it means that you didn't track the resource
sufficiently.  Usually this occurs when a mutex is acquired at one
level, and released at another, or worse, when it is acquired in the
wrong place (e.g. a subroutine called several times from a higher
level routine, which should be acquiring the mutex instead).

With recursion disallowed, mutex ownership is implicit: the owner is
simply whoever currently holds the mutex.

In the case of a starvation or deadly embrace deadlock, one need
only get a stack trace of processes currently in the kernel to
determine where the problem lives; however, an owner would make
this rather automatic, and could aid debugging, as you say.  I do
have a problem with this approach, however, since it makes it much
more likely that people will be sloppy, and then wait for deadlocks
to be reported, rather than thinking through their code and ensuring
that deadlocks are not possible in the first place.  The idea that
fixing deadlocks in released code, rather than releasing only code
without deadlocks, is an acceptable approach needs to be discouraged.


> If we can expect that the mutex will, on average, be freed in less
> time than it would take to schedule a new process, spin locks can be a
> better alternative.  Otherwise we wouldn't need them at all.
> 
> Anyway, this doesn't directly relate to semaphores.  We have the basic
> issue of atomicity, which in general can be handled without spin
> locks, and that would apply to semaphores just as much as to mutexes.

The advantage to a test-and-set spin prior to acquisition of a
mutex is that the mutex can be acquired without taking a cache
synchronization hit between processors, which would otherwise be
necessary.  Some cache synchronization events will inevitably
occur, but they will be much less frequent.  The mutexes themselves
can be in non-cached pages, to accomplish this.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.




From owner-freebsd-arch  Sun Sep 24 22:33:34 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from earth.backplane.com (placeholder-dcat-1076843290.broadbandoffice.net [64.47.83.26])
	by hub.freebsd.org (Postfix) with ESMTP id 7FA8437B43C
	for <arch@FreeBSD.ORG>; Sun, 24 Sep 2000 22:33:31 -0700 (PDT)
Received: (from dillon@localhost)
	by earth.backplane.com (8.11.0/8.9.3) id e8P5XKg79352;
	Sun, 24 Sep 2000 22:33:20 -0700 (PDT)
	(envelope-from dillon)
Date: Sun, 24 Sep 2000 22:33:20 -0700 (PDT)
From: Matt Dillon <dillon@earth.backplane.com>
Message-Id: <200009250533.e8P5XKg79352@earth.backplane.com>
To: John Polstra <jdp@polstra.com>
Cc: arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores
References: <200009241026.e8OAQVx26206@hak.lan.Awfulhak.org> <200009241833.LAA00463@vashon.polstra.com>
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


:
:In article <200009241026.e8OAQVx26206@hak.lan.Awfulhak.org>,
:Brian Somers  <brian@Awfulhak.org> wrote:
:> > 3.  The mutex can also be "recursive" (it's really iterative, I
:> >     suppose): the owner can take it several times.  The only reason
:> >     for this appears to be sloppy coding, but in the short term I
:> >     think we're agreed that we can't dispose of that.
:> 
:> I agree - the idea of recursive mutices is evil and should go, but the 
:> idea of an owner should not.  It's nice to be able to write code that 
:> KASSERTs that it already owns a given mutex.
:
:I disagree that recursive mutexes are bad, and I don't think "sloppy
:coding" is the right way to look at them.  I would argue that
:recursive mutexes allow robust code to be written based solely on
:knowledge of the immediately surrounding code, and that is a Good
:Thing.
:
:There are plenty of reasonable situations where you have a block of
:code (say, a function) and a certain mutex needs to be locked while
:it executes.  The function might be called from several different
:places.  Maybe all of the call sites already hold the mutex, and
:maybe they don't.  Maybe it is hard to say for sure.  Maybe new calls
:will be added in the future which will add further uncertainty.  With
:recursive mutexes you can make the code robust by locking the mutex
:inside the called function.  This robustness is certain and it is
:independent of what is going on in the rest of the system.
:
:Just look at the traditional kernel with respect to the spl*() calls.
:...

    I gotta agree with John on this.  Recursive mutexes can be coded
    properly.  The best example of this is when you have a module which
    implements an API, and to simplify the code you want one API function
    to call another in the same module.

    The case where one API function may wish to call another is one that
    occurs quite often in the kernel.  For example, managing ref counts
    on objects.  If you don't have recursive mutexes, then you do not
    have the ability to call your own API recursively (at least not
    without creating a mess).  You are instead forced to split the API
    into a high-level and a low-level piece in order to be able to bypass the
    high-level piece.  Yuch.
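
    For reference, the split that a non-recursive mutex forces looks
    roughly like the userland sketch below.  The obj names are invented
    for illustration; the point is that obj_ref_if_live() cannot simply
    call obj_ref(), because it already holds the lock.

```c
#include <assert.h>
#include <pthread.h>

struct obj {
	pthread_mutex_t	o_lock;		/* non-recursive */
	int		o_refs;		/* protected by o_lock */
};

/* Low-level piece: caller must already hold o_lock. */
void
obj_ref_locked(struct obj *o)
{
	o->o_refs++;
}

/* High-level piece: the public entry point takes the lock itself. */
void
obj_ref(struct obj *o)
{
	assert(o != NULL);
	pthread_mutex_lock(&o->o_lock);
	obj_ref_locked(o);
	pthread_mutex_unlock(&o->o_lock);
}

/*
 * Another public function that needs a reference while holding the
 * lock.  Calling obj_ref() here would self-deadlock, so it has to
 * bypass the high-level piece.  Returns 1 if a reference was taken.
 */
int
obj_ref_if_live(struct obj *o)
{
	int took = 0;

	assert(o != NULL);
	pthread_mutex_lock(&o->o_lock);
	if (o->o_refs > 0) {
		obj_ref_locked(o);
		took = 1;
	}
	pthread_mutex_unlock(&o->o_lock);
	return (took);
}
```

    With a recursive mutex the _locked variant disappears and
    obj_ref_if_live() just calls obj_ref().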

    The syscall API is a good example of what happens when you can't
    call your own API.  For the FreeBSD kernel (and most UNIX kernels
    that I know), it is relatively dangerous for one system call to call
    another system call's entry point.  The inability has created a mess
    out of things like NFS and other code elements that use internal
    descriptors.  The last embedded OS I did allowed system calls to make
    system calls and it was like night and day.  Things like in-kernel
    high-level descriptor use became trivial.

						-Matt


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Sep 25  1:28:37 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from smtp03.primenet.com (smtp03.primenet.com [206.165.6.133])
	by hub.freebsd.org (Postfix) with ESMTP
	id 1E1DF37B424; Mon, 25 Sep 2000 01:28:34 -0700 (PDT)
Received: (from daemon@localhost)
	by smtp03.primenet.com (8.9.3/8.9.3) id BAA05586;
	Mon, 25 Sep 2000 01:26:56 -0700 (MST)
Received: from usr02.primenet.com(206.165.6.202)
 via SMTP by smtp03.primenet.com, id smtpdAAARlaOZk; Mon Sep 25 01:26:46 2000
Received: (from tlambert@localhost)
	by usr02.primenet.com (8.8.5/8.8.5) id BAA12659;
	Mon, 25 Sep 2000 01:27:56 -0700 (MST)
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <200009250827.BAA12659@usr02.primenet.com>
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files
To: mjacob@feral.com
Date: Mon, 25 Sep 2000 08:27:55 +0000 (GMT)
Cc: tlambert@primenet.com (Terry Lambert),
	brian@Awfulhak.org (Brian Somers),
	grog@wantadilla.lemis.com (Greg Lehey), cp@bsdi.com (Chuck Paterson),
	archie@whistle.com (Archie Cobbs),
	joerg@cs.waikato.ac.nz (Joerg Micheel),
	frank@exit.com (Frank Mayhar), jhb@pike.osd.bsdi.com (John Baldwin),
	markm@FreeBSD.ORG (Mark Murray), FreeBSD-arch@FreeBSD.ORG
In-Reply-To: <Pine.GSO.4.21.0009242036250.506-100000@bird.feral.com> from "Matthew Jacob" at Sep 24, 2000 08:38:56 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> > This is easy: mark them non-reentrant.  You can either acquire a
> > mutex on descent into them and release it on exit/sleep, or (and
> > this is better), have a per-module mutex that's acquired on the
> > descent/wakeup and released on the ascent, if the flag is present.
> > This will let the modules be corrected on a per FS and per CAM
> > driver basis, while maintaining legacy compatibility.  We do not
> > need another ethnic cleansing of the drivers, such as what we went
> > through when CAM went in, or when the X.25 and ISODE stuff was
> > murdered.
> 
> Hmm, but I sure don't want the pain of the 'unsafe_driver' mutex that Sun went
> thru. Still, your point has a lot of merit.

One of the best things that UnixWare had going for it was the
ability to continue using legacy drivers, file systems, and
streams stacks, while those components that were reentrant
were capable of giving better performance.

UnixWare on a UP box was capable of 30% better performance,
even after all of the SMP overhead, simply because the
system was mostly reentrant (this with the terrible hit
that the network stack took trying to use ODI drivers).

I think that no matter how you slice it, it has to be possible
for something to be done right, and to tell the difference
between those things that are and aren't reentrant, easily
and unequivocally.

With the suggested mutex recursion (please -- use a counting
semaphore, not a mutex, if you are going to permit recursion!),
the only way would be to instrument the mutex acquisition
macro to whine to the console any time the count increments
after it reaches a value of 1.
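
A rough userland sketch of such instrumentation, assuming a POSIX
recursive mutex in place of the kernel mutex.  The names struct rmtx,
rmtx_init, rmtx_enter, rmtx_exit and RMTX_ENTER are hypothetical and
are not the kernel's interface:

```c
#ifndef _XOPEN_SOURCE
#define	_XOPEN_SOURCE 700	/* expose PTHREAD_MUTEX_RECURSIVE */
#endif
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical recursion-counting wrapper around a POSIX recursive mutex. */
struct rmtx {
	pthread_mutex_t mtx;
	int depth;	/* current nesting; touched only while mtx is held */
	int warnings;	/* number of recursive acquisitions reported */
};

int
rmtx_init(struct rmtx *m)
{
	pthread_mutexattr_t attr;
	int error;

	m->depth = 0;
	m->warnings = 0;
	if ((error = pthread_mutexattr_init(&attr)) != 0)
		return (error);
	error = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
	if (error == 0)
		error = pthread_mutex_init(&m->mtx, &attr);
	pthread_mutexattr_destroy(&attr);
	return (error);
}

/* Acquire, and whine whenever the count goes past 1. */
void
rmtx_enter(struct rmtx *m, const char *file, int line)
{
	pthread_mutex_lock(&m->mtx);
	if (++m->depth > 1) {
		m->warnings++;
		fprintf(stderr, "%s:%d: mutex %p recursed, depth %d\n",
		    file, line, (void *)m, m->depth);
	}
}

void
rmtx_exit(struct rmtx *m)
{
	assert(m->depth > 0);
	m->depth--;
	pthread_mutex_unlock(&m->mtx);
}

/* The acquisition macro records the call site for the warning. */
#define	RMTX_ENTER(m)	rmtx_enter((m), __FILE__, __LINE__)
```

The first RMTX_ENTER() on a free lock is silent; a nested one by the
same thread prints the call site, which is the part that gives people
a reason to go and fix the caller.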

If you are willing to whine about recursion, then I suppose
that having recursion would not be that bad; but turn off
the whining, and there's little incentive to fix it, since
to many people's minds, it won't be broken.  8-(.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.




From owner-freebsd-arch  Mon Sep 25  3: 4:46 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from pcnet1.pcnet.com (pcnet1.pcnet.com [204.213.232.3])
	by hub.freebsd.org (Postfix) with ESMTP id 4689837B422
	for <arch@FreeBSD.ORG>; Mon, 25 Sep 2000 03:04:43 -0700 (PDT)
Received: (from eischen@localhost)
	by pcnet1.pcnet.com (8.8.7/PCNet) id GAA16555;
	Mon, 25 Sep 2000 06:04:18 -0400 (EDT)
Date: Mon, 25 Sep 2000 06:04:18 -0400 (EDT)
From: Daniel Eischen <eischen@vigrid.com>
To: Matt Dillon <dillon@earth.backplane.com>
Cc: John Polstra <jdp@polstra.com>, arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores
In-Reply-To: <200009250533.e8P5XKg79352@earth.backplane.com>
Message-ID: <Pine.SUN.3.91.1000925055843.15658A-100000@pcnet1.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Sun, 24 Sep 2000, Matt Dillon wrote:
> 
>     I gotta agree with John on this.  Recursive mutexes can be coded
>     properly.  The best example of this is when you have a module which
>     implements an API, and to simplify the code you want one API function
>     to call another in the same module.
> 
>     The case where one API function may wish to call another is one that
>     occurs quite often in the kernel.  For example, managing ref counts
>     on objects.  If you don't have recursive mutexes, then you do not
>     have the ability to call your own API recursively (at least not
>     without creating a mess).  You are instead forced to split the API
>     into a high-level and a low-level piece in order to be able to bypass the
>     high-level piece.  Yuch.
> 
>     The syscall API is a good example of what happens when you can't
>     call your own API.  For the FreeBSD kernel (and most UNIX kernels that I know),
>     it is relatively dangerous for one system call to call another system call's
>     entry point.  The inability has created a mess out of things like NFS and
>     other code elements that use internal descriptors.  The last embedded OS I
>     did allowed system calls to make system calls and it was like night
>     and day.  Things like in-kernel high-level descriptor use became trivial.

Mutexes should protect data.  If you want to allow recursive ownership of
data, then keep your own owner and ref count field in the protected data
and use the mutex properly (release it after setting the owner or 
incrementing the ref count).  You don't need to hold the mutex, and
now you can use the same mutex for msleep/cv_wait.
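
A minimal sketch of that pattern, assuming POSIX threads in userland,
with pthread_cond_wait() standing in for msleep()/cv_wait().  The names
struct res, res_init, res_acquire and res_release are hypothetical:

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical resource whose ownership lives in the protected data. */
struct res {
	pthread_mutex_t mtx;	/* held only briefly, protects the fields below */
	pthread_cond_t cv;	/* waiters sleep here, using the same mutex */
	bool owned;
	pthread_t owner;	/* valid only while owned is true */
	int count;		/* nested acquisitions by the owner */
};

void
res_init(struct res *r)
{
	pthread_mutex_init(&r->mtx, NULL);
	pthread_cond_init(&r->cv, NULL);
	r->owned = false;
	r->count = 0;
}

void
res_acquire(struct res *r)
{
	pthread_t self = pthread_self();

	pthread_mutex_lock(&r->mtx);
	if (r->owned && pthread_equal(r->owner, self)) {
		r->count++;		/* recursive ownership of the data */
	} else {
		while (r->owned)	/* someone else owns it: sleep */
			pthread_cond_wait(&r->cv, &r->mtx);
		r->owned = true;
		r->owner = self;
		r->count = 1;
	}
	/* The mutex is dropped here; it is not held across the caller's work. */
	pthread_mutex_unlock(&r->mtx);
}

void
res_release(struct res *r)
{
	pthread_mutex_lock(&r->mtx);
	if (--r->count == 0) {
		r->owned = false;
		pthread_cond_signal(&r->cv);	/* wake one waiter */
	}
	pthread_mutex_unlock(&r->mtx);
}
```

The mutex itself is never acquired recursively, so sleeping on it is
always legal; the nesting is tracked in the count field instead.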

-- 
Dan Eischen




From owner-freebsd-arch  Mon Sep 25  4:56:33 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from peach.ocn.ne.jp (peach.ocn.ne.jp [210.145.254.87])
	by hub.freebsd.org (Postfix) with ESMTP
	id 8730137B42C; Mon, 25 Sep 2000 04:56:30 -0700 (PDT)
Received: from newsguy.com (p11-dn02kiryunisiki.gunma.ocn.ne.jp [211.0.245.76])
	by peach.ocn.ne.jp (8.9.1a/OCN/) with ESMTP id UAA25291;
	Mon, 25 Sep 2000 20:55:35 +0900 (JST)
Message-ID: <39CF3CFF.47E3E8F6@newsguy.com>
Date: Mon, 25 Sep 2000 20:54:39 +0900
From: "Daniel C. Sobral" <dcs@newsguy.com>
X-Mailer: Mozilla 4.7 [en] (Win98; I)
X-Accept-Language: en,pt-BR
MIME-Version: 1.0
To: Terry Lambert <tlambert@primenet.com>
Cc: Greg Lehey <grog@wantadilla.lemis.com>,
	Chuck Paterson <cp@bsdi.com>, Archie Cobbs <archie@whistle.com>,
	Brian Somers <brian@awfulhak.org>,
	Joerg Micheel <joerg@cs.waikato.ac.nz>,
	Matthew Jacob <mjacob@feral.com>, Frank Mayhar <frank@exit.com>,
	John Baldwin <jhb@pike.osd.bsdi.com>,
	Mark Murray <markm@FreeBSD.ORG>, FreeBSD-arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files 
 src/sys/sys random.h src/sys/dev/randomdev hash.c hash.h harvest
References: <200009250302.UAA04620@usr05.primenet.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Terry Lambert wrote:
> 
> In the case of a starvation or deadly embrace deadlock, one need
> only get a stack trace of processes currently in the kernel to
> determine where the problem lives; however, an owner would make
> this rather automatic, and could aid debugging, as you say.  I do
> have a problem with this approach, however, since it makes it much
> more likely that people will be sloppy, and then wait for deadlocks
> to be reported, rather than thinking through their code and ensuring
> that deadlocks are not possible in the first place.  The idea that

Just in case you haven't noticed, you just defended lack of debugging
aids on the grounds that people will code better in their absence.

Let's take this opportunity and make us completely incompatible with gdb
too. Without gdb, people will have to think much better about their
code, since debugging will be very hard.

-- 
Daniel C. Sobral			(8-DCS)
dcs@newsguy.com
dcs@freebsd.org
capo@the.secret.bsdconspiracy.net

	"I demand that my picture show a handsome face, even if it doesn't look
like me."




From owner-freebsd-arch  Mon Sep 25  8:13:53 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from tinker.exit.com (tinker.exit.com [206.223.0.1])
	by hub.freebsd.org (Postfix) with ESMTP id 0E96D37B422
	for <FreeBSD-arch@FreeBSD.ORG>; Mon, 25 Sep 2000 08:13:51 -0700 (PDT)
Received: from realtime.exit.com (realtime [206.223.0.5])
	by tinker.exit.com (8.11.0/8.11.0) with ESMTP id e8PFDjO08881;
	Mon, 25 Sep 2000 08:13:45 -0700 (PDT)
	(envelope-from frank@exit.com)
Received: (from frank@localhost)
	by realtime.exit.com (8.11.0/8.11.0) id e8PFET802275;
	Mon, 25 Sep 2000 08:14:29 -0700 (PDT)
	(envelope-from frank)
From: Frank Mayhar <frank@exit.com>
Message-Id: <200009251514.e8PFET802275@realtime.exit.com>
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files
In-Reply-To: <200009250827.BAA12659@usr02.primenet.com> from Terry Lambert at
	"Sep 25, 2000 08:27:55 am"
To: Terry Lambert <tlambert@primenet.com>
Date: Mon, 25 Sep 2000 07:59:12 -0700 (PDT)
Cc: mjacob@feral.com, Brian Somers <brian@Awfulhak.org>,
	Greg Lehey <grog@wantadilla.lemis.com>, Chuck Paterson <cp@bsdi.com>,
	Archie Cobbs <archie@whistle.com>,
	Joerg Micheel <joerg@cs.waikato.ac.nz>,
	John Baldwin <jhb@pike.osd.bsdi.com>,
	Mark Murray <markm@FreeBSD.ORG>, FreeBSD-arch@FreeBSD.ORG.ORG
Reply-To: frank@exit.com
Organization: Exit Consulting
X-Copyright0: Copyright 2000 Frank Mayhar.  All Rights Reserved.
X-Copyright1: Permission granted for electronic reproduction as Usenet News or email only.
X-Mailer: ELM [version 2.4ME+ PL68 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Terry Lambert wrote:
> With the suggested mutex recursion (please -- use a counting
> semaphore, not a mutex, if you are going to permit recursion!),

That's basically what it is, more or less.

> If you are willing to whine about recursion, then I suppose
> that having recursion would not be that bad; but turn off
> the whining, and there's little incentive to fix it, since
> to many people's minds, it won't be broken.  8-(.

Well, I can't speak for FreeBSD, but as far as BSD/OS goes, I plan to fix
this stuff.  I cut my teeth on SVR4.2 ES/MP, so I'm not used to recursive
locks anyway, and I quite agree that if the code _needs_ a recursive lock,
there's more going on there and the possibility of deadlocks is high.

My code doesn't use recursive locks.  Yeah, it's more work, but it's well
worth it in the long run.  I think it's a relatively small price to pay for
long-term reliability and for not needing to go back and reexamine everything
down the road a bit.

(I hope this makes sense; I haven't had my coffee yet.  :-/)
-- 
Frank Mayhar frank@exit.com	http://www.exit.com/
Exit Consulting                 http://store.exit.com/




From owner-freebsd-arch  Mon Sep 25  9:16: 7 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from feral.com (feral.com [192.67.166.1])
	by hub.freebsd.org (Postfix) with ESMTP
	id DDC2D37B424; Mon, 25 Sep 2000 09:16:04 -0700 (PDT)
Received: from bird (bird.feral.com [192.67.166.155])
	by feral.com (8.9.3/8.9.3) with ESMTP id JAA03796;
	Mon, 25 Sep 2000 09:15:20 -0700
Date: Mon, 25 Sep 2000 09:15:20 -0700 (PDT)
From: Matthew Jacob <mjacob@feral.com>
Reply-To: mjacob@feral.com
To: "Daniel C. Sobral" <dcs@newsguy.com>
Cc: Terry Lambert <tlambert@primenet.com>,
	Greg Lehey <grog@wantadilla.lemis.com>, Chuck Paterson <cp@bsdi.com>,
	Archie Cobbs <archie@whistle.com>, Brian Somers <brian@awfulhak.org>,
	Joerg Micheel <joerg@cs.waikato.ac.nz>,
	Frank Mayhar <frank@exit.com>, John Baldwin <jhb@pike.osd.bsdi.com>,
	Mark Murray <markm@FreeBSD.ORG>, FreeBSD-arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files 
 src/sys/sys random.h src/sys/dev/randomdev hash.c hash.h harvest
In-Reply-To: <39CF3CFF.47E3E8F6@newsguy.com>
Message-ID: <Pine.GSO.4.21.0009250915020.1592-100000@bird.feral.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> 
> Let's take this opportunity and make us completely incompatible with gdb
> too. Without gdb, people will have to think much better about their
> code, since debugging will be very hard.

Since it usually doesn't work on the alpha, it won't be much of a difference
to me.




From owner-freebsd-arch  Mon Sep 25  9:59:47 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from wall.polstra.com (rtrwan160.accessone.com [206.213.115.74])
	by hub.freebsd.org (Postfix) with ESMTP id 4966C37B42C
	for <arch@freebsd.org>; Mon, 25 Sep 2000 09:59:45 -0700 (PDT)
Received: from vashon.polstra.com (vashon.polstra.com [206.213.73.13])
	by wall.polstra.com (8.9.3/8.9.3) with ESMTP id JAA15818;
	Mon, 25 Sep 2000 09:59:38 -0700 (PDT)
	(envelope-from jdp@polstra.com)
From: John Polstra <jdp@polstra.com>
Received: (from jdp@localhost)
	by vashon.polstra.com (8.9.3/8.9.1) id JAA02227;
	Mon, 25 Sep 2000 09:59:37 -0700 (PDT)
	(envelope-from jdp@polstra.com)
Date: Mon, 25 Sep 2000 09:59:37 -0700 (PDT)
Message-Id: <200009251659.JAA02227@vashon.polstra.com>
To: arch@freebsd.org
Reply-To: arch@freebsd.org
Cc: tlambert@primenet.com
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files
In-Reply-To: <200009250827.BAA12659@usr02.primenet.com>
References: <200009250827.BAA12659@usr02.primenet.com>
Organization: Polstra & Co., Seattle, WA
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

In article <200009250827.BAA12659@usr02.primenet.com>, Terry Lambert
<tlambert@primenet.com> wrote:

> With the suggested mutex recursion (please -- use a counting
> semaphore, not a mutex, if you are going to permit recursion!),

Please explain why you think a counting semaphore has anything to do
with recursion.  To support recursion a mutual exclusion primitive
has to support the concept of ownership.  I.e., if you already own
it you can acquire it recursively, but if somebody else owns it then
you cannot.  A counting semaphore does not support that concept.  The
count is not a recursion count at all.  Search google for "counting
semaphore" and you'll find any number of introductory class notes on
semaphores.  Or cut right to the chase and go to a typical one at

http://www.erc.msstate.edu/~ioana/POWERPOINT/CS4163/slides/Threads/tsld022.htm
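
The difference can be sketched as a small single-threaded model.  This
is an illustration only, with no atomicity and with plain nonzero
integers standing in for thread identities; struct csem, struct rmutex
and their functions are hypothetical names:

```c
#include <stdbool.h>

/* Counting semaphore: just a count of available units, no owner. */
struct csem {
	int count;
};

/* Non-blocking P: succeeds while units remain, whoever is asking. */
bool
csem_try(struct csem *s)
{
	if (s->count == 0)
		return (false);
	s->count--;
	return (true);
}

/* V: any caller may post, whether or not it ever did a P. */
void
csem_post(struct csem *s)
{
	s->count++;
}

/* Recursive mutex: an owner plus a recursion depth. */
struct rmutex {
	int owner;	/* 0 when free, else the holder's (nonzero) id */
	int depth;
};

bool
rmutex_try(struct rmutex *m, int self)
{
	if (m->owner != 0 && m->owner != self)
		return (false);		/* somebody else owns it */
	m->owner = self;
	m->depth++;
	return (true);
}

/* Only the owner may release; returns false for anyone else. */
bool
rmutex_release(struct rmutex *m, int self)
{
	if (m->owner != self || m->depth == 0)
		return (false);
	if (--m->depth == 0)
		m->owner = 0;
	return (true);
}
```

With the semaphore's count at 1, a second csem_try() fails even for
the caller that made the first one, because the semaphore has no idea
who that was.  rmutex_try() succeeds again for the owner and fails for
everybody else.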

John
-- 
  John Polstra                                               jdp@polstra.com
  John D. Polstra & Co., Inc.                        Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa




From owner-freebsd-arch  Mon Sep 25 10: 5:50 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from pike.osd.bsdi.com (pike.osd.bsdi.com [204.216.28.222])
	by hub.freebsd.org (Postfix) with ESMTP id 7462237B424
	for <arch@FreeBSD.ORG>; Mon, 25 Sep 2000 10:05:47 -0700 (PDT)
Received: from foo.osd.bsdi.com (root@foo.osd.bsdi.com [204.216.28.137])
	by pike.osd.bsdi.com (8.11.0/8.9.3) with ESMTP id e8PH5ki40585;
	Mon, 25 Sep 2000 10:05:46 -0700 (PDT)
	(envelope-from jhb@foo.osd.bsdi.com)
Received: (from jhb@localhost)
	by foo.osd.bsdi.com (8.11.0/8.11.0) id e8PH3rn36503;
	Mon, 25 Sep 2000 10:03:53 -0700 (PDT)
	(envelope-from jhb)
Message-ID: <XFMail.000925100353.jhb@FreeBSD.org>
X-Mailer: XFMail 1.4.0 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <Pine.SUN.3.91.1000924151251.7740A-100000@pcnet1.pcnet.com>
Date: Mon, 25 Sep 2000 10:03:53 -0700 (PDT)
Organization: BSD, Inc.
From: John Baldwin <jhb@FreeBSD.ORG>
To: Daniel Eischen <eischen@vigrid.com>
Subject: Re: Mutexes and semaphores
Cc: arch@FreeBSD.ORG
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


On 24-Sep-00 Daniel Eischen wrote:
> On Sun, 24 Sep 2000, John Polstra wrote:
> But you can't then use a recursive mutex in conjunction with msleep
> (cv_wait) which forces you to use yet another mutex.  This is fine,
> but it adds confusion for the programmer.

This is a problem.  However, for one thing we currently have a
KASSERT() that panics if you msleep() on a recursed mutex.  One
could also change msleep() to function like mi_switch() does
with Giant and have it fully release the lock before sleeping, but
this probably would not be a Good Thing.

> Another thing about our support for recursive mutexes is that they
> make the calling conventions overly complex (with the silly flag
> arguments to mtx_enter()).


Uhhh.  With the exception of the mtx_enter() for sched_lock in
mi_switch() that specifies M_RLIKELY, all of the mutex flags
currently in use have _nothing_ to do with recursion.
MTX_DEF/MTX_SPIN are used to distinguish spin locks from sleep
locks.  The use of those flags is another matter for discussion,
but the flags have very, very little to do with recursion.

> If we are going to support recursive mutex, I think it would be
> better to add separate calls/macros/data types to support them,
> so the mtx mutexes can be simplified.  Calls to mtx_enter
> with the recursive mutex type wouldn't even compile.

Err, the recursive nature of the mutexes is very trivial.  It
doesn't affect the complexity of the mutexes at all.  Most of
the "complexity" in the mutex code lies in putting processes to
sleep and waking them back up again for sleep locks, and in the
currently broken and disabled code to propagate a sleeping
process' priority to the process holding the mutex it is waiting
for.

> My $0.02 for what it's worth...
> 
> -- 
> Dan Eischen

-- 

John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.Baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/




From owner-freebsd-arch  Mon Sep 25 10:24:47 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from field.videotron.net (field.videotron.net [205.151.222.108])
	by hub.freebsd.org (Postfix) with ESMTP
	id 84B6137B422; Mon, 25 Sep 2000 10:24:30 -0700 (PDT)
Received: from modemcable136.203-201-24.mtl.mc.videotron.ca ([24.201.203.136])
 by field.videotron.net (Sun Internet Mail Server sims.3.5.1999.12.14.10.29.p8)
 with ESMTP id <0G1G00CEDDOINM@field.videotron.net>; Mon, 25 Sep 2000 13:24:19 -0400 (EDT)
Date: Mon, 25 Sep 2000 13:28:03 -0400 (EDT)
From: Bosko Milekic <bmilekic@technokratis.com>
Subject: Re: need advice, fsetown annoyances and mpsafeness.
In-reply-to: <20000924125311.Q9141@fw.wintelcom.net>
To: Alfred Perlstein <bright@wintelcom.net>
Cc: arch@FreeBSD.ORG, cp@FreeBSD.ORG
Message-id: <Pine.BSF.4.21.0009251314260.15801-100000@jehovah.technokratis.com>
MIME-version: 1.0
Content-type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


On Sun, 24 Sep 2000, Alfred Perlstein wrote:

> It's really a lot more evil than you think.

	Yeah, I noticed after sending the Email.

> The race is in the object (socket/tty) checking the pointer and
> then dereferencing it.
> 
> A broken solution is to lock the sigio struct or provide a backreference
> to the socket/tty lock.  After banging my head against my desk for some
> time I came across this solution:

	This looks somewhat like what you mentioned in (2) in your earlier
  post. The sigio struct will only be freed by the object. I think this is
  a reasonable solution.

> (assuming pfind/pgfind return the proc/pgrp locked)
> 
[...]
> 	/*
> 	 * ok this is somewhat tricky, we examine what the sigio is attached
> 	 * to, whatever it is proc/pgrp we need to use the search functions
> 	 * to ensure atomicity.  If we get back ESRCH that's ok, that means
> 	 * we lost the race, just free it.
> 	 * if we get back a pointer we then need to make sure that the pgid
> 	 * hasn't been NULLed out because we lost the race between looking
> 	 * at the sigio and locking the proc/pgrp
> 	 * (most likely pid/pgid wraparound)
> 	 */
> 	pid = sigio->sio_pgid;
> 
> 	if (pid < 0) {
> 		struct pgrp *pgrp;
> 
> 		/* a negative sio_pgid names a process group; look up -pid */
> 		if ((pgrp = pgfind(-pid)) != NULL) {
> 			/* funsetown_proc would have set this to zero */
> 			if (sigio->sio_pgid != 0)
> 				SLIST_REMOVE(&sigio->sio_pgrp->pg_sigiolst, sigio,
> 					sigio, sio_pgsigio);
> 			PGRP_UNLOCK(&sigio->sio_pgrp);
> 		}
> 	} else if (pid > 0) {
> 		struct proc *p;
> 
> 		if ((p = pfind(pid)) != NULL) {
> 			/* funsetown_proc would have set this to zero */
> 			if (sigio->sio_pgid != 0)
> 				SLIST_REMOVE(&sigio->sio_proc->p_sigiolst, sigio,
> 					sigio, sio_pgsigio);
> 			PROC_UNLOCK(&sigio->sio_proc);
> 		}
> 	}
> 
> out:
> 	crfree(sigio->sio_ucred);
> 	FREE(sigio, M_SIGIO);
> }

	Looks good.

> /*
>  * NULL out a sigio struct attached to a process/pgrp
>  * must be called with the object (struct proc/pgrp) locked
>  * this is to be called from the perspective of the process/pgrp
>  *
>  * called from the proc/pgid at teardown
>  * proc/pgid must be locked
>  */
> void
> funsetown_proc(sigio)
> 	struct sigio *sigio;
> {
> 	int s;
> 
> 	if (sigio == NULL)
> 		return;
> 	if (sigio->sio_pgid < 0) {
> 		SLIST_REMOVE(&sigio->sio_pgrp->pg_sigiolst, sigio,
> 			     sigio, sio_pgsigio);
> 	} else /* if ((*sigiop)->sio_pgid > 0) */ {
> 		SLIST_REMOVE(&sigio->sio_proc->p_sigiolst, sigio,
> 			     sigio, sio_pgsigio);
> 	}
> 	sigio->sio_pgid = 0;
> }
> 
> /*
>  * Free a list of sigio structures.
>  *
>  * called from the proc/pgid at teardown
>  * proc/pgid must be locked
>  */
> void
> funsetownlst(sigiolst)
> 	struct sigiolst *sigiolst;
> {
> 	struct sigio *sigio;
> 
> 	while ((sigio = SLIST_FIRST(sigiolst)) != NULL)
> 		funsetown(sigio);
> }
> 
> 
> Questions?  Comments?

	Question: You don't seem to be protecting the actual sigiolst list
  with a lock. What happens if you've got two different processes
  manipulating the list? Each one may be locked, but regardless, your list
  can still be trashed.

> -- 
> -Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
> "I have the heart of a child; I keep it in a jar on my desk."

  Regards,

  Bosko Milekic
  bmilekic@technokratis.com




From owner-freebsd-arch  Mon Sep 25 10:33:10 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from falla.videotron.net (falla.videotron.net [205.151.222.106])
	by hub.freebsd.org (Postfix) with ESMTP
	id 1ED8B37B422; Mon, 25 Sep 2000 10:33:08 -0700 (PDT)
Received: from modemcable136.203-201-24.mtl.mc.videotron.ca ([24.201.203.136])
 by falla.videotron.net (Sun Internet Mail Server sims.3.5.1999.12.14.10.29.p8)
 with ESMTP id <0G1G00033E320Y@falla.videotron.net>; Mon, 25 Sep 2000 13:33:03 -0400 (EDT)
Date: Mon, 25 Sep 2000 13:36:47 -0400 (EDT)
From: Bosko Milekic <bmilekic@technokratis.com>
Subject: Re: need advice, fsetown annoyances and mpsafeness.
In-reply-to: <Pine.BSF.4.21.0009251314260.15801-100000@jehovah.technokratis.com>
To: Alfred Perlstein <bright@wintelcom.net>
Cc: arch@FreeBSD.ORG, cp@FreeBSD.ORG
Message-id: <Pine.BSF.4.21.0009251335590.15846-100000@jehovah.technokratis.com>
MIME-version: 1.0
Content-type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


On Mon, 25 Sep 2000, Bosko Milekic wrote:

> 	Question: You don't seem to be protecting the actual sigiolst list
>   with a lock. What happens if you've got two different processes
>   manipulating the list? Each one may be locked, but regardless, your list
>   can still be trashed.
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    Nevermind, please disregard.

    *blushes*


  Bosko Milekic
  bmilekic@technokratis.com




From owner-freebsd-arch  Mon Sep 25 10:36:23 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP
	id 415C837B422; Mon, 25 Sep 2000 10:36:22 -0700 (PDT)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id e8PHaKS29850;
	Mon, 25 Sep 2000 10:36:20 -0700 (PDT)
Date: Mon, 25 Sep 2000 10:36:20 -0700
From: Alfred Perlstein <bright@wintelcom.net>
To: Bosko Milekic <bmilekic@technokratis.com>
Cc: arch@FreeBSD.ORG, cp@FreeBSD.ORG
Subject: Re: need advice, fsetown annoyances and mpsafeness.
Message-ID: <20000925103620.W9141@fw.wintelcom.net>
References: <Pine.BSF.4.21.0009251314260.15801-100000@jehovah.technokratis.com> <Pine.BSF.4.21.0009251335590.15846-100000@jehovah.technokratis.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.4i
In-Reply-To: <Pine.BSF.4.21.0009251335590.15846-100000@jehovah.technokratis.com>; from bmilekic@technokratis.com on Mon, Sep 25, 2000 at 01:36:47PM -0400
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

* Bosko Milekic <bmilekic@technokratis.com> [000925 10:33] wrote:
> 
> On Mon, 25 Sep 2000, Bosko Milekic wrote:
> 
> > 	Question: You don't seem to be protecting the actual sigiolst list
> >   with a lock. What happens if you've got two different processes
> >   manipulating the list? Each one may be locked, but regardless, your list
> >   can still be trashed.
>     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
>     Nevermind, please disregard.
> 
>     *blushes*

You understand that it's blocked by the lock on the process/pgrp
right?

:)

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."




From owner-freebsd-arch  Mon Sep 25 11: 0:10 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from pcnet1.pcnet.com (pcnet1.pcnet.com [204.213.232.3])
	by hub.freebsd.org (Postfix) with ESMTP
	id 54B7337B446; Mon, 25 Sep 2000 11:00:03 -0700 (PDT)
Received: (from eischen@localhost)
	by pcnet1.pcnet.com (8.8.7/PCNet) id NAA21201;
	Mon, 25 Sep 2000 13:59:45 -0400 (EDT)
Date: Mon, 25 Sep 2000 13:59:45 -0400 (EDT)
From: Daniel Eischen <eischen@vigrid.com>
To: John Baldwin <jhb@FreeBSD.ORG>
Cc: arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores
In-Reply-To: <XFMail.000925100353.jhb@FreeBSD.org>
Message-ID: <Pine.SUN.3.91.1000925134059.18678A@pcnet1.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Mon, 25 Sep 2000, John Baldwin wrote:
> 
> On 24-Sep-00 Daniel Eischen wrote:
> > On Sun, 24 Sep 2000, John Polstra wrote:
> > But you can't then use a recursive mutex in conjunction with msleep
> > (cv_wait) which forces you to use yet another mutex.  This is fine,
> > but it adds confusion for the programmer.
> 
> This is a problem.  However, for one thing we currently have a
> KASSERT() that panics if you msleep() on a recursed mutex.  One
> could also change msleep() to function like mi_switch() does
> with Giant and have it fully release the lock before sleeping, but
> this probably would not be a Good Thing.

A compile error is much better than a kernel panic.

> 
> > Another thing about our support for recursive mutexes is that they
> > make the calling conventions overly complex (with the silly flag
> > arguments to mtx_enter()).
> 
> 
> Uhhh.  With the exception of the mtx_enter() for sched_lock in
> mi_switch() that specifies M_RLIKELY, all of the mutex flags
> currently in use have _nothing_ to do with recursion.
> MTX_DEF/MTX_SPIN are used to distinguish spin locks from sleep
> locks.  The use of those flags is another matter for discussion,
> but the flags have very, very little to do with recursion.

One of the reasons given for the mutex macros and flags is
that the mutex type/options can be given without having to check the
type/options in the mutex structure.  If this isn't true,
then get rid of the hideous flags to mtx_enter and mtx_exit.
Optimize for a free lock, and take the hit and call a C function
if the lock is held to check the mutex type and do the appropriate
thing.
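
A rough sketch of what that could look like, assuming C11 atomics in
userland rather than the kernel's own primitives.  All of the names
(struct xmtx, xmtx_init, xmtx_enter, xmtx_enter_hard, xmtx_exit) are
hypothetical, sched_yield() is only a stand-in for putting the caller
to sleep, and recursion is deliberately not handled:

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>

enum xmtx_type { XMTX_SPIN, XMTX_SLEEP };

struct xmtx {
	atomic_uintptr_t owner;		/* 0 when free, else the holder's id */
	atomic_ulong contested;		/* times the slow path was taken */
	enum xmtx_type type;		/* consulted only on the slow path */
};

void
xmtx_init(struct xmtx *m, enum xmtx_type type)
{
	atomic_init(&m->owner, 0);
	atomic_init(&m->contested, 0);
	m->type = type;
}

/* Slow path: an ordinary function that looks at the type in the mutex. */
void
xmtx_enter_hard(struct xmtx *m, uintptr_t self)
{
	uintptr_t unowned;

	atomic_fetch_add(&m->contested, 1);
	for (;;) {
		unowned = 0;
		if (atomic_compare_exchange_weak(&m->owner, &unowned, self))
			return;
		if (m->type == XMTX_SLEEP)
			sched_yield();	/* stand-in for blocking the caller */
		/* XMTX_SPIN: retry immediately */
	}
}

/* Fast path: one compare-and-swap, no flags at the call site. */
static inline void
xmtx_enter(struct xmtx *m, uintptr_t self)
{
	uintptr_t unowned = 0;

	if (!atomic_compare_exchange_strong(&m->owner, &unowned, self))
		xmtx_enter_hard(m, self);
}

static inline void
xmtx_exit(struct xmtx *m)
{
	atomic_store(&m->owner, 0);
}
```

The caller passes only the mutex and its own nonzero id.  The
spin-versus-sleep decision is read from the structure, and only when
the lock turns out to be held.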

> 
> > If we are going to support recursive mutex, I think it would be
> > better to add separate calls/macros/data types to support them,
> > so the mtx mutexes can be simplified.  Calls to mtx_enter
> > with the recursive mutex type wouldn't even compile.
> 
> Err, the recursive nature of the mutexes is very trivial.  It
> doesn't affect the complexity of the mutexes at all.  Most of
> the "complexity" in the mutex code lies in putting processes to
> sleep and waking them back up again for sleep locks, and in the
> currently broken and disabled code to propagate a sleeping
> process' priority to the process holding the mutex it is waiting
> for.

I still claim that recursive mutexes should not be supported by
our standard kernel mutex.  If you want to add another set of
data types and functions for recursive mutexes, OK fine.  But
with proper coding techniques, I don't see the need to hold a
mutex after fiddling with whatever data item is being protected.
Take the mutex, set the owner or increase the ref count held
in the data item to be protected, and then release the mutex
either with mtx_exit() or msleep().

-- 
Dan Eischen




From owner-freebsd-arch  Mon Sep 25 12:31:57 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from smtp02.primenet.com (smtp02.primenet.com [206.165.6.132])
	by hub.freebsd.org (Postfix) with ESMTP id 485D337B440
	for <arch@freebsd.org>; Mon, 25 Sep 2000 12:31:30 -0700 (PDT)
Received: (from daemon@localhost)
	by smtp02.primenet.com (8.9.3/8.9.3) id MAA19775;
	Mon, 25 Sep 2000 12:28:37 -0700 (MST)
Received: from usr02.primenet.com(206.165.6.202)
 via SMTP by smtp02.primenet.com, id smtpdAAAimaiAM; Mon Sep 25 12:28:20 2000
Received: (from tlambert@localhost)
	by usr02.primenet.com (8.8.5/8.8.5) id MAA29117;
	Mon, 25 Sep 2000 12:31:09 -0700 (MST)
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <200009251931.MAA29117@usr02.primenet.com>
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files
To: arch@freebsd.org
Date: Mon, 25 Sep 2000 19:31:09 +0000 (GMT)
Cc: tlambert@primenet.com
In-Reply-To: <200009251659.JAA02227@vashon.polstra.com> from "John Polstra" at Sep 25, 2000 09:59:37 AM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> > With the suggested mutex recursion (please -- use a counting
> > semaphore, not a mutex, if you are going to permit recursion!),
> 
> Please explain why you think a counting semaphore has anything to do
> with recursion.  To support recursion a mutual exclusion primitive
> has to support the concept of ownership.  I.e., if you already own
> it you can acquire it recursively, but if somebody else owns it then
> you cannot.  A counting semaphore does not support that concept.  The
> count is not a recursion count at all.  Search google for "counting
> semaphore" and you'll find any number of introductory class notes on
> semaphores.  Or cut right to the chase and go to a typical one at
> 
> http://www.erc.msstate.edu/~ioana/POWERPOINT/CS4163/slides/Threads/tsld022.htm

Recursion should be such an exceptional condition that it should
be implemented with a separate struct and a counting semaphore.

Counting semaphores have owners; mutexes do not.  Therefore they
are a more appropriate primitive.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.




From owner-freebsd-arch  Mon Sep 25 12:38:44 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from smtp05.primenet.com (smtp05.primenet.com [206.165.6.135])
	by hub.freebsd.org (Postfix) with ESMTP
	id BF13E37B424; Mon, 25 Sep 2000 12:38:33 -0700 (PDT)
Received: (from daemon@localhost)
	by smtp05.primenet.com (8.9.3/8.9.3) id MAA08146;
	Mon, 25 Sep 2000 12:38:48 -0700 (MST)
Received: from usr02.primenet.com(206.165.6.202)
 via SMTP by smtp05.primenet.com, id smtpdAAAbdaa0p; Mon Sep 25 12:38:42 2000
Received: (from tlambert@localhost)
	by usr02.primenet.com (8.8.5/8.8.5) id MAA29311;
	Mon, 25 Sep 2000 12:38:23 -0700 (MST)
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <200009251938.MAA29311@usr02.primenet.com>
Subject: Re: Mutexes and semaphores
To: jhb@FreeBSD.ORG (John Baldwin)
Date: Mon, 25 Sep 2000 19:38:22 +0000 (GMT)
Cc: eischen@vigrid.com (Daniel Eischen), arch@FreeBSD.ORG
In-Reply-To: <XFMail.000925100353.jhb@FreeBSD.org> from "John Baldwin" at Sep 25, 2000 10:03:53 AM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> > But you can't then use a recursive mutex in conjunction with msleep
> > (cv_wait) which forces you to use yet another mutex.  This is fine,
> > but it adds confusion for the programmer.
> 
> This is a problem.  However, for one thing we currently have a
> KASSERT() that panics if you msleep() on a recursed mutex.  However,
> one could also change msleep() to function like mi_switch() does
> with Giant and have it fully release the lock before sleeping, but
> this probably would not be a Good Thing.

No.  It would not be a good thing.

Consider that I may be sleeping on the acquisition of the third
out of three mutexes.

> > If we are going to support recursive mutex, I think it would be
> > better to add separate calls/macros/data types to support them,
> > so the mtx mutexes can be simplified.  Calls to mtx_enter
> > with the recursive mutex type wouldn't even compile.
> 
> Err, the recursive nature of the mutexes is very trivial.  It
> doesn't affect the complexity of the mutexes at all.

Yes, it does.  Ownership precludes hand-off.  Recursion support
implies permission and tacit approval.

A mutex is not recursive.  There are things you simply can not
implement when recursion is permitted for all of your primitives.

The most obvious argument is still that a mutex is intended to
protect data, not code.  Recursion is only required if the mutex
is actually protecting reentrancy of code, not access to data.

How would you implement vop_lookup() using a recursing mutex,
considering the ownership handoff which must occur?  You will
need a non-recursing mutex to protect your recursing mutex
during the process of changing the owner (consider an ihash
reclaim during lookup, or ownership of a vnode mutex on a vnode
retrieved from the DNLC).


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.




From owner-freebsd-arch  Mon Sep 25 12:40:31 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from wall.polstra.com (rtrwan160.accessone.com [206.213.115.74])
	by hub.freebsd.org (Postfix) with ESMTP id E0D4037B43C
	for <arch@freebsd.org>; Mon, 25 Sep 2000 12:40:24 -0700 (PDT)
Received: from vashon.polstra.com (vashon.polstra.com [206.213.73.13])
	by wall.polstra.com (8.9.3/8.9.3) with ESMTP id MAA17111;
	Mon, 25 Sep 2000 12:40:21 -0700 (PDT)
	(envelope-from jdp@polstra.com)
From: John Polstra <jdp@polstra.com>
Received: (from jdp@localhost)
	by vashon.polstra.com (8.9.3/8.9.1) id MAA02445;
	Mon, 25 Sep 2000 12:40:21 -0700 (PDT)
	(envelope-from jdp@polstra.com)
Date: Mon, 25 Sep 2000 12:40:21 -0700 (PDT)
Message-Id: <200009251940.MAA02445@vashon.polstra.com>
To: arch@freebsd.org
Reply-To: arch@freebsd.org
Cc: tlambert@primenet.com
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files
In-Reply-To: <200009251931.MAA29117@usr02.primenet.com>
References: <200009251931.MAA29117@usr02.primenet.com>
Organization: Polstra & Co., Seattle, WA
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

In article <200009251931.MAA29117@usr02.primenet.com>,
Terry Lambert  <tlambert@primenet.com> wrote:
> > Search google for "counting
> > semaphore" and you'll find any number of introductory class notes on
> > semaphores.  Or cut right to the chase and go to a typical one at
> > 
> > http://www.erc.msstate.edu/~ioana/POWERPOINT/CS4163/slides/Threads/tsld022.htm
> 
> Recursion should be such an exceptional condition that it should
> be implemented with a separate struct and a counting semaphore.
> 
> Counting semaphores have owners; mutexes do not.  Therefore they
> are a more appropriate primitive.

You are wrong.  Counting semaphores do not keep track of owners.
The count has nothing to do with that at all.  The count holds the
number of available "units" of whatever resource the semaphore is
controlling access to.  It is the number of "P" operations that can
be done without blocking.  That is completely different from the
recursion count of a recursive mutex, which keeps track of the number
of times the current owner has acquired the mutex, and therefore the
number of releases the owner must do before somebody else can acquire
the mutex.
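
The distinction can be sketched in user space with POSIX semaphores and
a pthreads recursive mutex (my choice of API for illustration, not the
kernel's; both helper names are hypothetical). The first helper counts
how many P operations succeed without blocking; the second counts how
deep the owning thread can re-acquire a mutex it already holds.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

/* Number of P operations (sem_trywait) that succeed without blocking on
 * a semaphore initialised to `units`.  No owner is recorded anywhere. */
static int count_nonblocking_p(unsigned units)
{
    sem_t s;
    int taken = 0;

    sem_init(&s, 0, units);
    while (sem_trywait(&s) == 0)
        taken++;
    sem_destroy(&s);
    return taken;
}

/* Number of times the calling thread can acquire a recursive mutex in a
 * row.  Every acquisition succeeds because the caller is the owner, and
 * each one must be matched by a release before others can get in. */
static int recursive_depth(int attempts)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int depth = 0;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&m, &attr);
    pthread_mutexattr_destroy(&attr);

    for (int i = 0; i < attempts; i++)
        if (pthread_mutex_trylock(&m) == 0)
            depth++;
    for (int i = 0; i < depth; i++)
        pthread_mutex_unlock(&m);   /* one release per acquisition */

    pthread_mutex_destroy(&m);
    return depth;
}
```

The semaphore count runs out after `units` takers no matter who they
are; the recursion depth is bounded only by how often the single owner
asks.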

Don't take my word for it.  Do the Google search as I suggested, or go
to the sample URL I gave you, or read any decent book or tutorial on
the subject.  Or, since you've cited Windows as having done it right,
read their documentation on semaphores.

John
-- 
  John Polstra                                               jdp@polstra.com
  John D. Polstra & Co., Inc.                        Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa




From owner-freebsd-arch  Mon Sep 25 13: 6:26 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from smtp05.primenet.com (smtp05.primenet.com [206.165.6.135])
	by hub.freebsd.org (Postfix) with ESMTP id 54FA637B422
	for <arch@freebsd.org>; Mon, 25 Sep 2000 13:06:22 -0700 (PDT)
Received: (from daemon@localhost)
	by smtp05.primenet.com (8.9.3/8.9.3) id NAA18667;
	Mon, 25 Sep 2000 13:06:37 -0700 (MST)
Received: from usr02.primenet.com(206.165.6.202)
 via SMTP by smtp05.primenet.com, id smtpdAAA6QaWzK; Mon Sep 25 13:06:28 2000
Received: (from tlambert@localhost)
	by usr02.primenet.com (8.8.5/8.8.5) id NAA00200;
	Mon, 25 Sep 2000 13:06:09 -0700 (MST)
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <200009252006.NAA00200@usr02.primenet.com>
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files
To: arch@freebsd.org
Date: Mon, 25 Sep 2000 20:06:09 +0000 (GMT)
Cc: tlambert@primenet.com
In-Reply-To: <200009251940.MAA02445@vashon.polstra.com> from "John Polstra" at Sep 25, 2000 12:40:21 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> > Recursion should be such an exceptional condition that it should
> > be implemented with a separate struct and a counting semaphore.
> > 
> > Counting semaphores have owners; mutexes do not.  Therefore they
> > are a more appropriate primitive.
> 
> You are wrong.  Counting semaphores do not keep track of owners.

OK.  Let's be pedantic.  Neither do mutexes.

Counting semaphores are a more appropriate primitive, as the
"resource" which is counted is the ownership capability.  As
others have pointed out (Archie, etc.), a semaphore with a
count of 1 is appropriate.  When the count goes 1->0, then
we can consider that ownership has been relinquished.


> The count has nothing to do with that at all.  The count holds the
> number of available "units" of whatever resource the semaphore is
> controlling access to.  It is the number of "P" operations that can
> be done without blocking.  That is completely different from the
> recursion count of a recursive mutex, which keeps track of the number
> of times the current owner has acquired the mutex, and therefore the
> number of releases the owner must do before somebody else can acquire
> the mutex.

I never stated that the recursion count would be implemented in
the semaphore count of a counting semaphore.  Please read the
first quoted sentence again.  Ownership and recursion are kept
in the separate struct.


> Don't take my word for it.  Do the Google search as I suggested, or go
> to the sample URL I gave you, or read any decent book or tutorial on
> the subject.  Or, since you've cited Windows as having done it right,
> read their documentation on semaphores.

Windows did semaphores right.  Windows did mutexes wrong.  Like
idiots, they permitted recursion.  Since any user space thread
or timer can run on any kernel thread, and the mutex holder
is based on the kernel thread ID, not the higher level context
ID currently mapped to the thread, you can have situations
where you have a resource contended by two user space entities
mapped to a single kernel thread backing object (consider that
FreeBSD will act similarly with N:M threads).

To get around this, you have to implement non-recursing mutexes
using a semaphore of count 1.

Matt Day, Mark Muhlestein, and I ran into this when we implemented
the syncd as a timer outcall when we ported the Heidemann stacking
VFS framework to Windows 95, and implemented soft updates in FFS on
Windows 95, back in 1995-1996.

With recursion-permitted mutexes, people will find themselves
reinventing this for FreeBSD in order to get non-recursing
mutexes for similar situations.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.




From owner-freebsd-arch  Mon Sep 25 13:25: 0 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from wall.polstra.com (rtrwan160.accessone.com [206.213.115.74])
	by hub.freebsd.org (Postfix) with ESMTP id BF33337B422
	for <arch@freebsd.org>; Mon, 25 Sep 2000 13:24:57 -0700 (PDT)
Received: from vashon.polstra.com (vashon.polstra.com [206.213.73.13])
	by wall.polstra.com (8.9.3/8.9.3) with ESMTP id NAA17416;
	Mon, 25 Sep 2000 13:24:54 -0700 (PDT)
	(envelope-from jdp@polstra.com)
From: John Polstra <jdp@polstra.com>
Received: (from jdp@localhost)
	by vashon.polstra.com (8.9.3/8.9.1) id NAA02690;
	Mon, 25 Sep 2000 13:24:54 -0700 (PDT)
	(envelope-from jdp@polstra.com)
Date: Mon, 25 Sep 2000 13:24:54 -0700 (PDT)
Message-Id: <200009252024.NAA02690@vashon.polstra.com>
To: arch@freebsd.org
Reply-To: arch@freebsd.org
Cc: tlambert@primenet.com
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files
In-Reply-To: <200009252006.NAA00200@usr02.primenet.com>
References: <200009252006.NAA00200@usr02.primenet.com>
Organization: Polstra & Co., Seattle, WA
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

In article <200009252006.NAA00200@usr02.primenet.com>,
Terry Lambert  <tlambert@primenet.com> wrote:
> > You are wrong.  Counting semaphores do not keep track of owners.
> 
> OK.  Let's be pedantic.  Neither do mutexes.

I didn't say "mutex", I said "recursive mutex".  Recursive mutexes
do indeed keep track of their owners.

> Counting semaphores are a more appropriate primitive, as the
> "resource" which is counted is the ownership capability.  As
> others have pointed out (Archie, etc.), a semaphore with a
> count of 1 is appropriate.  When the count goes 1->0, then
> we can consider that ownership has been relinquished.

Actually, when the count goes 1->0, ownership has been acquired, not
relinquished.  The count represents the number of available units,
and that is the case in every definition and every implementation of
semaphores I have ever seen (which is quite a few, beginning in the
early 70's).  It's even true in the rather baroque implementation of
semop(3).

> I never stated that the recursion count would be implemented in
> the semaphore count of a counting semaphore.  Please read the
> first quoted sentence again.  Ownership and recursion are kept
> in the separate struct.

Fine, then you don't need a counting semaphore at all, as a simple
non-recursive mutex will do the same job just as well and more
efficiently.

> To get around this, you have to implement non-recursing mutexes
> using a semaphore of count 1.

A semaphore with a count of 1, when used for mutual exclusion,
behaves exactly the same as a simple mutex.  I don't understand why
you brought up counting semaphores at all.

John
-- 
  John Polstra                                               jdp@polstra.com
  John D. Polstra & Co., Inc.                        Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa




From owner-freebsd-arch  Mon Sep 25 14:23:26 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from earth.backplane.com (placeholder-dcat-1076843290.broadbandoffice.net [64.47.83.26])
	by hub.freebsd.org (Postfix) with ESMTP id 43DAE37B42C
	for <arch@FreeBSD.ORG>; Mon, 25 Sep 2000 14:23:22 -0700 (PDT)
Received: (from dillon@localhost)
	by earth.backplane.com (8.11.0/8.9.3) id e8PLN5F84806;
	Mon, 25 Sep 2000 14:23:05 -0700 (PDT)
	(envelope-from dillon)
Date: Mon, 25 Sep 2000 14:23:05 -0700 (PDT)
From: Matt Dillon <dillon@earth.backplane.com>
Message-Id: <200009252123.e8PLN5F84806@earth.backplane.com>
To: Daniel Eischen <eischen@vigrid.com>
Cc: John Polstra <jdp@polstra.com>, arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores
References:  <Pine.SUN.3.91.1000925055843.15658A-100000@pcnet1.pcnet.com>
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

:
:Mutexes should protect data.  If you want to allow recursive ownership of
:data, then keep your own owner and ref count field in the protected data
:and use the mutex properly (release it after setting the owner or 
:incrementing the ref count).  You don't need to hold the mutex, and
:now you can use the same mutex for msleep/cv_wait.
:
:-- 
:Dan Eischen

    Mutexes protect data *CONSISTENCY*, not data.  There is a big difference.
    Probably 95% of the kernel assumes data consistency throughout any given
    routine.  If that routine must call other routines (and most do), then
    you have a major issue to contend with in regards to how to maintain
    consistency across the call.

    There are several ways to deal with it:

	* The subroutine calls are not allowed to block - lots of examples of
	  this in the VM and other subsystems.

	* You use a heavy-weight lock instead of a mutex - an example
	  of this would be the VFS subsystem (vnode locks).

	* You engineer the code to allow data to change out from under
	  it at certain points (such as when something blocks) - probably
	  the best example is vm_fault in the VM subsystem.

    Unfortunately, all but the first can lead to serious bugs.  Consider
    how many bugs have been fixed in the VFS and VM subsystems just in the
    last year that have been related to data consistency issues and you'll
    understand.

    The first issue - not allowing a subroutine call to block, when such a
    case exists, is the perfect place to put a recursive mutex.  If you don't
    use a recursive mutex at that point then you wind up having to 
    reengineer and rewrite big pieces of the code, or you wind up writing
    lots of little tag routines to do end-runs around the mutexes or to
    pass a flag that indicates that the mutex is already held and should
    not be obtained again, and so forth.  

    Remember, I'm not talking about subsystem A calling subsystem B here,
    I'm talking about subsystem A calling itself.  That is, a situation
    where you are not obtaining several different mutexes but are instead
    obtaining the same mutex several times.

    Frankly, fewer bugs will be introduced into the code by avoiding the
    reengineering and using recursive mutexes at appropriate points.
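
    A rough user-space sketch of the "subsystem A calling itself" case,
    using a pthreads recursive mutex as a stand-in for a recursive
    kernel mutex. All subsys_* names are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t subsys_lock;
static int subsys_total;    /* data whose consistency the lock protects */

static void subsys_init(void)
{
    pthread_mutexattr_t attr;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&subsys_lock, &attr);
    pthread_mutexattr_destroy(&attr);
    subsys_total = 0;
}

/* Entry point usable with or without the lock already held. */
static void subsys_add(int n)
{
    pthread_mutex_lock(&subsys_lock);
    subsys_total += n;
    pthread_mutex_unlock(&subsys_lock);
}

/* Entry point that needs a consistent view across its calls back into
 * the same subsystem. */
static int subsys_add_pair(int a, int b)
{
    int result;

    pthread_mutex_lock(&subsys_lock);
    subsys_add(a);      /* re-acquires subsys_lock; a plain mutex */
    subsys_add(b);      /* would self-deadlock here               */
    result = subsys_total;
    pthread_mutex_unlock(&subsys_lock);
    return result;
}
```

    With a non-recursive mutex, subsys_add() would need a second variant
    or a "lock already held" flag to be callable from subsys_add_pair().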

					-Matt




From owner-freebsd-arch  Mon Sep 25 14:39:24 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP id 7AD7537B424
	for <arch@FreeBSD.ORG>; Mon, 25 Sep 2000 14:39:08 -0700 (PDT)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id e8PLcsH08288;
	Mon, 25 Sep 2000 14:38:54 -0700 (PDT)
Date: Mon, 25 Sep 2000 14:38:54 -0700
From: Alfred Perlstein <bright@wintelcom.net>
To: Matt Dillon <dillon@earth.backplane.com>
Cc: Daniel Eischen <eischen@vigrid.com>,
	John Polstra <jdp@polstra.com>, arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores
Message-ID: <20000925143853.J9141@fw.wintelcom.net>
References: <Pine.SUN.3.91.1000925055843.15658A-100000@pcnet1.pcnet.com> <200009252123.e8PLN5F84806@earth.backplane.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.4i
In-Reply-To: <200009252123.e8PLN5F84806@earth.backplane.com>; from dillon@earth.backplane.com on Mon, Sep 25, 2000 at 02:23:05PM -0700
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

* Matt Dillon <dillon@earth.backplane.com> [000925 14:23] wrote:
> :
> :Mutexes should protect data.  If you want to allow recursive ownership of
> :data, then keep your own owner and ref count field in the protected data
> :and use the mutex properly (release it after setting the owner or 
> :incrementing the ref count).  You don't need to hold the mutex, and
> :now you can use the same mutex for msleep/cv_wait.
> :
> :-- 
> :Dan Eischen
> 
>     Mutexes protect data *CONSISTENCY*, not data.  There is a big difference.
>     Probably 95% of the kernel assumes data consistency throughout any given
>     routine.  If that routine must call other routines (and most do), then
>     you have a major issue to contend with in regards to how to maintain
>     consistency across the call.
> 
>     There are several ways to deal with it:
> 
> 	* The subroutine calls are not allowed to block - lots of examples of
> 	  this in the VM and other subsystems.
> 
> 	* You use a heavy-weight lock instead of a mutex - an example
> 	  of this would be the VFS subsystem (vnode locks).
> 
> 	* You engineer the code to allow data to change out from under
> 	  it at certain points (such as when something blocks) - probably
> 	  the best example is vm_fault in the VM subsystem.
> 
>     Unfortunately, all but the first can lead to serious bugs.  Consider
>     how many bugs have been fixed in the VFS and VM subsystems just in the
>     last year that have been related to data consistency issues and you'll
>     understand.
> 
>     The first issue - not allowing a subroutine call to block, when such a
>     case exists, is the perfect place to put a recursive mutex.  If you don't
>     use a recursive mutex at that point then you wind up having to 
>     reengineer and rewrite big pieces of the code, or you wind up writing
>     lots of little tag routines to do end-runs around the mutexes or to
>     pass a flag that indicates that the mutex is already held and should
>     not be obtained again, and so forth.  
> 
>     Remember, I'm not talking about subsystem A calling subsystem B here,
>     I'm talking about subsystem A calling itself.  That is, a situation
>     where you are not obtaining several different mutexes but are instead
>     obtaining the same mutex several times.
> 
>     Frankly, fewer bugs will be introduced into the code by avoiding the
>     reengineering and using recursive mutexes at appropriate points.

What's pissing me off here (not to pick on you Matt) is that there's
honestly a lot of code to be worked on where the locking issues
are pretty simple (especially when you look at how BSD/OS implemented
it).

We should be coding and discussing existing problems with making
the kernel MPsafe instead of what we *might* come across along the
road.  Whatever we bump into we can always beat to a pulp using
lockmgr. :)

And honestly, I don't like the idea of recursive mutexes, I'd rather
have a super function that locks a pgrp like pg_signal_locked/_unlocked
which expects the locks to be held rather than a recursive lock.
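
A small sketch of the locked/unlocked pairing, using a plain pthreads
mutex in user space. The grp_* names are hypothetical and only loosely
modelled on the pg_signal example above.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t grp_lock = PTHREAD_MUTEX_INITIALIZER;
static int grp_signals;     /* protected by grp_lock */

/* Caller must already hold grp_lock. */
static void grp_signal_locked(int n)
{
    grp_signals += n;
}

/* Convenience wrapper for callers that do not hold the lock. */
static void grp_signal(int n)
{
    pthread_mutex_lock(&grp_lock);
    grp_signal_locked(n);
    pthread_mutex_unlock(&grp_lock);
}

/* A caller that needs both updates to appear atomic takes the lock once
 * and uses the _locked variant, so no recursion is needed. */
static int grp_signal_twice(int n)
{
    int seen;

    pthread_mutex_lock(&grp_lock);
    grp_signal_locked(n);
    grp_signal_locked(n);
    seen = grp_signals;
    pthread_mutex_unlock(&grp_lock);
    return seen;
}
```

The cost is one extra function per entry point and a convention the
compiler does not check; the benefit is that the mutex stays
non-recursive.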

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."




From owner-freebsd-arch  Mon Sep 25 15:35:55 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from smtp02.primenet.com (smtp02.primenet.com [206.165.6.132])
	by hub.freebsd.org (Postfix) with ESMTP id 6D2AC37B424
	for <arch@freebsd.org>; Mon, 25 Sep 2000 15:35:52 -0700 (PDT)
Received: (from daemon@localhost)
	by smtp02.primenet.com (8.9.3/8.9.3) id PAA22718;
	Mon, 25 Sep 2000 15:32:57 -0700 (MST)
Received: from usr07.primenet.com(206.165.6.207)
 via SMTP by smtp02.primenet.com, id smtpdAAAc1aqhS; Mon Sep 25 15:32:38 2000
Received: (from tlambert@localhost)
	by usr07.primenet.com (8.8.5/8.8.5) id PAA07367;
	Mon, 25 Sep 2000 15:35:28 -0700 (MST)
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <200009252235.PAA07367@usr07.primenet.com>
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files
To: arch@freebsd.org
Date: Mon, 25 Sep 2000 22:35:28 +0000 (GMT)
Cc: tlambert@primenet.com
In-Reply-To: <200009252024.NAA02690@vashon.polstra.com> from "John Polstra" at Sep 25, 2000 01:24:54 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> > Counting semaphores are a more appropriate primitive, as the
> > "resource" which is counted is the ownership capability.  As
> > others have pointed out (Archie, etc.), a semaphore with a
> > count of 1 is appropriate.  When the count goes 1->0, then
> > we can consider that ownership has been relinquished.
> 
> Actually, when the count goes 1->0, ownership has been acquired, not
> relinquished.  The count represents the number of available units,
> and that is the case in every definition and every implementation of
> semaphores I have ever seen (which is quite a few, beginning in the
> early 70's).  It's even true in the rather baroque implementation of
> semop(3).

Remaining resources vs. acquired resources.  Same difference,
you knew what I meant, which is what mattered.


> Fine, then you don't need a counting semaphore at all, as a simple
> non-recursive mutex will do the same job just as well and more
> efficiently.

Fine.  Then we're agreed: non-recursive mutexes are the base
unit, and recursion will be implemented on a case by case basis
using an additional structure, which contains a non-recursive
mutex, a recursion counter, and an owner field.
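
One way such an additional structure might look, sketched with pthreads
and C11 atomics. The struct rmutex type and its functions are
hypothetical, and the thread identity trick (the address of a
thread-local object) is my assumption, chosen so the owner field fits
in an atomic word.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* The three fields named above, wrapped in one struct. */
struct rmutex {
    pthread_mutex_t   mtx;    /* plain, non-recursive mutex */
    _Atomic uintptr_t owner;  /* identity of the holder, 0 when free */
    unsigned          depth;  /* recursion counter, owner-only access */
};

/* The address of a thread-local object is unique per live thread. */
static _Thread_local char self_tag;

static uintptr_t self_id(void)
{
    return (uintptr_t)&self_tag;
}

static void rmutex_init(struct rmutex *rm)
{
    pthread_mutex_init(&rm->mtx, NULL);
    atomic_init(&rm->owner, 0);
    rm->depth = 0;
}

static void rmutex_lock(struct rmutex *rm)
{
    /* Only this thread can ever store its own id, so a match means we
     * already hold mtx. */
    if (atomic_load(&rm->owner) == self_id()) {
        rm->depth++;
        return;
    }
    pthread_mutex_lock(&rm->mtx);   /* may block until the holder is done */
    atomic_store(&rm->owner, self_id());
    rm->depth = 1;
}

/* Caller must be the owner; a fuller version would assert that. */
static void rmutex_unlock(struct rmutex *rm)
{
    if (--rm->depth == 0) {
        atomic_store(&rm->owner, 0);
        pthread_mutex_unlock(&rm->mtx);
    }
}
```

Code that never recurses can keep using the bare mutex and pays nothing
for the owner and depth bookkeeping.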

Glad that's settled, until the first time a thread migrates between
processors, and we decide we need a semaphore instead of a mutex
as a primitive in order to handle sleeps and wakeups that occur
with a mutex with a recursion count greater than 0, for some ungodly
reason.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.




From owner-freebsd-arch  Mon Sep 25 17:46:22 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from peach.ocn.ne.jp (peach.ocn.ne.jp [210.145.254.87])
	by hub.freebsd.org (Postfix) with ESMTP id E87F237B42C
	for <arch@FreeBSD.ORG>; Mon, 25 Sep 2000 17:46:17 -0700 (PDT)
Received: from newsguy.com (p27-dn03kiryunisiki.gunma.ocn.ne.jp [210.232.224.156])
	by peach.ocn.ne.jp (8.9.1a/OCN/) with ESMTP id JAA23910;
	Tue, 26 Sep 2000 09:46:14 +0900 (JST)
Message-ID: <39CFF19E.CD689985@newsguy.com>
Date: Tue, 26 Sep 2000 09:45:18 +0900
From: "Daniel C. Sobral" <dcs@newsguy.com>
X-Mailer: Mozilla 4.7 [en] (Win98; I)
X-Accept-Language: en,pt-BR
MIME-Version: 1.0
To: Terry Lambert <tlambert@primenet.com>
Cc: arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files
References: <200009252006.NAA00200@usr02.primenet.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Terry Lambert wrote:
> 
> With recursion-permitted mutexes, people will find themselves
> reinventing this for FreeBSD in order to get non-recursing
> mutexes for similar situations.

Err... recursability is an option. With the present code, unless I
understood everything I heard so far completely wrong, you can have a
mutex act in either recursive or non-recursive ways.

-- 
Daniel C. Sobral			(8-DCS)
dcs@newsguy.com
dcs@freebsd.org
capo@the.secret.bsdconspiracy.net

	"I demand that my picture show a handsome face, even if it doesn't look
like me."




From owner-freebsd-arch  Mon Sep 25 19: 9:42 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from wall.polstra.com (rtrwan160.accessone.com [206.213.115.74])
	by hub.freebsd.org (Postfix) with ESMTP id B6FB937B42C
	for <arch@freebsd.org>; Mon, 25 Sep 2000 19:09:34 -0700 (PDT)
Received: from vashon.polstra.com (vashon.polstra.com [206.213.73.13])
	by wall.polstra.com (8.9.3/8.9.3) with ESMTP id TAA19160;
	Mon, 25 Sep 2000 19:09:29 -0700 (PDT)
	(envelope-from jdp@polstra.com)
From: John Polstra <jdp@polstra.com>
Received: (from jdp@localhost)
	by vashon.polstra.com (8.9.3/8.9.1) id TAA03815;
	Mon, 25 Sep 2000 19:09:28 -0700 (PDT)
	(envelope-from jdp@polstra.com)
Date: Mon, 25 Sep 2000 19:09:28 -0700 (PDT)
Message-Id: <200009260209.TAA03815@vashon.polstra.com>
To: arch@freebsd.org
Reply-To: arch@freebsd.org
Cc: tlambert@primenet.com
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files
In-Reply-To: <200009252235.PAA07367@usr07.primenet.com>
References: <200009252235.PAA07367@usr07.primenet.com>
Organization: Polstra & Co., Seattle, WA
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

In article <200009252235.PAA07367@usr07.primenet.com>, Terry Lambert
<tlambert@primenet.com> wrote:

> Fine.  Then we're agreed: non-recursive mutexes are the base unit,
> and recursion will be implemented on a case by case basis using
> an additional structure, which contains a non-recursive mutex, a
> recursion counter, and an owner field.

That's simply a less efficient implementation of a recursive mutex.
Why not use the real thing?

> Glad that's settled, until the first time a thread migrates between
> processors, and we decide we need a semaphore instead of a mutex as
> a primitive in order to handle sleeps and wakeups that occur with
> a mutex with a recursion count greater than 0, for some ungodly
> reason.

Now we're back practically to my original question.  Explain how a
semaphore is going to solve anything here.  I don't think it will
help one bit.  In virtually all cases which require sleeping and
being woken up (whether via a condition variable or a semaphore), the
basic scenario is the same.  Thread A is examining and/or modifying a
shared data structure.  Now he wants to wait until thread B modifies
the data structure and puts it into some desired state.  While A
was examining/modifying the data structure, he necessarily held a
mutex on it in order to get a consistent view.  Before he waits, he
must release that mutex -- otherwise B won't be able to make the
desired modifications.  This is true whether the waiting is done with
a condition variable or with a semaphore.  It really doesn't make much
difference which one you use.  The only difference is that when using
a condition variable the "release mutex and wait" sequence must be
atomic, because a condition variable doesn't "remember" a wakeup that
happened when nobody was waiting yet.  A semaphore does remember it,
so there is no need for atomicity with respect to releasing the mutex.
That's a pretty minor difference, and it doesn't have anything to do
with whether the mutexes are recursive or not.
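
The condition-variable form of that scenario can be sketched with
pthreads as follows. The struct shared type and both function names are
hypothetical; thread B is started from inside the helper only to keep
the example self-contained.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct shared {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    bool            ready;  /* the "desired state" thread A waits for */
    int             value;
};

/* Thread B: modify the structure under the mutex, then wake waiters. */
static void *producer(void *arg)
{
    struct shared *s = arg;

    pthread_mutex_lock(&s->lock);
    s->value = 42;
    s->ready = true;
    pthread_cond_signal(&s->changed);
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Thread A: wait until B has put the structure into the desired state.
 * Returns the value seen, or -1 if thread B could not be started. */
static int wait_for_value(void)
{
    struct shared s;
    pthread_t b;
    int seen;

    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.changed, NULL);
    s.ready = false;
    s.value = 0;

    pthread_mutex_lock(&s.lock);
    if (pthread_create(&b, NULL, producer, &s) != 0) {
        pthread_mutex_unlock(&s.lock);
        return -1;
    }
    /* pthread_cond_wait() releases the mutex and sleeps as one atomic
     * step, then re-acquires it before returning.  The loop re-checks
     * the predicate in case of a spurious wakeup. */
    while (!s.ready)
        pthread_cond_wait(&s.changed, &s.lock);
    seen = s.value;
    pthread_mutex_unlock(&s.lock);

    pthread_join(b, NULL);
    pthread_cond_destroy(&s.changed);
    pthread_mutex_destroy(&s.lock);
    return seen;
}
```

A semaphore-based version would unlock the mutex and then wait on the
semaphore as two separate steps, since a post that arrives in between
is remembered in the count.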

If the mutex is recursively held, there is a problem in that some
other code grabbed the mutex and expected it to protect the data
structure from being changed underfoot.  Using a semaphore to do the
waiting doesn't solve that problem, or even address it.

John
-- 
  John Polstra                                               jdp@polstra.com
  John D. Polstra & Co., Inc.                        Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa




From owner-freebsd-arch  Mon Sep 25 19:50:21 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from smtp04.primenet.com (smtp04.primenet.com [206.165.6.134])
	by hub.freebsd.org (Postfix) with ESMTP id 48E2C37B424
	for <arch@freebsd.org>; Mon, 25 Sep 2000 19:50:12 -0700 (PDT)
Received: (from daemon@localhost)
	by smtp04.primenet.com (8.9.3/8.9.3) id TAA15991;
	Mon, 25 Sep 2000 19:47:40 -0700 (MST)
Received: from usr05.primenet.com(206.165.6.205)
 via SMTP by smtp04.primenet.com, id smtpdAAA3daybF; Mon Sep 25 19:47:29 2000
Received: (from tlambert@localhost)
	by usr05.primenet.com (8.8.5/8.8.5) id TAA08391;
	Mon, 25 Sep 2000 19:49:58 -0700 (MST)
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <200009260249.TAA08391@usr05.primenet.com>
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files
To: arch@freebsd.org
Date: Tue, 26 Sep 2000 02:49:58 +0000 (GMT)
Cc: tlambert@primenet.com
In-Reply-To: <200009260209.TAA03815@vashon.polstra.com> from "John Polstra" at Sep 25, 2000 07:09:28 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> > Fine.  Then we're agreed: non-recursive mutexes are the base unit,
> > and recursion will be implemented on a case by case basis using
> > an additional structure, which contains a non-recursive mutex, a
> > recursion counter, and an owner field.
> 
> That's simply a less efficient implementation of a recursive mutex.
> Why not use the real thing?

Now we're back to my original question: where in the code is
there a perceived need for mutex recursion, or is this just
a case of the mutex code being bloated for no good reason?


> > Glad that's settled, until the first time a thread migrates between
> > processors, and we decide we need a semaphore instead of a mutex as
> > a primitive in order to handle sleeps and wakeups that occur with
> > a mutex with a recursion count greater than 0, for some ungodly
> > reason.
> 
> Now we're back practically to my original question.  Explain how a
> semaphore is going to solve anything here.  I don't think it will
> help one bit.  In virtually all cases which require sleeping and
> being woken up (whether via a condition variable or a semaphore), the
> basic scenario is the same.  Thread A is examining and/or modifying a
> shared data structure.  Now he wants to wait until thread B modifies
> the data structure and puts it into some desired state.  While A
> was examining/modifying the data structure, he necessarily held a
> mutex on it in order to get a consistent view.  Before he waits, he
> must release that mutex -- otherwise B won't be able to make the
> desired modifications.  This is true whether the waiting is done with
> a condition variable or with a semaphore.  It really doesn't make much
> difference which one you use.  The only difference is that when using
> a condition variable the "release mutex and wait" sequence must be
> atomic, because a condition variable doesn't "remember" a wakeup that
> happened when nobody was waiting yet.  A semaphore does remember it,
> so there is no need for atomicity with respect to releasing the mutex.
> That's a pretty minor difference, and it doesn't have anything to do
> with whether the mutexes are recursive or not.

No, it has to do with how long they are held.  If they are never
permitted to be held across recursive function calls -- or better,
across _ANY_ function calls -- then you can spin on the mutex,
instead of going to sleep.  So a mutex operation becomes:

	1)	Acquire mutex
	2)	Frob data protected by mutex
	3)	Release mutex

If someone else needs the same data, they do the same thing.
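
As a rough illustration of the three steps, here is a user-space
sketch using C11 atomics.  It is not the kernel mutex code; the
struct and function names are invented for the example, and a real
spin mutex would also have to deal with interrupts and preemption.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical spin mutex: one flag, no owner, no recursion count. */
struct spin_mtx {
    atomic_flag locked;
};

/* Data protected by the mutex that sits next to it. */
struct counter {
    struct spin_mtx mtx;
    long value;
};

static void spin_acquire(struct spin_mtx *m)
{
    /* Spin until the previous value of the flag was "clear". */
    while (atomic_flag_test_and_set_explicit(&m->locked,
        memory_order_acquire))
        ;
}

static void spin_release(struct spin_mtx *m)
{
    atomic_flag_clear_explicit(&m->locked, memory_order_release);
}

/* The whole critical section: no calls that could block while held. */
static long counter_add(struct counter *c, long delta)
{
    long result;

    spin_acquire(&c->mtx);      /* 1) acquire mutex */
    c->value += delta;          /* 2) frob data protected by mutex */
    result = c->value;
    spin_release(&c->mtx);      /* 3) release mutex */
    return result;
}
```

The critical section is only a handful of instructions long, which is
what makes spinning a reasonable choice here.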

If you want to wait until a condition is true, then use a
condition variable, a semaphore, or something else you can wait
on in order to be signalled.

The idea that you should ever go to sleep waiting for a mutex
is antithetical to the very idea of a mutex.

There are, likewise, few real situations in which you need to
be able to hold two mutexes; these are degenerate cases, which
are badly coded.

Consider that I may have a vnode freelist protected by a mutex,
and a vnode protected by a mutex.  The perceived need to hold both
of these simultaneously to put something on the freelist is an
artifact of wrong-thinking: the pointers used to place a vnode on
a freelist are the property of the freelist mutex, not the vnode
mutex.

Even if you can make a case for this not being true (e.g. moving
a vnode from one list to another, using the same pointers in the
vnode to track state on both lists, which is really just an
acquire/remove/release/acquire/insert/release operation, where
you have a window between the removal and the reinsertion), it
can be handled by strictly controlling the order of operation on
mutex acquisition, and inverting the release order, and backing
off in case of conflict.
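
A sketch of that discipline, assuming POSIX threads in user space
rather than kernel mutexes: acquire in a fixed order (here, by
address), try the second lock, back off if it is busy, and release in
the reverse order.  The list and node types and all function names
are hypothetical.  With a strict order the back-off is strictly
redundant; it is kept because both were mentioned.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>
#include <stdint.h>

struct node {
    struct node *next;
    int id;
};

struct list {
    pthread_mutex_t mtx;    /* owns head and the nodes' next pointers */
    struct node *head;
};

/* Fixed global order: the list at the lower address is locked first. */
static void order_pair(struct list *a, struct list *b,
    struct list **first, struct list **second)
{
    if ((uintptr_t)a < (uintptr_t)b) {
        *first = a;
        *second = b;
    } else {
        *first = b;
        *second = a;
    }
}

static void lock_pair(struct list *a, struct list *b)
{
    struct list *first, *second;

    order_pair(a, b, &first, &second);
    for (;;) {
        pthread_mutex_lock(&first->mtx);
        if (pthread_mutex_trylock(&second->mtx) == 0)
            return;
        pthread_mutex_unlock(&first->mtx);  /* conflict: back off */
        sched_yield();
    }
}

static void unlock_pair(struct list *a, struct list *b)
{
    struct list *first, *second;

    order_pair(a, b, &first, &second);
    pthread_mutex_unlock(&second->mtx);     /* inverse of acquisition */
    pthread_mutex_unlock(&first->mtx);
}

/* Move the head node of one list to another; returns 1 if one moved. */
static int list_move_head(struct list *from, struct list *to)
{
    struct node *n;

    assert(from != to);
    lock_pair(from, to);
    n = from->head;
    if (n != NULL) {
        from->head = n->next;
        n->next = to->head;
        to->head = n;
    }
    unlock_pair(from, to);
    return n != NULL;
}
```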


> If the mutex is recursively held, there is a problem in that some
> other code grabbed the mutex and expected it to protect the data
> structure from being changed underfoot.

Worst case, set an "IN_USE" flag on the data in a flags field to
bar reentry on a given data item.  Best case, fix the broken code.
The vnode locking code does this today (I'd argue that it's broken
code).


> Using a semaphore to do the
> waiting doesn't solve that problem, or even address it.

It does.  Semaphores can be held across a sleep (a wait).


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.




From owner-freebsd-arch  Tue Sep 26  4:30:32 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from pcnet1.pcnet.com (pcnet1.pcnet.com [204.213.232.3])
	by hub.freebsd.org (Postfix) with ESMTP id AE08B37B422
	for <arch@FreeBSD.ORG>; Tue, 26 Sep 2000 04:30:25 -0700 (PDT)
Received: (from eischen@localhost)
	by pcnet1.pcnet.com (8.8.7/PCNet) id HAA00572;
	Tue, 26 Sep 2000 07:30:05 -0400 (EDT)
Date: Tue, 26 Sep 2000 07:29:55 -0400 (EDT)
From: Daniel Eischen <eischen@vigrid.com>
To: Matt Dillon <dillon@earth.backplane.com>
Cc: John Polstra <jdp@polstra.com>, arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores
In-Reply-To: <200009252123.e8PLN5F84806@earth.backplane.com>
Message-ID: <Pine.SUN.3.91.1000926065812.26612A-100000@pcnet1.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Mon, 25 Sep 2000, Matt Dillon wrote:
> :
> :Mutexes should protect data.  If you want to allow recursive ownership of
> :data, then keep your own owner and ref count field in the protected data
> :and use the mutex properly (release it after setting the owner or 
> :incrementing the ref count).  You don't need to hold the mutex, and
> :now you can use the same mutex for msleep/cv_wait.
> :
> :-- 
> :Dan Eischen
> 
>     Mutexes protect data *CONSISTENCY*, not data.  There is a big difference.
>     Probably 95% of the kernel assumes data consistency throughout any given
>     routine.  If that routine must call other routines (and most do), then
>     you have a major issue to contend with in regards to how to maintain
>     consistency across the call.
> 
>     There are several ways to deal with it:
> 
> 	* The subroutine calls are not allowed to block - lots of examples of
> 	  this in the VM and other subsystems.
> 
> 	* You use a heavy-weight lock instead of a mutex - an example
> 	  of this would be the VFS subsystem (vnode locks).
> 
> 	* You engineer the code to allow data to change out from under
> 	  it at certain points (such as when something blocks) - probably
> 	  the best example is vm_fault in the VM subsystem.
> 
>     Unfortunately, all but the first can lead to serious bugs.  Consider
>     how many bugs have been fixed in the VFS and VM subsystems just in the
>     last year that have been related to data consistency issues and you'll
>     understand.
> 
>     The first issue - not allowing a subroutine call to block, when such a
>     case exists, is the perfect place to put a recursive mutex.  If you don't
>     use a recursive mutex at that point then you wind up having to 
>     reengineer and rewrite big pieces of the code, or you wind up writing
>     lots of little tag routines to do end-runs around the mutexes or to
>     pass a flag that indicates that the mutex is already held and should
>     not be obtained again, and so forth.  
> 
>     Remember, I'm not talking about subsystem A calling subsystem B here,
>     I'm talking about subsystem A calling itself.  That is, a situation
>     where you are not obtaining several different mutexes but are instead
>     obtaining the same mutex several times.

If you absolutely need recursive mutexes, then roll your own and keep
the base mutex simple.  This is trivial to do and makes the base mutex
more efficient without the need to check for recursive ownership.
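
For what it's worth, a roll-your-own version might look roughly like
the sketch below.  It assumes POSIX threads in user space, not the
kernel mtx_enter()/mtx_exit() routines, and every name in it is made
up.  The base mutex is non-recursive and is held only long enough to
update the owner and ref count; waiting for ownership is done on a
condition variable with that same mutex.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical recursive lock layered on a plain mutex. */
struct rlock {
    pthread_mutex_t mtx;    /* non-recursive; protects the fields below */
    pthread_cond_t cv;      /* signalled when the lock becomes free */
    pthread_t owner;        /* meaningful only while count > 0 */
    unsigned count;         /* recursion depth; 0 means unowned */
};

static void rlock_init(struct rlock *l)
{
    pthread_mutex_init(&l->mtx, NULL);
    pthread_cond_init(&l->cv, NULL);
    l->owner = pthread_self();  /* placeholder, ignored while count == 0 */
    l->count = 0;
}

static void rlock_acquire(struct rlock *l)
{
    pthread_t self = pthread_self();

    pthread_mutex_lock(&l->mtx);
    /* Wait only if somebody else owns it; the owner may re-enter. */
    while (l->count != 0 && !pthread_equal(l->owner, self))
        pthread_cond_wait(&l->cv, &l->mtx);
    l->owner = self;
    l->count++;
    pthread_mutex_unlock(&l->mtx);  /* base mutex is not held afterwards */
}

static void rlock_release(struct rlock *l)
{
    pthread_mutex_lock(&l->mtx);
    assert(l->count > 0 && pthread_equal(l->owner, pthread_self()));
    if (--l->count == 0)
        pthread_cond_signal(&l->cv);    /* let one waiter take ownership */
    pthread_mutex_unlock(&l->mtx);
}
```

Since the ref count lives in ordinary data, code handling an abnormal
exit can inspect or reset it without needing a special mutex call.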

Mutexes should be held for very short amounts of time, and it
should be apparent in the encompassing code where the mutex is
taken and where it is released.  In your example, what do you
do in the case of abnormal exits from recursively called code?
It makes it far easier to handle this situation if you roll
your own mutex and keep track of the ref count and owner yourself.  
If you don't, you'll end up adding mtx_exit_and_clear_refcount().

My main concern is not to eliminate recursive mutexes, though
I still think they should go.  I would like to see all barriers
to eliminating the flags/options to mtx_enter() and mtx_exit()
removed.  The current form of the mutex routines is not an API/ABI
we should be using.

-- 
Dan Eischen




From owner-freebsd-arch  Tue Sep 26 18:10:17 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
	by hub.freebsd.org (Postfix) with ESMTP
	id EDC6537B43C; Tue, 26 Sep 2000 18:10:04 -0700 (PDT)
Received: from fledge.watson.org (robert@fledge.pr.watson.org [192.0.2.3])
	by fledge.watson.org (8.9.3/8.9.3) with SMTP id VAA81368;
	Tue, 26 Sep 2000 21:09:58 -0400 (EDT)
	(envelope-from rwatson@FreeBSD.org)
Date: Tue, 26 Sep 2000 21:09:58 -0400 (EDT)
From: Robert Watson <rwatson@FreeBSD.org>
X-Sender: robert@fledge.watson.org
To: freebsd-fs@FreeBSD.org
Cc: freebsd-arch@FreeBSD.org, trustedbsd-discuss@TrustedBSD.org
Subject: VOP_ACCESS() and new VADMIN/VATTRIB?
Message-ID: <Pine.NEB.3.96L.1000926203644.79897G-100000@fledge.watson.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


(sorry about flagrant cross-posting -- wanted to make sure that those with
interest would have the opportunity to comment)

In general, access control for operations within a file system is
determined via a recursive VOP_ACCESS() call on the vnode, vis.

VOP_OPEN(vp, ...) -> ufs_open(vp, ...) -> VOP_ACCESS(vp, ...) ->
    ufs_access(vp, ...)

Flags are passed to VOP_ACCESS() indicating the specific requests being
made on the object, allowing VOP_ACCESS() to implement a variety of
discretionary and mandatory policies.  VOP_ACCESS(9) documents these flags
as VREAD, VWRITE, and VEXEC, reflecting respectively read, write, and
execute rights.  In recent changes to improve modularity and consistency,
Poul-Henning moved most of the mode/ownership-related components of
ufs_access() (and from other file systems) into vaccess().  File-system
specific components, such as the readonly status of the file system, and
UFS file flags, remain in ufs_access().

In the UFS code, VOP_ACCESS() is used fairly routinely to guard access to
the data associated with a file or directory.  However, there is an
additional class of requests relating to file operations wherein checks of
inode attributes and characteristics are performed directly, rather than
falling back on the central VOP_ACCESS() implementation for the file
system.  In general these requests relate to administrative actions for
the file: the ability to set protection rights for the file (ufs_chmod(),
ufs_chown(), and in the ACL implementation, also ufs_setacl()).  As a
result, these access checks are scattered through the file system
implementation, and do not lend themselves to further generalization.

I ran into this problem while implementing mandatory access control for
FreeBSD: mandatory policies override rights that may be granted by
discretionary mechanisms (such as permissions and ACLs), allowing
effective partitioning and segregation of the system based on other
properties, such as sensitivity and integrity labels.  One example of
this is a Biba integrity policy, in which the permissions of a file might
allow write access to all users, but the MAC policy forbids this access as
it might violate system integrity (for example, incorrectly set
permissions on /kernel).  Without generalized and centralized access
control for all access decisions, it is difficult to cleanly insert more
flexible access control policies.

I'd like to propose that a new VADMIN flag be added determining
whether or not the passed credentials are permitted to administer the
file.  Here is a brief itemization of locations in the code where i_uid
checks would be replaced with VOP_ACCESS(vp, ... VADMIN ...) calls, with
some possible omissions:

File		Use
ufs_lookup.c	Allow owner of a sticky directory to delete any file in it
ufs_lookup.c	Allow owner of a file to delete it from a sticky directory
ufs_vnops.c	Allow owner of a file to set non-system file flags
ufs_vnops.c	Allow owner to modify times on file
ufs_vnops.c	Allow owner to modify permissions on file
ufs_vnops.c	Allow owner to modify group of file
ufs_vnops.c	Allow owner of a file or its parent directory to overwrite
		that file if its parent directory is sticky
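
To make the shape of the change concrete, here is a small user-space
model, not kernel code.  The flag values, the struct layouts, the
mac_deny_admin field and the model_*() names are all assumptions made
up for illustration; only the names VREAD, VWRITE, VEXEC, VADMIN and
i_uid come from the discussion above.  The point is that the chmod
path no longer compares i_uid itself, so the central access routine
can also apply a mandatory policy.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

/* Illustrative values only; these are not the kernel's definitions. */
#define VEXEC   0x01
#define VWRITE  0x02
#define VREAD   0x04
#define VADMIN  0x08    /* the proposed "may administer the file" right */

struct cred_model {
    uid_t uid;
};

struct inode_model {
    uid_t i_uid;
    mode_t i_mode;
    int mac_deny_admin;     /* stand-in for a mandatory policy decision */
};

/* Central access decision, in the role of the file system's access VOP. */
static int model_access(const struct inode_model *ip, int acc,
    const struct cred_model *cr)
{
    mode_t need, granted;

    if (acc & VADMIN) {
        if (ip->mac_deny_admin)
            return EPERM;   /* mandatory policy overrides ownership */
        if (cr->uid != 0 && cr->uid != ip->i_uid)
            return EPERM;
    }
    if (cr->uid == 0)
        return 0;
    /* Owner and other bits only; group checks omitted for brevity. */
    need = (mode_t)(acc & (VREAD | VWRITE | VEXEC));
    granted = (cr->uid == ip->i_uid) ?
        ((ip->i_mode >> 6) & 7) : (ip->i_mode & 7);
    return ((need & granted) == need) ? 0 : EACCES;
}

/* chmod path: asks for VADMIN instead of comparing i_uid directly. */
static int model_chmod(struct inode_model *ip, mode_t mode,
    const struct cred_model *cr)
{
    int error = model_access(ip, VADMIN, cr);

    if (error != 0)
        return error;
    ip->i_mode = (ip->i_mode & ~(mode_t)07777) | (mode & (mode_t)07777);
    return 0;
}
```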

There are some other references to i_uid in ufs_vnops.c relating to the
QUOTA and SUIDDIR code.  It is my belief, although I'd be glad to take
comments, that the QUOTA code should remain as is, as it's not for
access control but rather accounting.  Similarly, the SUIDDIR code should
remain as is, as it has to do with whether or not the ownership on a newly
created file should be set to reflect the parent directory's ownership
instead of the calling credential.

The effect of this change would be to move any rights granted via
ownership of a file but not in the VREAD, VWRITE, and VEXEC categories to
the new VADMIN category.  As a result, changes to the file system's
VOP_ACCESS() code could then grant or deny these requests based on other
factors in the credential, including mandatory policies.  I selected the
name VADMIN based on a similar right in the Andrew File System (AFS),
"admin", which permits users or groups with admin rights for a directory
to manipulate its access control list.  You could imagine adding a new
right such as this to the ACL implementation, although I have no plans to
do so at this point.

 Robert N M Watson 

robert@fledge.watson.org              http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services




From owner-freebsd-arch  Tue Sep 26 20:37:33 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id E96F237B424
	for <arch@FreeBSD.ORG>; Tue, 26 Sep 2000 20:37:23 -0700 (PDT)
Received: from sydney.worldwide.lemis.com (asbestos.linuxcare.com.au [203.17.0.30])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 7EA9D6E2BE9
	for <arch@FreeBSD.ORG>; Tue, 26 Sep 2000 20:37:12 -0700 (PDT)
Received: (from grog@localhost)
	by sydney.worldwide.lemis.com (8.9.3/8.9.3) id OAA08246;
	Wed, 27 Sep 2000 14:33:18 +1100 (EST)
	(envelope-from grog)
Date: Wed, 27 Sep 2000 14:33:18 +1100
From: Greg Lehey <grog@lemis.com>
To: Alfred Perlstein <bright@wintelcom.net>
Cc: Matt Dillon <dillon@earth.backplane.com>,
	Daniel Eischen <eischen@vigrid.com>, John Polstra <jdp@polstra.com>,
	arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores
Message-ID: <20000927143318.H7583@sydney.worldwide.lemis.com>
References: <Pine.SUN.3.91.1000925055843.15658A-100000@pcnet1.pcnet.com> <200009252123.e8PLN5F84806@earth.backplane.com> <20000925143853.J9141@fw.wintelcom.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2i
In-Reply-To: <20000925143853.J9141@fw.wintelcom.net>; from bright@wintelcom.net on Mon, Sep 25, 2000 at 02:38:54PM -0700
Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-418-838-708
WWW-Home-Page: http://www.lemis.com/~grog
X-PGP-Fingerprint: 6B 7B C3 8C 61 CD 54 AF  13 24 52 F8 6D A4 95 EF
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Monday, 25 September 2000 at 14:38:54 -0700, Alfred Perlstein wrote:
> * Matt Dillon <dillon@earth.backplane.com> [000925 14:23] wrote:
>>>
>>> Mutexes should protect data.  If you want to allow recursive ownership of
>>> data, then keep your own owner and ref count field in the protected data
>>> and use the mutex properly (release it after setting the owner or
>>> incrementing the ref count).  You don't need to hold the mutex, and
>>> now you can use the same mutex for msleep/cv_wait.
>>>
>>> --
>>> Dan Eischen
>>
>>     Mutexes protect data *CONSISTENCY*, not data.  There is a big difference.
>>     Probably 95% of the kernel assumes data consistency throughout any given
>>     routine.  If that routine must call other routines (and most do), then
>>     you have a major issue to contend with in regards to how to maintain
>>     consistency across the call.
>>
>>     There are several ways to deal with it:
>>
>> 	* The subroutine calls are not allowed to block - lots of examples of
>> 	  this in the VM and other subsystems.
>>
>> 	* You use a heavy-weight lock instead of a mutex - an example
>> 	  of this would be the VFS subsystem (vnode locks).
>>
>> 	* You engineer the code to allow data to change out from under
>> 	  it at certain points (such as when something blocks) - probably
>> 	  the best example is vm_fault in the VM subsystem.
>>
>>     Unfortunately, all but the first can lead to serious bugs.  Consider
>>     how many bugs have been fixed in the VFS and VM subsystems just in the
>>     last year that have been related to data consistency issues and you'll
>>     understand.
>>
>>     The first issue - not allowing a subroutine call to block, when such a
>>     case exists, is the perfect place to put a recursive mutex.  If you don't
>>     use a recursive mutex at that point then you wind up having to
>>     reengineer and rewrite big pieces of the code, or you wind up writing
>>     lots of little tag routines to do end-runs around the mutexes or to
>>     pass a flag that indicates that the mutex is already held and should
>>     not be obtained again, and so forth.
>>
>>     Remember, I'm not talking about subsystem A calling subsystem B here,
>>     I'm talking about subsystem A calling itself.  That is, a situation
>>     where you are not obtaining several different mutexes but are instead
>>     obtaining the same mutex several times.
>>
>>     Frankly, fewer bugs will be introduced into the code by avoiding the
>>     reengineering and using recursive mutexes at appropriate points.
>
> What's pissing me off here (not to pick on you Matt) is that there's
> honestly a lot of code to be worked on where the locking issues are
> pretty simple (especially when you look at how BSD/OS implemented
> it).

Hmm.  I was firmly in the "recursion is sloppiness" camp, but after
reading this thread I'm no longer so convinced.  I need to think about
it.  But showing examples where it makes sense doesn't mean it makes
sense everywhere, and I would at least say "unnecessary recursion is
sloppiness".  I think you're looking at the unnecessary cases.

> We should be coding and discussing existing problems with making the
> kernel MPsafe instead of what we *might* come across along the road.

I certainly think that at the moment we should be thinking about
structure rather than details.

> Whatever we bump into we can always beat to a pulp using lockmgr. :)

Well, can anybody put up good arguments for keeping lockmgr in the
long term?  I'm not saying there aren't any, but I haven't analysed it
enough yet.

> And honestly, I don't like the idea of recursive mutexes, I'd rather
> have a super function that locks a pgrp like
> pg_signal_locked/_unlocked which expects the locks to be held rather
> than a recursive lock.

I think that eliminating recursion requires you to understand the
system much better, which brings both advantages and disadvantages.
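
To illustrate what the _locked/_unlocked split looks like in practice,
here is a user-space sketch assuming POSIX threads.  Only the two
function names come from the quoted text; the struct, its fields, the
exact semantics and pg_signal_twice() are guesses made up for the
example.  The _locked variant expects the caller to hold the lock; the
_unlocked variant takes it itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a process group. */
struct pgrp_model {
    pthread_mutex_t mtx;    /* non-recursive */
    bool held;              /* debugging aid: true while mtx is held */
    int nsignals;
    int last_sig;
};

static void pgrp_model_init(struct pgrp_model *pg)
{
    pthread_mutex_init(&pg->mtx, NULL);
    pg->held = false;
    pg->nsignals = 0;
    pg->last_sig = 0;
}

/* Caller must already hold pg->mtx. */
static void pg_signal_locked(struct pgrp_model *pg, int sig)
{
    assert(pg->held);
    pg->last_sig = sig;
    pg->nsignals++;
}

/* Caller must not hold pg->mtx; it is taken and dropped here. */
static void pg_signal_unlocked(struct pgrp_model *pg, int sig)
{
    pthread_mutex_lock(&pg->mtx);
    pg->held = true;
    pg_signal_locked(pg, sig);
    pg->held = false;
    pthread_mutex_unlock(&pg->mtx);
}

/* A caller that needs both signals posted in one critical section. */
static void pg_signal_twice(struct pgrp_model *pg, int a, int b)
{
    pthread_mutex_lock(&pg->mtx);
    pg->held = true;
    pg_signal_locked(pg, a);    /* no re-acquisition, so no recursion */
    pg_signal_locked(pg, b);
    pg->held = false;
    pthread_mutex_unlock(&pg->mtx);
}
```

The cost is exactly the one noted above: every caller has to know
which variant applies at each call site.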

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers




From owner-freebsd-arch  Tue Sep 26 23: 4:51 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP id 714F337B423
	for <arch@FreeBSD.ORG>; Tue, 26 Sep 2000 23:04:47 -0700 (PDT)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id e8R64bU05419;
	Tue, 26 Sep 2000 23:04:37 -0700 (PDT)
Date: Tue, 26 Sep 2000 23:04:37 -0700
From: Alfred Perlstein <bright@wintelcom.net>
To: Greg Lehey <grog@lemis.com>
Cc: Matt Dillon <dillon@earth.backplane.com>,
	Daniel Eischen <eischen@vigrid.com>, John Polstra <jdp@polstra.com>,
	arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores
Message-ID: <20000926230436.J9141@fw.wintelcom.net>
References: <Pine.SUN.3.91.1000925055843.15658A-100000@pcnet1.pcnet.com> <200009252123.e8PLN5F84806@earth.backplane.com> <20000925143853.J9141@fw.wintelcom.net> <20000927143318.H7583@sydney.worldwide.lemis.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.4i
In-Reply-To: <20000927143318.H7583@sydney.worldwide.lemis.com>; from grog@lemis.com on Wed, Sep 27, 2000 at 02:33:18PM +1100
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

* Greg Lehey <grog@lemis.com> [000926 20:34] wrote:
> On Monday, 25 September 2000 at 14:38:54 -0700, Alfred Perlstein wrote:
> 
> > We should be coding and discussing existing problems with making the
> > kernel MPsafe instead of what we *might* come across along the road.
> 
> I certainly think that at the moment we should be thinking about
> structure rather than details.

I think we've been doing that for two years already and it hasn't
bought us squat.

> > Whatever we bump into we can always beat to a pulp using lockmgr. :)
> 
> Well, can anybody put up good arguments for keeping lockmgr in the
> long term?  I'm not saying there aren't any, but I haven't analysed it
> enough yet.

lockmgr offers many styles of locking over a common lock interface;
it allows one to upgrade and downgrade a lock's read/write status
without losing it, and I'm pretty sure it also allows for recursion,
although that may be broken ATM.

> > And honestly, I don't like the idea of recursive mutexes, I'd rather
> > have a super function that locks a pgrp like
> > pg_signal_locked/_unlocked which expects the locks to be held rather
> > than a recursive lock.
> 
> I think that eliminating recursion requires you to understand the
> system much better, which brings both advantages and disadvantages.

Greg, after looking over this stuff for what seems like centuries
I can honestly say that with the exception of VFS and VM the system
is pretty straightforward (well proc/pgrp isn't much fun, but it's
not deadly).

If anyone has any doubts about what type of locking they'll need,
then they need to look at the BSD/OS code, because they've
already done it!

What we need to have are discussions about the way people plan
to push the locks in deeper.  Whether it involves recursive mutexes,
condition variables or green moon cheese, I really don't care, so
long as it's backed up by a real application for the primitives
that they want in our codebase.

Right now I can't even do getpid() properly because we don't have
read/write-barriers.

So far I like what I see in BSD/OS.  Are we going to continue taking
advantage of the reference implementation we've been given or wander
off into nothingness?

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."




From owner-freebsd-arch  Tue Sep 26 23:13:14 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from InterJet.elischer.org (c421509-a.pinol1.sfba.home.com [24.7.86.9])
	by hub.freebsd.org (Postfix) with ESMTP
	id 703BF37B424; Tue, 26 Sep 2000 23:13:10 -0700 (PDT)
Received: from InterJet.elischer.org (InterJet.elischer.org [192.168.1.1])
	by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id XAA13214;
	Tue, 26 Sep 2000 23:12:38 -0700 (PDT)
Date: Tue, 26 Sep 2000 23:12:37 -0700 (PDT)
From: Julian Elischer <julian@elischer.org>
To: Robert Watson <rwatson@FreeBSD.org>
Cc: freebsd-fs@FreeBSD.org, freebsd-arch@FreeBSD.org,
	trustedbsd-discuss@TrustedBSD.org
Subject: Re: VOP_ACCESS() and new VADMIN/VATTRIB?
In-Reply-To: <Pine.NEB.3.96L.1000926203644.79897G-100000@fledge.watson.org>
Message-ID: <Pine.BSF.4.10.10009262311160.13148-100000@InterJet.elischer.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

I agree with all you have said here.


On Tue, 26 Sep 2000, Robert Watson wrote:

> 
> 
> In general, access control for operations within a file system is
> determined via a recursive VOP_ACCESS() call on the vnode, vis.
> 
> VOP_OPEN(vp, ...) -> ufs_open(vp, ...) -> VOP_ACCESS(vp, ...) ->
>     ufs_access(vp, ...)
[...]




From owner-freebsd-arch  Wed Sep 27  0:16:36 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from smtp03.primenet.com (smtp03.primenet.com [206.165.6.133])
	by hub.freebsd.org (Postfix) with ESMTP id AC30437B422
	for <arch@FreeBSD.ORG>; Wed, 27 Sep 2000 00:16:34 -0700 (PDT)
Received: (from daemon@localhost)
	by smtp03.primenet.com (8.9.3/8.9.3) id AAA10796;
	Wed, 27 Sep 2000 00:15:07 -0700 (MST)
Received: from usr05.primenet.com(206.165.6.205)
 via SMTP by smtp03.primenet.com, id smtpdAAAX7aq6u; Wed Sep 27 00:14:58 2000
Received: (from tlambert@localhost)
	by usr05.primenet.com (8.8.5/8.8.5) id AAA20144;
	Wed, 27 Sep 2000 00:16:15 -0700 (MST)
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <200009270716.AAA20144@usr05.primenet.com>
Subject: Re: Mutexes and semaphores
To: bright@wintelcom.net (Alfred Perlstein)
Date: Wed, 27 Sep 2000 07:16:15 +0000 (GMT)
Cc: grog@lemis.com (Greg Lehey),
	dillon@earth.backplane.com (Matt Dillon),
	eischen@vigrid.com (Daniel Eischen), jdp@polstra.com (John Polstra),
	arch@FreeBSD.ORG
In-Reply-To: <20000926230436.J9141@fw.wintelcom.net> from "Alfred Perlstein" at Sep 26, 2000 11:04:37 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On behalf of Greg Lehey, whose server hates primenet, Alfred P is
attributed to have written:
> * Greg Lehey <grog@lemis.com> [000926 20:34] wrote:
> > On Monday, 25 September 2000 at 14:38:54 -0700, Alfred Perlstein wrote:
> > 
> > > We should be coding and discussing existing problems with making the
> > > kernel MPsafe instead of what we *might* come across along the road.
> > 
> > I certainly think that at the moment we should be thinking about
> > structure rather than details.
> 
> I think we've been doing that for two years already and it hasn't
> bought us squat.

Don't fabricate, Alfred.  The SMP code first existed as patches
by Jack Vogel, then of Sun Microsystems, against the October 27
1995 source tree.  The current SMP code is derived from patches
(very minor ones) I did to bring Jack's work up to date in 1996,
and a lot of work by a lot of other people, starting with Peter.

So don't say it's been two years when it's really been five.

Thanks,

					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.




From owner-freebsd-arch  Wed Sep 27  0:21:36 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from mass.osd.bsdi.com (adsl-63-206-90-224.dsl.snfc21.pacbell.net [63.206.90.224])
	by hub.freebsd.org (Postfix) with ESMTP id 9780437B424
	for <arch@FreeBSD.ORG>; Wed, 27 Sep 2000 00:21:31 -0700 (PDT)
Received: from mass.osd.bsdi.com (localhost [127.0.0.1])
	by mass.osd.bsdi.com (8.11.0/8.9.3) with ESMTP id e8R7MkA03362;
	Wed, 27 Sep 2000 00:22:47 -0700 (PDT)
	(envelope-from msmith@mass.osd.bsdi.com)
Message-Id: <200009270722.e8R7MkA03362@mass.osd.bsdi.com>
X-Mailer: exmh version 2.1.1 10/15/1999
To: Terry Lambert <tlambert@primenet.com>
Cc: arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores 
In-reply-to: Your message of "Wed, 27 Sep 2000 07:16:15 -0000."
             <200009270716.AAA20144@usr05.primenet.com> 
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Wed, 27 Sep 2000 00:22:46 -0700
From: Mike Smith <msmith@freebsd.org>
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> > > > We should be coding and discussing existing problems with making the
> > > > kernel MPsafe instead of what we *might* come across along the road.
> > > 
> > > I certainly think that at the moment we should be thinking about
> > > structure rather than details.
> > 
> > I think we've been doing that for two years already and it hasn't
> > bought us squat.
> 
> Don't fabricate, Alfred.  The SMP code first existed as patches
> by Jack Vogel, then of Sun Microsystems, against the October 27
> 1995 source tree.  The current SMP code is derived from patches
> (very minor ones) I did to bring Jack's work up to date in 1996,
> and a lot of work by a lot of other people, starting with Peter.
> 
> So don't say it's been two years when it's really been five.

Actually, it's been about a year, plus about four and a half of hot air.
I see and hear a lot of talk.  Who's doing the real work?  Do we see 
Peter, John or Tor, for example, in this windage competition? 8)

Come on folks.  Stick to the topic.


-- 
... every activity meets with opposition, everyone who acts has his
rivals and unfortunately opponents also.  But not because people want
to be opponents, rather because the tasks and relationships force
people to take different points of view.  [Dr. Fritz Todt]
           V I C T O R Y   N O T   V E N G E A N C E




From owner-freebsd-arch  Wed Sep 27  0:23:59 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from smtp04.primenet.com (smtp04.primenet.com [206.165.6.134])
	by hub.freebsd.org (Postfix) with ESMTP
	id 6CE9F37B424; Wed, 27 Sep 2000 00:23:51 -0700 (PDT)
Received: (from daemon@localhost)
	by smtp04.primenet.com (8.9.3/8.9.3) id AAA02627;
	Wed, 27 Sep 2000 00:21:16 -0700 (MST)
Received: from usr05.primenet.com(206.165.6.205)
 via SMTP by smtp04.primenet.com, id smtpdAAA2Xa4bf; Wed Sep 27 00:21:10 2000
Received: (from tlambert@localhost)
	by usr05.primenet.com (8.8.5/8.8.5) id AAA20257;
	Wed, 27 Sep 2000 00:23:38 -0700 (MST)
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <200009270723.AAA20257@usr05.primenet.com>
Subject: Re: VOP_ACCESS() and new VADMIN/VATTRIB?
To: julian@elischer.org (Julian Elischer)
Date: Wed, 27 Sep 2000 07:23:38 +0000 (GMT)
Cc: rwatson@FreeBSD.ORG (Robert Watson), freebsd-fs@FreeBSD.ORG,
	freebsd-arch@FreeBSD.ORG, trustedbsd-discuss@TrustedBSD.org
In-Reply-To: <Pine.BSF.4.10.10009262311160.13148-100000@InterJet.elischer.org> from "Julian Elischer" at Sep 26, 2000 11:12:37 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Julian Elischer wrote:
> I agree with all you have said here.
> 
> On Tue, 26 Sep 2000, Robert Watson wrote:
> > In general, access control for operations within a file system is
> > determined via a recursive VOP_ACCESS() call on the vnode, vis.
> > 
> > VOP_OPEN(vp, ...) -> ufs_open(vp, ...) -> VOP_ACCESS(vp, ...) ->
> >     ufs_access(vp, ...)
> [...]

Perhaps a better question would be "assuming you generalize
the references cited using the proposed VADMIN, how many
references not using VOP_ACCESS() will remain?".

I think the generalization and centralization which took
place are really bad things, since I think administrative
policy is something that I may very well want to set on
_both_ a system basis _and_ on a per-FS basis.

I also think that read-only-ness of an FS is a mount
option having nothing to do with the underlying FS itself.

It seems to me that some of the centralization should, in
fact, be backed out, since it seems that it would preclude
layer recursion in some useful stacking arrangements, much
in the same way a non-NULL VOP did when the "default" layer
was introduced (with no mechanism to provide default
semantics for newly defined VOPs, without a kernel recompile).


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.




From owner-freebsd-arch  Wed Sep 27  0:43: 9 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from InterJet.elischer.org (c421509-a.pinol1.sfba.home.com [24.7.86.9])
	by hub.freebsd.org (Postfix) with ESMTP
	id E8BE437B43F; Wed, 27 Sep 2000 00:43:07 -0700 (PDT)
Received: from InterJet.elischer.org (InterJet.elischer.org [192.168.1.1])
	by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id AAA13542;
	Wed, 27 Sep 2000 00:43:06 -0700 (PDT)
Date: Wed, 27 Sep 2000 00:43:05 -0700 (PDT)
From: Julian Elischer <julian@elischer.org>
To: Mike Smith <msmith@freebsd.org>
Cc: Terry Lambert <tlambert@primenet.com>, arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores 
In-Reply-To: <200009270722.e8R7MkA03362@mass.osd.bsdi.com>
Message-ID: <Pine.BSF.4.10.10009270022290.13148-100000@InterJet.elischer.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> 
> Actually, it's been about a year, plus about four and a half of hot air.
> I see and hear a lot of talk.  Who's doing the real work?  Do we see 
> Peter, John or Tor, for example, in this windage competition. 8)
> 
> Come on folks.  Stick to the topic.
> 


The point that was brought up a little while ago is more germane to the
discussion:

Is there any documentation regarding the interaction between the "lock
manager" and the mutexes? There now appear to be several different
sets of locking code in the kernel, and I'm getting thoroughly confused in
my efforts to try to 'catch up' with what's going on in the SMP
world.




From owner-freebsd-arch  Wed Sep 27  0:46:48 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from mass.osd.bsdi.com (adsl-63-206-90-224.dsl.snfc21.pacbell.net [63.206.90.224])
	by hub.freebsd.org (Postfix) with ESMTP id EF9E937B424
	for <arch@FreeBSD.ORG>; Wed, 27 Sep 2000 00:46:45 -0700 (PDT)
Received: from mass.osd.bsdi.com (localhost [127.0.0.1])
	by mass.osd.bsdi.com (8.11.0/8.9.3) with ESMTP id e8R7luA03450;
	Wed, 27 Sep 2000 00:47:56 -0700 (PDT)
	(envelope-from msmith@mass.osd.bsdi.com)
Message-Id: <200009270747.e8R7luA03450@mass.osd.bsdi.com>
X-Mailer: exmh version 2.1.1 10/15/1999
To: Julian Elischer <julian@elischer.org>
Cc: arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores 
In-reply-to: Your message of "Wed, 27 Sep 2000 00:43:05 PDT."
             <Pine.BSF.4.10.10009270022290.13148-100000@InterJet.elischer.org> 
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Wed, 27 Sep 2000 00:47:56 -0700
From: Mike Smith <msmith@freebsd.org>
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> The point that was brought up a little while ago is more germane to the
> discussion:
> 
> Is there any documentation regarding the interaction between the "lock
> manager" and the mutexes? There now appear to be several different
> sets of locking code in the kernel, and I'm getting thoroughly confused in
> my efforts to try to 'catch up' with what's going on in the SMP
> world.

You're correct, it is germane.

The simple answer is that right now, there is essentially no 
documentation.  There needs to be focussed discussion on the topic you 
raise, and many others.

Note that the lock manager is responsible for broking relatively 
long-term locks on filesystem objects, whilst mutexes are used for 
protecting critical paths or data structures.  They're largely (but not 
entirely) orthogonal.  It would be to your advantage to look at how the 
BSD/OS code has been altered, to get a feel for one possible way of doing 
it.

-- 
... every activity meets with opposition, everyone who acts has his
rivals and unfortunately opponents also.  But not because people want
to be opponents, rather because the tasks and relationships force
people to take different points of view.  [Dr. Fritz Todt]
           V I C T O R Y   N O T   V E N G E A N C E




From owner-freebsd-arch  Wed Sep 27  5:54: 1 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
	by hub.freebsd.org (Postfix) with ESMTP
	id 0A77437B42C; Wed, 27 Sep 2000 05:53:39 -0700 (PDT)
Received: from fledge.watson.org (robert@fledge.pr.watson.org [192.0.2.3])
	by fledge.watson.org (8.9.3/8.9.3) with SMTP id IAA88998;
	Wed, 27 Sep 2000 08:53:06 -0400 (EDT)
	(envelope-from robert@fledge.watson.org)
Date: Wed, 27 Sep 2000 08:53:06 -0400 (EDT)
From: Robert Watson <rwatson@FreeBSD.ORG>
X-Sender: robert@fledge.watson.org
To: Terry Lambert <tlambert@primenet.com>
Cc: Julian Elischer <julian@elischer.org>, freebsd-fs@FreeBSD.ORG,
	freebsd-arch@FreeBSD.ORG, trustedbsd-discuss@TrustedBSD.org
Subject: Re: VOP_ACCESS() and new VADMIN/VATTRIB?
In-Reply-To: <200009270723.AAA20257@usr05.primenet.com>
Message-ID: <Pine.NEB.3.96L.1000927083240.88777C-100000@fledge.watson.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


On Wed, 27 Sep 2000, Terry Lambert wrote:

> Perhaps a better question would be "assuming you generalize
> the references cited using the proposed VADMIN, how many
> references not using VOP_ACCESS() will remain?".

My goal was to identify the application of ownership rights on files and
directories (i.e., rights not granted by the discretionary permission mask
or ACL).  As it turns out, this class of checks maps extremely well onto
the current use of ip->i_uid in the src/sys/ufs/ufs tree, resulting in very
few remaining references.  As I mentioned, the remaining references
generally fall into two categories: first, the quota code, which uses the
file uid to determine how to account for use (index into dqget()) and
when it should or should not report quota limit problems (uprintf() to
warn of quota conditions); and second, the check of whether the current
credential cr_uid matches the owner of the parent directory of a
newly created file when SUIDDIR is enabled.  In general, these are not
access control decisions, but rather strict use of the uid as an
identifier, meaning that abstraction of VADMIN as a category successfully
removes all remaining uid-based authorization code in UFS.

> I think the generalization and centralization which took
> place are really bad things, since I think administrative
> policy is something that I may very well want to set on
> _both_ a system basis _and_ on a per-FS basis.

I think there are both reasonable arguments for and against the
generalization in vaccess().  One important advantage of the
generalization is that it reduces the number of instances of
permission-based authorization checks, allowing easier auditing and
modification of the policy.  For example, when I introduced support for
POSIX.1e capabilities in my source tree, I needed only replace one
instance of suser() rather than dozens scattered through the source tree.
It also makes it easier to audit the use of privilege for correctness and
logging purposes if it can be centrally identified.  There is probably a
decent argument that vaccess(), while a good idea, does not have an API
lending itself to future expansion and flexibility: it directly accepts
file uid, gid, and mode fields, and does not have a policy-related
argument that could be used by the caller to specify how centralized
checking should apply in the context of the current file system.
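
As a rough illustration of the API shape being described, here is a
minimal userland sketch of a centralized check that directly accepts the
file uid, gid, and mode.  The names sketch_cred, sketch_vaccess, and the
SK_* constants are hypothetical and are not the kernel's vaccess()
signature; privilege is reduced to a single uid test, which is an
assumption made only to show where one override point could live.

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical requested-access bits (not the kernel's VREAD/VWRITE/VEXEC). */
#define SK_READ  04
#define SK_WRITE 02
#define SK_EXEC  01

/* Hypothetical, simplified credential: one uid and one gid. */
struct sketch_cred {
	uid_t uid;
	gid_t gid;
};

/*
 * Decide whether `cred` may perform `want` (a mask of SK_* bits) on a file
 * described only by its owner, group, and permission bits.  Returns 0 on
 * success or EACCES on denial.
 */
static int
sketch_vaccess(uid_t file_uid, gid_t file_gid, mode_t file_mode,
    int want, const struct sketch_cred *cred)
{
	int granted;

	/* The single place where privilege overrides the discretionary check. */
	if (cred->uid == 0)
		return (0);

	if (cred->uid == file_uid)
		granted = (file_mode >> 6) & 07;	/* owner bits */
	else if (cred->gid == file_gid)
		granted = (file_mode >> 3) & 07;	/* group bits */
	else
		granted = file_mode & 07;		/* other bits */

	return ((granted & want) == want ? 0 : EACCES);
}
```

Note that nothing in this argument list lets the caller say how the
check should be applied for a particular file system, which is the
limitation raised above.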

> I also think that read-only-ness of an FS is a mount
> option having nothing to do with the underlying FS itself.

However, I think it is also arguable that the read-only-ness of a file
system is not a security property, but in some cases a media property.
That is to say, some file systems should be read-only by virtue of the
underlying storage medium or file system type.  Often, file systems are
mounted read-only for security reasons, which is "different".  vaccess()
abstracts only the generalized security decision, not determination of
per-file system or per-mount options.  I think it would be reasonable to
argue that we should attempt to distinguish security and non-security
mount options, and provide the file system an opportunity to pass the
security mount options to generalized security checking code, and that the
current single read-only flag does not distinguish the security and file
system properties that might be desirable.

That said, I think there's also an argument that you would only process
the read-only property centrally if you were willing to allow super-user
privilege to override that protection.  I.e., vaccess() performs
discretionary and mandatory access checks, with privilege allowing the
overriding of those protections.  If the protections should not be
overridden by appropriate privilege, they should not be processed as
security protections in vaccess(), which would further distinguish
read-only mounting and a read-only security status.

> It seems to me that some of the centralization should, in
> fact, be backed out, since it seems that it would preclude
> layer recursion in some useful stacking arrangements, much
> in the same way a non-NULL VOP did when the "default" layer
> was introduced (with no mechanism to provide default
> semantics for newly defined VOPs, without a kernel recompile).

I'm not sure I follow this argument.  Each file system's VOP_ACCESS() 
implementation invokes vaccess() based on arguments it provides, and only
if it chooses.  For example, only file systems making use of a per-file
uid/gid/mode currently invoke vaccess().  Coda does not invoke it, and in
my ACLs tree, UFS doesn't invoke it, instead calling vaccess_acl() in kern_acl.c.
vaccess() is not a default VOP, rather, a helper function for VOP_ACCESS()
implementations with common security properties.

  VOP_ACCESS() -> ufs_access() -> vaccess()

Given this description, do you believe there would be limits imposed on
stacked file system support?

  Robert N M Watson 

robert@fledge.watson.org              http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services




From owner-freebsd-arch  Wed Sep 27  8:29:39 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from sandman.sandgate.com (sandman.sandgate.com [38.161.139.2])
	by hub.freebsd.org (Postfix) with ESMTP id ACE8637B423
	for <freebsd-arch@FreeBSD.org>; Wed, 27 Sep 2000 08:29:24 -0700 (PDT)
Received: from vectra (a157.COMCAT.COM [207.86.230.157])
	by sandman.sandgate.com (8.10.0/8.10.0) with SMTP id e8RFTRx30267
	for <freebsd-arch@FreeBSD.org>; Wed, 27 Sep 2000 11:29:29 -0400 (EDT)
From: "Sue Wainer" <wainer@sandgate.com>
To: <freebsd-arch@FreeBSD.org>
Subject: Kernel configuration with new device drivers
Date: Wed, 27 Sep 2000 11:29:19 -0400
Message-ID: <NDBBLIBAPKIAHMINJJNFIELCCCAA.wainer@sandgate.com>
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----=_NextPart_000_0012_01C02876.2D0750C0"
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

This is a multi-part message in MIME format.

------=_NextPart_000_0012_01C02876.2D0750C0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit

What is the proper way to specify new driver source modules for a kernel
configuration?  E.g., the config manual page mentions
/sys/i386/conf/files.ERNIE.  How does this file get picked up when running
"config ERNIE"?  Should files.i386 be modified to include it?

------=_NextPart_000_0012_01C02876.2D0750C0--




From owner-freebsd-arch  Wed Sep 27  9:27:11 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP id B137537B424
	for <freebsd-arch@FreeBSD.ORG>; Wed, 27 Sep 2000 09:27:07 -0700 (PDT)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id e8RGR0H19790;
	Wed, 27 Sep 2000 09:27:00 -0700 (PDT)
Date: Wed, 27 Sep 2000 09:27:00 -0700
From: Alfred Perlstein <bright@wintelcom.net>
To: Sue Wainer <wainer@sandgate.com>
Cc: freebsd-arch@FreeBSD.ORG
Subject: Re: Kernel configuration with new device drivers
Message-ID: <20000927092700.X9141@fw.wintelcom.net>
References: <NDBBLIBAPKIAHMINJJNFIELCCCAA.wainer@sandgate.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.4i
In-Reply-To: <NDBBLIBAPKIAHMINJJNFIELCCCAA.wainer@sandgate.com>; from wainer@sandgate.com on Wed, Sep 27, 2000 at 11:29:19AM -0400
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

* Sue Wainer <wainer@sandgate.com> [000927 08:29] wrote:
> What is the proper way to specify new driver source modules for a kernel
> configuration?  E.g., the config manual page mentions
> /sys/i386/conf/files.ERNIE.  How does this file get picked up when running
> "config ERNIE"?  Should files.i386 be modified to include it?

If it's architecture-neutral you want to use:
/usr/src/sys/conf/files

If it's i386-specific you want to use:
/usr/src/sys/conf/files.i386

The reason for the /sys/i386/conf/files.ERNIE file is so you can have
a local modification without it being wiped out by cvsup.

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."




From owner-freebsd-arch  Wed Sep 27 11:54:38 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from wall.polstra.com (rtrwan160.accessone.com [206.213.115.74])
	by hub.freebsd.org (Postfix) with ESMTP id 5A7BA37B422
	for <arch@freebsd.org>; Wed, 27 Sep 2000 11:53:56 -0700 (PDT)
Received: from vashon.polstra.com (vashon.polstra.com [206.213.73.13])
	by wall.polstra.com (8.9.3/8.9.3) with ESMTP id LAA27778;
	Wed, 27 Sep 2000 11:51:54 -0700 (PDT)
	(envelope-from jdp@polstra.com)
From: John Polstra <jdp@polstra.com>
Received: (from jdp@localhost)
	by vashon.polstra.com (8.9.3/8.9.1) id LAA07258;
	Wed, 27 Sep 2000 11:51:54 -0700 (PDT)
	(envelope-from jdp@polstra.com)
Date: Wed, 27 Sep 2000 11:51:54 -0700 (PDT)
Message-Id: <200009271851.LAA07258@vashon.polstra.com>
To: arch@freebsd.org
Reply-To: arch@freebsd.org
Cc: tlambert@primenet.com
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files
In-Reply-To: <200009260249.TAA08391@usr05.primenet.com>
References: <200009260249.TAA08391@usr05.primenet.com>
Organization: Polstra & Co., Seattle, WA
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

In article <200009260249.TAA08391@usr05.primenet.com>,
Terry Lambert  <tlambert@primenet.com> wrote:
> > 
> > That's simply a less efficient implementation of a recursive mutex.
> > Why not use the real thing?
> 
> Now we're back to my original question:
> where in the code is there a perceived need for mutex recursion,

You'll have to ask somebody else.  I never said I knew of any.
A couple of folks said recursive mutexes are evil, and I said I
disagreed, and explained why.

> or is this just a case of the mutex code being bloated for no good
> reason?

No need to bloat it.  It could be a different data type for all I
care.

> > That's a pretty minor difference, and it doesn't have anything to
> > do with whether the mutexes are recursive or not.
>
> No, it has to do with how long they are held.  If they are never
> permitted to be held across recursive function calls -- or better,
> across _ANY_ function calls -- then you can spin on the mutex,
> instead of going to sleep.

With a mutex or a semaphore, you can spin or you can go to sleep
or you can spin for a while and then go to sleep.  That's just an
implementation detail.
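
To make the "spin for a while and then go to sleep" option concrete,
here is a small userland sketch using POSIX threads rather than the
kernel's mtx_enter().  The function name and the SKETCH_SPIN_LIMIT value
are hypothetical; the point is only that spinning versus sleeping can be
a policy inside the acquire routine.

```c
#include <pthread.h>

/* Hypothetical tuning knob: how many times to poll before blocking. */
#define SKETCH_SPIN_LIMIT 1000

/*
 * Acquire `m`, polling with trylock for a bounded number of attempts and
 * then falling back to a blocking lock.  Returns 0 on success or the
 * error from pthread_mutex_lock().
 */
static int
sketch_lock_spin_then_sleep(pthread_mutex_t *m)
{
	for (int i = 0; i < SKETCH_SPIN_LIMIT; i++) {
		if (pthread_mutex_trylock(m) == 0)
			return (0);	/* got it while spinning */
	}
	return (pthread_mutex_lock(m));	/* give up spinning; block */
}
```

The caller sees the same acquire/release interface either way; only the
waiting strategy differs.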

Your desire that a mutex not be held across any function calls seems
arbitrary and would lead to unstructured code.  It's perfectly
legitimate for frobbing the data to involve one or more function
calls.  It isn't always easy to frob. :-)

> So a mutex operation becomes:
> 
> 	1)	Acquire mutex
> 	2)	Frob data protected by mutex
> 	3)	Release mutex
> 
> If someone else needs the same data, they do the same thing.
> 
> If you want to wait until a condition is true, then use a
> condition variable, a semaphore, or something else you can wait
> on in order to be signalled.

I have no argument with any of that.
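
For readers following along, the pattern quoted above can be sketched in
userland C with POSIX threads.  The sketch_counter type and its
functions are hypothetical illustrations, not kernel code: the mutex
covers a short acquire/frob/release section, and waiting for a condition
goes through a condition variable that drops the mutex while asleep.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical shared counter protected by its own mutex. */
struct sketch_counter {
	pthread_mutex_t lock;
	pthread_cond_t  nonzero;	/* signalled when value becomes > 0 */
	int             value;
};

static void
sketch_counter_init(struct sketch_counter *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->nonzero, NULL);
	c->value = 0;
}

/* Short critical section: acquire, frob, release. */
static void
sketch_counter_add(struct sketch_counter *c, int n)
{
	pthread_mutex_lock(&c->lock);
	c->value += n;
	if (c->value > 0)
		pthread_cond_broadcast(&c->nonzero);
	pthread_mutex_unlock(&c->lock);
}

/*
 * Wait until the counter is positive, then take one unit and return the
 * value seen before the decrement.  The mutex is released while waiting.
 */
static int
sketch_counter_take(struct sketch_counter *c)
{
	int v;

	pthread_mutex_lock(&c->lock);
	while (c->value <= 0)
		pthread_cond_wait(&c->nonzero, &c->lock);
	v = c->value--;
	pthread_mutex_unlock(&c->lock);
	return (v);
}
```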

> The idea that you should ever go to sleep waiting for a mutex is
> antithetical to the very idea.

Are you assuming that mutexes spin but semaphores sleep?  You haven't
actually said so explicitly, but I'm starting to think that's your
assumption.

> There are, likewise, few real situations in which you need to be
> able to hold two mutexes; these are degenerate cases, which are
> badly coded.

Well, I wouldn't put it that strongly.  Sometimes you have to maintain
two independent data structures in a consistent manner, and that
involves locking both of them at once.

> Consider that I may have a vnode freelist protected by a mutex, and
> a vnode protected by a mutex.  The perceived need to hold both of
> these simultaneously to put something on the freelist is an artifact
> of wrong-thinking: the pointers used to place a vnode on a freelist
> are the property of the freelist mutex, not the vnode mutex.

Agreed.

> Even if you can make a case for this not being true (e.g. moving
> a vnode from one list to another, using the same pointers in
> the vnode to track state on both lists, which is really just an
> acquire/remove/release/acquire/insert/release operation, where you
> have a window between the removal and the reinsertion), it can be
> handled by strictly controlling the order of operation on mutex
> acquisition, and inverting the release order, and backing off in
> case of conflict.

Yes, that's the standard way of avoiding deadlock.  Though as far as
I can see, the release order doesn't actually matter, since releasing
never blocks anybody.
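
As an illustration of the two techniques just mentioned, here is a
userland sketch with POSIX threads.  The function names are
hypothetical, and ordering by address is only one possible convention
(an assumption of this sketch); real code would more likely document a
fixed order per lock type.  Both functions assume the two mutexes are
distinct.

```c
#include <pthread.h>
#include <sched.h>
#include <stdint.h>

/*
 * Ordered acquisition: always lock the lower-addressed mutex first, so two
 * threads locking the same pair can never each hold one and wait for the
 * other.
 */
static void
sketch_lock_pair_ordered(pthread_mutex_t *a, pthread_mutex_t *b)
{
	if ((uintptr_t)a > (uintptr_t)b) {
		pthread_mutex_t *tmp = a;
		a = b;
		b = tmp;
	}
	pthread_mutex_lock(a);
	pthread_mutex_lock(b);
}

/*
 * Back-off variant: hold the first lock, try the second, and on conflict
 * drop everything and start over instead of waiting while holding a lock.
 */
static void
sketch_lock_pair_backoff(pthread_mutex_t *first, pthread_mutex_t *second)
{
	for (;;) {
		pthread_mutex_lock(first);
		if (pthread_mutex_trylock(second) == 0)
			return;			/* both held */
		pthread_mutex_unlock(first);	/* conflict: back off */
		sched_yield();
	}
}

/* Release both locks; unlocking never blocks, so the order is not critical. */
static void
sketch_unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
	pthread_mutex_unlock(a);
	pthread_mutex_unlock(b);
}
```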

> > If the mutex is recursively held, there is a problem in that some
> > other code grabbed the mutex and expected it to protect the data
> > structure from being changed underfoot.
>
> Worst case, set an "IN_USE" flag on the data in a flags field to
> bar reentry on a given data item.  Best case, fix the broken code.
> The vnode locking code does this today (I'd argue that it's broken
> code).

Well, we are dealing with a lot of legacy code that was never designed
with threads in mind.  I personally believe that the recursive mutex
is a reasonable primitive to deal with it, particularly during the
transition phase.

> > Using a semaphore to do the waiting doesn't solve that problem, or
> > even address it.
>
> It does.  Semaphores can be held across a sleep (a wait).

You must be assuming that mutexes always spin and semaphores don't.  I
don't agree with that assumption, but at least it would explain why we
can't seem to communicate effectively on this topic.

John
-- 
  John Polstra                                               jdp@polstra.com
  John D. Polstra & Co., Inc.                        Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa




From owner-freebsd-arch  Wed Sep 27 12: 9:12 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from wall.polstra.com (rtrwan160.accessone.com [206.213.115.74])
	by hub.freebsd.org (Postfix) with ESMTP id 98EBA37B424
	for <arch@freebsd.org>; Wed, 27 Sep 2000 12:09:09 -0700 (PDT)
Received: from vashon.polstra.com (vashon.polstra.com [206.213.73.13])
	by wall.polstra.com (8.9.3/8.9.3) with ESMTP id MAA27884;
	Wed, 27 Sep 2000 12:09:01 -0700 (PDT)
	(envelope-from jdp@polstra.com)
From: John Polstra <jdp@polstra.com>
Received: (from jdp@localhost)
	by vashon.polstra.com (8.9.3/8.9.1) id MAA07294;
	Wed, 27 Sep 2000 12:09:01 -0700 (PDT)
	(envelope-from jdp@polstra.com)
Date: Wed, 27 Sep 2000 12:09:01 -0700 (PDT)
Message-Id: <200009271909.MAA07294@vashon.polstra.com>
To: arch@freebsd.org
Reply-To: arch@freebsd.org
Cc: eischen@vigrid.com
Subject: Re: Mutexes and semaphores
In-Reply-To: <Pine.SUN.3.91.1000926065812.26612A-100000@pcnet1.pcnet.com>
References: <Pine.SUN.3.91.1000926065812.26612A-100000@pcnet1.pcnet.com>
Organization: Polstra & Co., Seattle, WA
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

In article
<Pine.SUN.3.91.1000926065812.26612A-100000@pcnet1.pcnet.com>, Daniel
Eischen <eischen@vigrid.com> wrote:

> If you absolutely need recursive mutexes, then roll your own and
> keep the base mutex simple.  This is trivial to do and makes the
> base mutex more efficient without the need to check for recursive
> ownership.

I think it would make sense to make recursive mutexes a separate
type, so they don't complicate the non-recursive ones.  But the "roll
your own" idea would work against eventually getting rid of recursive
mutexes entirely.  If they are implemented ad hoc in various places,
it will be hard to find them all later.  Better to have a standard
implementation that's easy to search for.
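
To show roughly what a separate recursive type layered on a simple mutex
might look like, here is a userland sketch with POSIX threads.  The
sketch_rmtx names are hypothetical and this is not the kernel's mtx
code; it tracks an owner and a nesting depth under a plain,
non-recursive guard mutex, so the base mutex never needs to check for
recursive ownership.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical recursive lock built from a plain mutex plus a condvar. */
struct sketch_rmtx {
	pthread_mutex_t guard;		/* simple mutex, held only briefly */
	pthread_cond_t  released;	/* signalled when depth drops to 0 */
	pthread_t       owner;		/* meaningful only while depth > 0 */
	unsigned        depth;		/* nested acquisitions by the owner */
};

static void
sketch_rmtx_init(struct sketch_rmtx *r)
{
	pthread_mutex_init(&r->guard, NULL);
	pthread_cond_init(&r->released, NULL);
	r->depth = 0;
}

static void
sketch_rmtx_enter(struct sketch_rmtx *r)
{
	pthread_t self = pthread_self();

	pthread_mutex_lock(&r->guard);
	if (r->depth > 0 && pthread_equal(r->owner, self)) {
		r->depth++;			/* recursive acquisition */
	} else {
		while (r->depth > 0)		/* held by another thread */
			pthread_cond_wait(&r->released, &r->guard);
		r->owner = self;
		r->depth = 1;
	}
	pthread_mutex_unlock(&r->guard);
}

/* Returns 0 on success, or EPERM if the caller does not own the lock. */
static int
sketch_rmtx_exit(struct sketch_rmtx *r)
{
	int error = 0;

	pthread_mutex_lock(&r->guard);
	if (r->depth == 0 || !pthread_equal(r->owner, pthread_self()))
		error = EPERM;
	else if (--r->depth == 0)
		pthread_cond_signal(&r->released);
	pthread_mutex_unlock(&r->guard);
	return (error);
}
```

Because the type has its own name, every use of it can be found with a
simple search, which is the property argued for above.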

> Mutexes should be held for very short amounts of time, and it should
> be apparent in the encompassing code where the mutex is taken and
> where it is released.  In your example, what do you do in the case
> of abnormal exits from recursively called code?  It makes it far
> easier to handle this situation if you roll your own mutex
> and keep track of the ref count and owner yourself.  If you don't,
> you'll end up adding mtx_exit_and_clear_refcount().

Yes, that's a good point.

> My main concern is not to eliminate recursive mutexes, though I
> still think they should go.  I would like to see all barriers to
> eliminating the flags/options to mtx_enter() and mtx_exit() removed.
> The current form of the mutex routines is not an API/ABI we should
> be using.

I'm not too thrilled with the API myself.

John
-- 
  John Polstra                                               jdp@polstra.com
  John D. Polstra & Co., Inc.                        Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa




From owner-freebsd-arch  Wed Sep 27 12:31:23 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP id F051D37B423
	for <arch@FreeBSD.ORG>; Wed, 27 Sep 2000 12:31:07 -0700 (PDT)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id e8RJV7j25841
	for arch@FreeBSD.ORG; Wed, 27 Sep 2000 12:31:07 -0700 (PDT)
Date: Wed, 27 Sep 2000 12:31:07 -0700
From: Alfred Perlstein <bright@wintelcom.net>
To: arch@FreeBSD.ORG
Subject: Re: Mutexes and semaphores
Message-ID: <20000927123107.A9141@fw.wintelcom.net>
References: <Pine.SUN.3.91.1000926065812.26612A-100000@pcnet1.pcnet.com> <200009271909.MAA07294@vashon.polstra.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.4i
In-Reply-To: <200009271909.MAA07294@vashon.polstra.com>; from jdp@polstra.com on Wed, Sep 27, 2000 at 12:09:01PM -0700
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

* John Polstra <jdp@polstra.com> [000927 12:09] wrote:
> In article
> <Pine.SUN.3.91.1000926065812.26612A-100000@pcnet1.pcnet.com>, Daniel
> Eischen <eischen@vigrid.com> wrote:
> 
> > If you absolutely need recursive mutexes, then roll your own and
> > keep the base mutex simple.  This is trivial to do and makes the
> > base mutex more efficient without the need to check for recursive
> > ownership.
> 
> I think it would make sense to make recursive mutexes a separate
> type, so they don't complicate the non-recursive ones.  But the "roll
> your own" idea would work against eventually getting rid of recursive
> mutexes entirely.  If they are implemented ad hoc in various places,
> it will be hard to find them all later.  Better to have a standard
> implementation that's easy to search for.

As I said earlier, when you find some code that really needs one
in order to make a subsystem you're working on mpsafe we'll have
a short discussion to make sure it's really needed and if it is,
then we'll do it.

Right now there's no point in this discussion.

> 
> I'm not too thrilled with the API myself.

I think you should begin to use it before hating it.  I didn't like it
at first, but it's certainly usable.

-Alfred




From owner-freebsd-arch  Wed Sep 27 14: 0:42 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from pcnet1.pcnet.com (pcnet1.pcnet.com [204.213.232.3])
	by hub.freebsd.org (Postfix) with ESMTP id A441237B43E
	for <arch@freebsd.org>; Wed, 27 Sep 2000 14:00:38 -0700 (PDT)
Received: (from eischen@localhost)
	by pcnet1.pcnet.com (8.8.7/PCNet) id RAA00068;
	Wed, 27 Sep 2000 17:00:21 -0400 (EDT)
Date: Wed, 27 Sep 2000 17:00:20 -0400 (EDT)
From: Daniel Eischen <eischen@vigrid.com>
To: arch@freebsd.org
Cc: arch@freebsd.org
Subject: Re: Mutexes and semaphores
In-Reply-To: <200009271909.MAA07294@vashon.polstra.com>
Message-ID: <Pine.SUN.3.91.1000927163448.26328A-100000@pcnet1.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Wed, 27 Sep 2000, John Polstra wrote:
> In article
> <Pine.SUN.3.91.1000926065812.26612A-100000@pcnet1.pcnet.com>, Daniel
> Eischen <eischen@vigrid.com> wrote:
> 
> > If you absolutely need recursive mutexes, then roll your own and
> > keep the base mutex simple.  This is trivial to do and makes the
> > base mutex more efficient without the need to check for recursive
> > ownership.
> 
> I think it would make sense to make recursive mutexes a separate
> type, so they don't complicate the non-recursive ones.  But the "roll
> your own" idea would work against eventually getting rid of recursive
> mutexes entirely.  If they are implemented ad hoc in various places,
> it will be hard to find them all later.  Better to have a standard
> implementation that's easy to search for.

I'll agree to this; I've suggested it before.  But I'd like to go
one step further and not make them part of our official API.  State
that they are subject to change/removal, perhaps complain loudly
when compiled with -DKLD_API (-DKLD_MODULE ?) or something.

-- 
Dan Eischen




From owner-freebsd-arch  Wed Sep 27 14:52:45 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from smtp05.primenet.com (smtp05.primenet.com [206.165.6.135])
	by hub.freebsd.org (Postfix) with ESMTP id 8C15137B422
	for <arch@freebsd.org>; Wed, 27 Sep 2000 14:52:24 -0700 (PDT)
Received: (from daemon@localhost)
	by smtp05.primenet.com (8.9.3/8.9.3) id OAA03115;
	Wed, 27 Sep 2000 14:52:39 -0700 (MST)
Received: from usr02.primenet.com(206.165.6.202)
 via SMTP by smtp05.primenet.com, id smtpdAAA3AaG3f; Wed Sep 27 14:52:26 2000
Received: (from tlambert@localhost)
	by usr02.primenet.com (8.8.5/8.8.5) id OAA27175;
	Wed, 27 Sep 2000 14:52:07 -0700 (MST)
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <200009272152.OAA27175@usr02.primenet.com>
Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files
To: arch@freebsd.org
Date: Wed, 27 Sep 2000 21:52:06 +0000 (GMT)
Cc: tlambert@primenet.com
In-Reply-To: <200009271851.LAA07258@vashon.polstra.com> from "John Polstra" at Sep 27, 2000 11:51:54 AM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

John Polstra writes:
> Terry Lambert  <tlambert@primenet.com> wrote:
> > The idea that you should ever go to sleep waiting for a mutex is
> > antithetical to the very idea.
> 
> Are you assuming that mutexes spin but semaphores sleep?  You haven't
> actually said so explicitly, but I'm starting to think that's your
> assumption.

[ ... ]

> > It does.  Semaphores can be held across a sleep (a wait).
> 
> You must be assuming that mutexes always spin and semaphores don't.  I
> don't agree with that assumption, but at least it would explain why we
> can't seem to communicate effectively on this topic.

Yes, this is exactly my assumption.  Here's why:

You have to ask "why do I use a mutex?"; the answer is "to protect
data".

Then you have to ask "why do I need to protect data?"; the answer
is more complicated, but boils down to "to prevent concurrent
access in various and sundry situations which may arise".

To my mind, there are only four cases which may result in an
attempt at concurrent access:

1)	SMP; concurrent access is attempted by another
	processor

2)	Kernel reentrancy as a result of an exception (e.g.
	a page fault)

3)	Kernel reentrancy as a result of an interrupt

4)	Kernel preemption, as part of a Real Time subsystem


In #1, it is not only acceptable to spin, it is preferred, since
we know that (A) a mutex is only held for a very short period of
time, and (B) we don't want to pay the penalty for going to
sleep, since we are talking about stalling a single processor for
a very short period of time, and we may have 8 processors, 7 of
which would have to pay the penalty for a signal delivery, if a
wakeup occurred.
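
As a rough illustration of the spin-only case in #1, here is a
minimal sketch.  It is not the FreeBSD mtx code: the spin_mtx_*
names are made up for this example, and C11 atomics stand in for
whatever primitive the kernel would really use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spin-only mutex; names are illustrative only. */
struct spin_mtx {
	atomic_bool locked;
};

static void
spin_mtx_init(struct spin_mtx *m)
{
	atomic_init(&m->locked, false);
}

/* One attempt; returns true if the lock was taken. */
static bool
spin_mtx_try(struct spin_mtx *m)
{
	return (!atomic_exchange_explicit(&m->locked, true,
	    memory_order_acquire));
}

/* Spin until the lock is ours; never sleeps, never gives up. */
static void
spin_mtx_enter(struct spin_mtx *m)
{
	while (!spin_mtx_try(m))
		;	/* busy-wait: the holder is expected to be quick */
}

static void
spin_mtx_exit(struct spin_mtx *m)
{
	atomic_store_explicit(&m->locked, false, memory_order_release);
}
```

A caller brackets the short critical section with spin_mtx_enter()
and spin_mtx_exit(); nothing here ever sleeps or issues a wakeup.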

In #2 and #3, the degenerate cases are identical.  One could
conceive of a shared PCI interrupt being handled by different
processors, given "bottom-end single threading" and "top-end
multithreading".  But devices own their own resources; only
if we were able to multithread our "bottom-end", the actual
hardware event handler that does nothing other than handle the
hardware event to the point that it is no longer needed (this
implies reenabling the interrupt outside of interrupt context),
would there be a contended resource issue to resolve.  Further,
when operating in interrupt or exception context, we are running
in the context that was active at the time of the event.  This
bodes ill for recursive mutex acquisition, since it means that
we could change data out from under the active context, despite
it holding the mutex.  Thus we can not use a mutex to contend
resource between interrupt, exception, and normal contexts, if
by ownership, we mean one of the three.  NT resolves this by
turning interrupt and exception contexts into heavy-weight
contexts, on a par with a process (different page mapping, etc.).
If FreeBSD does not do the same, then there MUST be no resource
contention between these domains.  If it does the same, then a
modification is required: you still spin, but you do an explicit
yield to the mutex holder during each cycle through the spin; so
we are left needing an owner that is unique between all contexts,
not simply between kernel threads.  This would only end up being
useful on systems where the I/O bus was never contended between
drivers in a non-transparent way (e.g. all data movement is
based on bus mastered DMA, with hardware contention, and interrupt
signalling by the host when a DMA should be started, and by
the device, when host processing should be started).

In #4, we have the need to sleep, since it is possible that you
will be put to sleep involuntarily while holding a mutex.  The
easiest way to handle this is to merely delay the sleep until
all mutexes held by the unique owner have been relinquished.
This implies a condition variable in the process structure,
an overall mutex hold count, and another mutex to protect the
hold count and the condition variable, associated with the
context structure unique to the identity "owner".  This will
permit priority lending that lasts only for the duration of the
held contended resource(s).  Use of this complicates matters,
but the benefit of RT support that would come with it is high,
if RT is what floats your boat.  A system utilizing this
approach could be conditionally compiled as "RT" or "non-RT",
using macro substitution, so the hit need not be taken in a
"GENERIC" kernel.
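
The bookkeeping for #4 can be modelled in a few lines.  This is a
single-threaded sketch only: the owner_ctx structure and the ctx_*
names are invented for the example, and the condition variable and
the mutex that would protect the hold count are left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-owner context; field names are illustrative. */
struct owner_ctx {
	unsigned mtx_held;	/* overall mutex hold count */
	bool preempt_pending;	/* preemption requested while holding */
	unsigned switches;	/* context switches actually taken */
};

/* Called after any mutex is acquired by this owner. */
static void
ctx_mtx_acquired(struct owner_ctx *c)
{
	c->mtx_held++;
}

/* Scheduler side: returns true if the switch happened immediately. */
static bool
ctx_preempt(struct owner_ctx *c)
{
	if (c->mtx_held > 0) {
		c->preempt_pending = true;	/* delay the sleep */
		return (false);
	}
	c->switches++;
	return (true);
}

/* Called after any mutex is released by this owner. */
static void
ctx_mtx_released(struct owner_ctx *c)
{
	assert(c->mtx_held > 0);
	if (--c->mtx_held == 0 && c->preempt_pending) {
		c->preempt_pending = false;
		c->switches++;	/* the deferred sleep happens now */
	}
}
```

The point of the model is only that a preemption request made while
the hold count is nonzero is remembered and honoured at the last
release, not before.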

- So for the most part, unless we are implementing RT and a
separate context for each potential concurrent interrupt to
the host, and each concurrent exceptional condition (I see
a need for a minimum of 2, for the F00F bug, and for 386
kernel page write fault processing), mutexes should spin.


> > Even if you can make a case for this not being true (e.g. moving
> > a vnode from one list to another, using the same pointers in
> > the vnode to track state on both lists, which is really just an
> > acquire/remove/release/acquire/insert/release operation, where you
> > have a window between the removal and the reinsertion), it can be
> > handled by strictly controlling the order of operation on mutex
> > acquisition, and inverting the release order, and backing off in
> > case of conflict.
> 
> Yes, that's the standard way of avoiding deadlock.  Though as far as
> I can see, the release order doesn't actually matter, since releasing
> never blocks anybody.

The inversion "acquire A/acquire B/process/release A/release B"
ensures that the acquisition can occur concurrently in a forward
path, in the face of interrupt/exception/kernel preemption.  It
prevents a starvation deadlock.

In the RT priority lending case (to support kernel preemption),
it's possible to have a deadly embrace deadlock without this, as
well, based on a low priority task "hogging" a contended resource
using a two mutex strategy, and therefore raising its own priority
(this isn't a security issue, since user space mutex code is not
an externalization of kernel space mutex code).  By permitting the
lending context to acquire A and go back into the spin/yield loop
for resource B, you preclude this.

For constructs like:

	acquire A/diddle A/release A
	acquire B/diddle B/release B
	acquire A/diddle A/release A

You would have to recode them as:

	acquire A/diddle A
	acquire B/diddle B
	diddle A/release A
	release B

As previously pointed out, this is most likely with list
manipulation involving variables shared between lists that
are protected by different mutexes.
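
For the list case, the recoded sequence might look like the sketch
below.  Everything in it is hypothetical (the lock, list and node
types, and list_move_head()); the only thing taken from the text is
the order: acquire A, diddle A, acquire B, diddle B, release A,
release B.  It assumes every caller takes A's lock before B's.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Tiny spin lock so the example stands alone. */
struct lock {
	atomic_bool held;
};

static void
lock_enter(struct lock *l)
{
	while (atomic_exchange_explicit(&l->held, true, memory_order_acquire))
		;
}

static void
lock_exit(struct lock *l)
{
	atomic_store_explicit(&l->held, false, memory_order_release);
}

/* The same "next" pointer tracks membership on either list. */
struct node {
	struct node *next;
	int id;
};

struct list {
	struct lock lk;
	struct node *head;
};

static void
list_init(struct list *l)
{
	atomic_init(&l->lk.held, false);
	l->head = NULL;
}

/* Move the head of list a to list b; returns the node or NULL. */
static struct node *
list_move_head(struct list *a, struct list *b)
{
	struct node *n;

	lock_enter(&a->lk);		/* acquire A */
	n = a->head;			/* diddle A: unlink */
	if (n != NULL)
		a->head = n->next;
	lock_enter(&b->lk);		/* acquire B, still holding A */
	if (n != NULL) {		/* diddle B: link */
		n->next = b->head;
		b->head = n;
	}
	lock_exit(&a->lk);		/* release A */
	lock_exit(&b->lk);		/* release B */
	return (n);
}
```

Because A is still held when B is taken, there is no window in which
the node is on neither list and visible to another context.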


> > > If the mutex is recursively held, there is a problem in that some
> > > other code grabbed the mutex and expected it to protect the data
> > > structure from being changed underfoot.
> >
> > Worst case, set an "IN_USE" flag on the data in a flags field to
> > bar reentry on a given data item.  Best case, fix the broken code.
> > The vnode locking code does this today (I'd argue that it's broken
> > code).
> 
> Well, we are dealing with a lot of legacy code that was never designed
> with threads in mind.  I personally believe that the recursive mutex
> is a reasonable primitive to deal with it, particularly during the
> transition phase.

Well, we have this flag _now_ in the vnode code, so it's not
like anything is being saved by not using it.  I really don't
see much code that has this problem.  There's the scheduler,
the VM system, and some process structure stuff having to do
with fork/exec/_exit, but that seems to be it, after the vnode
cruft.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Wed Sep 27 15: 0:50 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from smtp05.primenet.com (smtp05.primenet.com [206.165.6.135])
	by hub.freebsd.org (Postfix) with ESMTP id 285AF37B42C
	for <arch@FreeBSD.ORG>; Wed, 27 Sep 2000 15:00:46 -0700 (PDT)
Received: (from daemon@localhost)
	by smtp05.primenet.com (8.9.3/8.9.3) id PAA07262;
	Wed, 27 Sep 2000 15:01:03 -0700 (MST)
Received: from usr02.primenet.com(206.165.6.202)
 via SMTP by smtp05.primenet.com, id smtpdAAARCaaZn; Wed Sep 27 15:00:42 2000
Received: (from tlambert@localhost)
	by usr02.primenet.com (8.8.5/8.8.5) id PAA27464;
	Wed, 27 Sep 2000 15:00:22 -0700 (MST)
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <200009272200.PAA27464@usr02.primenet.com>
Subject: Re: Mutexes and semaphores
To: eischen@vigrid.com (Daniel Eischen)
Date: Wed, 27 Sep 2000 22:00:22 +0000 (GMT)
Cc: arch@FreeBSD.ORG
In-Reply-To: <Pine.SUN.3.91.1000927163448.26328A-100000@pcnet1.pcnet.com> from "Daniel Eischen" at Sep 27, 2000 05:00:20 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Dan Eischen wrote:
> On Wed, 27 Sep 2000, John Polstra wrote:
> > > If you absolutely need recursive mutexes, then roll your own and
> > > keep the base mutex simple.  This is trivial to do and makes the
> > > base mutex more efficient without the need to check for recursive
> > > ownership.
> > 
> > I think it would make sense to make recursive mutexes a separate
> > type, so they don't complicate the non-recursive ones.  But the "roll
> > your own" idea would work against eventually getting rid of recursive
> > mutexes entirely.  If they are implemented ad hoc in various places,
> > it will be hard to find them all later.  Better to have a standard
> > implementation that's easy to search for.
> 
> I'll agree to this; I've suggested it before.  But I'd like to go
> one step further and not make them part of our official API.  State
> that they are subject to change/removal, perhaps complain loudly
> when compiled with -DKLD_API (-DKLD_MODULE ?) or something.

I'll third this approach.  I personally don't see where they would
be useful, but de-grunging the API argument list and putting them
in a separate API (mutex_legacy()?) would be vastly preferable.

I like the idea of being able to grep for "mutex_legacy" to be
able to distinguish between code that has been hacked for legacy
reasons vs. code that has been intentionally made SMP safe.

I'd like to see an option in the config file for "LEGACY_MUTEX"
support, and that it be left out until someone actually uses a
recursive mutex.

Both of these steps would ensure that the code is not dropped on
the floor, making it known in no uncertain terms that mutexes of
this type are strongly discouraged.

This fits well with Alfred P's arguments on planning push-down
of locks, instead of merely hacking it.
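
To make the "separate API" idea concrete, here is one way a
recursive wrapper could sit on top of a plain, non-recursive base
lock.  This is a sketch under assumptions, not a proposed interface:
the mutex_legacy_* names follow the suggestion above but do not
exist anywhere, the base lock is a bare C11 atomic, and the caller
passes an integer "self" as a stand-in for its owner identity.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/* Hypothetical recursive mutex layered on a simple base lock. */
struct mutex_legacy {
	atomic_bool base;	/* the simple, non-recursive lock */
	atomic_int owner;	/* NO_OWNER when base is not held */
	unsigned depth;		/* only touched by the owner */
};

static void
mutex_legacy_init(struct mutex_legacy *m)
{
	atomic_init(&m->base, false);
	atomic_init(&m->owner, NO_OWNER);
	m->depth = 0;
}

static void
mutex_legacy_enter(struct mutex_legacy *m, int self)
{
	/* Only the owner can ever see its own id here. */
	if (atomic_load_explicit(&m->owner, memory_order_relaxed) == self) {
		m->depth++;	/* already ours: just count */
		return;
	}
	while (atomic_exchange_explicit(&m->base, true, memory_order_acquire))
		;
	atomic_store_explicit(&m->owner, self, memory_order_relaxed);
	m->depth = 1;
}

static void
mutex_legacy_exit(struct mutex_legacy *m, int self)
{
	assert(atomic_load(&m->owner) == self && m->depth > 0);
	if (--m->depth == 0) {
		atomic_store_explicit(&m->owner, NO_OWNER,
		    memory_order_relaxed);
		atomic_store_explicit(&m->base, false, memory_order_release);
	}
}
```

The base lock's fast path carries no owner check at all; only code
that opts into the legacy wrapper pays for it, and every such user
can be found by searching for the one name.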


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.




From owner-freebsd-arch  Wed Sep 27 19:26:33 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from falla.videotron.net (falla.videotron.net [205.151.222.106])
	by hub.freebsd.org (Postfix) with ESMTP id 7C04B37B423
	for <freebsd-arch@freebsd.org>; Wed, 27 Sep 2000 19:26:27 -0700 (PDT)
Received: from modemcable136.203-201-24.mtl.mc.videotron.ca ([24.201.203.136])
 by falla.videotron.net (Sun Internet Mail Server sims.3.5.1999.12.14.10.29.p8)
 with ESMTP id <0G1K00020S3Z9K@falla.videotron.net> for freebsd-arch@freebsd.org; Wed, 27 Sep 2000 22:26:23 -0400 (EDT)
Date: Wed, 27 Sep 2000 22:30:11 -0400 (EDT)
From: Bosko Milekic <bmilekic@technokratis.com>
Subject: spinlocks and acquire pseudo-priority
To: freebsd-arch@freebsd.org
Message-id: <Pine.BSF.4.21.0009272218510.3246-100000@jehovah.technokratis.com>
MIME-version: 1.0
Content-type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


  	I cannot quantify how likely the following is... but logically, it
  should be more probable when there are more CPUs (at LEAST 3).
  
 	Say a thread on processor 1 (A) grabs mutex Y, which happens to be a
  spin-only type mutex.

  	Say thread on processor 2 (B) attempts to grab mutex Y, but fails and
  starts spinning in mtx_enter_hard().

  	Now say thread on processor 3 (C) attempts to grab mutex Y and makes
  it to mtx_enter() -- at this very instant before C is about to try it the
  "easy way" and do its cmpxchgl, A releases mutex Y. Now B is still
  spinning; in fact, B is in mtx_enter_hard in the while() loop, it had
  just checked whether the lock was still owned, and it was, so it's just
  iterating again and incrementing the loop index variable. Before B goes
  to the top of the loop and hits the comparison statement again (to see
  whether Y is still owned), C does its cmpxchgl and grabs the lock easily,
  without any issues whatsoever. B continues to spin and eventually the
  loop index reaches the "tolerated" values and there's a panic().

  	Please also note that even if B hits the top of the while loop and
  decides that the mutex is no longer owned, so it hits the top of the
  infinite loop and tries to grab it again, just before it grabs it, it
  could already be had by C. This isn't TOO much of a problem, because the
  probability is low, but grows with the number of processors. The problem
  I see is that the index i is never reset to zero and may eventually hit
  the tolerated values and trigger a panic.
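
  	The loop shape being described is roughly the following.  This is
  a reconstruction from the description above only, not the actual
  mtx_enter_hard() source: spin_enter_hard(), SPIN_LIMIT and the struct
  are made-up names, and a false return stands in for the panic().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SPIN_LIMIT 1000u	/* hypothetical "tolerated" value */

struct spin_mtx {
	atomic_int owner;	/* 0 = unowned */
};

/* Returns true once acquired, false where the kernel would panic(). */
static bool
spin_enter_hard(struct spin_mtx *m, int self)
{
	unsigned i = 0;
	int unowned;

	for (;;) {
		/* The "easy way": a single compare-and-exchange. */
		unowned = 0;
		if (atomic_compare_exchange_strong(&m->owner, &unowned, self))
			return (true);
		/* Wait until the lock looks free, counting each pass. */
		while (atomic_load(&m->owner) != 0) {
			if (i++ >= SPIN_LIMIT)
				return (false);	/* stand-in for panic() */
		}
		/*
		 * i is not reset here, so passes spent on earlier lost
		 * races keep counting toward SPIN_LIMIT.
		 */
	}
}
```

  	With this shape, a thread that keeps losing the race to newcomers
  accumulates i across every lost round, which is the case in question.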

	Is there something I'm leaving out/forgetting?

  Thanks,
  Bosko Milekic
  bmilekic@technokratis.com




From owner-freebsd-arch  Wed Sep 27 23: 5:52 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP id B744E37B42C
	for <freebsd-arch@FreeBSD.ORG>; Wed, 27 Sep 2000 23:05:40 -0700 (PDT)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id e8S65dN15596;
	Wed, 27 Sep 2000 23:05:39 -0700 (PDT)
Date: Wed, 27 Sep 2000 23:05:39 -0700
From: Alfred Perlstein <bright@wintelcom.net>
To: Bosko Milekic <bmilekic@technokratis.com>
Cc: freebsd-arch@FreeBSD.ORG
Subject: Re: spinlocks and acquire pseudo-priority
Message-ID: <20000927230538.I7553@fw.wintelcom.net>
References: <Pine.BSF.4.21.0009272218510.3246-100000@jehovah.technokratis.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.4i
In-Reply-To: <Pine.BSF.4.21.0009272218510.3246-100000@jehovah.technokratis.com>; from bmilekic@technokratis.com on Wed, Sep 27, 2000 at 10:30:11PM -0400
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

* Bosko Milekic <bmilekic@technokratis.com> [000927 19:26] wrote:
> 
>   	I cannot quantify how likely the following is... but logically, it
>   should be more probable when there are more CPUs (at LEAST 3).
>   
>  	Say a thread on processor 1 (A) grabs mutex Y, which happens to be a
>   spin-only type mutex.
> 
>   	Say thread on processor 2 (B) attempts to grab mutex Y, but fails and
>   starts spinning in mtx_enter_hard().
> 
>   	Now say thread on processor 3 (C) attempts to grab mutex Y and makes
>   it to mtx_enter() -- at this very instant before C is about to try it the
>   "easy way" and do its cmpxchgl, A releases mutex Y. Now B is still
>   spinning; in fact, B is in mtx_enter_hard in the while() loop, it had
>   just checked whether the lock was still owned, and it was, so it's just
>   iterating again and incrementing the loop index variable. Before B goes
>   to the top of the loop and hits the comparison statement again (to see
>   whether Y is still owned), C does its cmpxchgl and grabs the lock easily,
>   without any issues whatsoever. B continues to spin and eventually the
>   loop index reaches the "tolerated" values and there's a panic().
> 
>   	Please also note that even if B hits the top of the while loop and
>   decides that the mutex is no longer owned, so it hits the top of the
>   infinite loop and tries to grab it again, just before it grabs it, it
>   could already be had by C. This isn't TOO much of a problem, because the
>   probability is low, but grows with the number of processors. The problem
>   I see is that the index i is never reset to zero and may eventually hit
>   the tolerated values and trigger a panic.
> 
> 	Is there something I'm leaving out/forgetting?

It seems like a possibility, however a spinlock being that contested is
most likely a problem and needs to be fixed.

It may be a good idea to examine the lock right before panicking to
see if the lock state has changed.

It may also be a good idea to alternate between a hard spin and a
DELAY loop rather than back off so much.
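
Something along these lines, as a sketch only.  The names are made
up, delay_stub() is a stand-in for DELAY() that just counts calls,
and returning false stands in for the panic:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define HARD_SPINS 100u		/* hypothetical length of a hard spin */

struct spin_mtx {
	atomic_int owner;	/* 0 = unowned */
};

static unsigned delay_calls;

/* Stand-in for the kernel's DELAY(); it only counts calls here. */
static void
delay_stub(unsigned usec)
{
	(void)usec;
	delay_calls++;
}

/* One acquisition attempt; returns true on success. */
static bool
spin_try(struct spin_mtx *m, int self)
{
	int unowned = 0;

	return (atomic_compare_exchange_strong(&m->owner, &unowned, self));
}

/* Alternate hard spins and delays; look once more before giving up. */
static bool
spin_enter_patient(struct spin_mtx *m, int self, unsigned rounds)
{
	unsigned r, i;

	for (r = 0; r < rounds; r++) {
		for (i = 0; i < HARD_SPINS; i++)
			if (spin_try(m, self))
				return (true);
		delay_stub(1);
	}
	/* Examine the lock one last time right before "panicking". */
	return (spin_try(m, self));
}
```

The last spin_try() is the "examine the lock right before
panicking" step; it narrows the window but does not close it.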

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."




From owner-freebsd-arch  Thu Sep 28  1:29:49 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from smtp03.primenet.com (smtp03.primenet.com [206.165.6.133])
	by hub.freebsd.org (Postfix) with ESMTP id 1C04937B422
	for <freebsd-arch@FreeBSD.ORG>; Thu, 28 Sep 2000 01:29:47 -0700 (PDT)
Received: (from daemon@localhost)
	by smtp03.primenet.com (8.9.3/8.9.3) id BAA16382;
	Thu, 28 Sep 2000 01:28:20 -0700 (MST)
Received: from usr02.primenet.com(206.165.6.202)
 via SMTP by smtp03.primenet.com, id smtpdAAAa7aO9F; Thu Sep 28 01:28:14 2000
Received: (from tlambert@localhost)
	by usr02.primenet.com (8.8.5/8.8.5) id BAA11680;
	Thu, 28 Sep 2000 01:29:39 -0700 (MST)
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <200009280829.BAA11680@usr02.primenet.com>
Subject: Re: spinlocks and acquire pseudo-priority
To: bmilekic@technokratis.com (Bosko Milekic)
Date: Thu, 28 Sep 2000 08:29:39 +0000 (GMT)
Cc: freebsd-arch@FreeBSD.ORG
In-Reply-To: <Pine.BSF.4.21.0009272218510.3246-100000@jehovah.technokratis.com> from "Bosko Milekic" at Sep 27, 2000 10:30:11 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

>   B continues to spin and eventually the
>   loop index reaches the "tolerated" values and there's a panic().
> 
>   	Please also note that even if B hits the top of the while loop and
>   decides that the mutex is no longer owned, so it hits the top of the
>   infinite loop and tries to grab it again, just before it grabs it, it
>   could already be had by C. This isn't TOO much of a problem, because the
>   probability is low, but grows with the number of processors. The problem
>   I see is that the index i is never reset to zero and may eventually hit
>   the tolerated values and trigger a panic.
> 
> 	Is there something I'm leaving out/forgetting?

You are talking about non-deadlock starvation here.

The simple answer is "use "for(;;)" instead of something with a
loop index".

The fact is, there is just as much probability of C losing a
race with B for a contended resource formerly held by A under
normal circumstances, as there is for it losing because of the
conditions which you describe.

The answer is: it doesn't matter -- you only ever use a spinlock
to do one of two things:

1)	Eat the overhead of a heavyweight non-spinning lock

2)	Contend a resource which will be available in a small
	amount of time anyway, so it doesn't matter whether
	you get it first or second

If you really cared about FIFO, FILO, or prioritization or some
other policy based ordering on lock acquisition, you would not
use spinlocks; you would use turnstiles and "wake one", or you
would use some other policy cognizant mechanism for doing the
granting.

FWIW, this means you wouldn't use a mutex, either.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.




From owner-freebsd-arch  Thu Sep 28  5:42:35 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from falla.videotron.net (falla.videotron.net [205.151.222.106])
	by hub.freebsd.org (Postfix) with ESMTP id 25E4037B422
	for <freebsd-arch@FreeBSD.ORG>; Thu, 28 Sep 2000 05:42:29 -0700 (PDT)
Received: from modemcable136.203-201-24.mtl.mc.videotron.ca ([24.201.203.136])
 by falla.videotron.net (Sun Internet Mail Server sims.3.5.1999.12.14.10.29.p8)
 with ESMTP id <0G1L00H1GKMQPP@falla.videotron.net> for freebsd-arch@FreeBSD.ORG; Thu, 28 Sep 2000 08:42:26 -0400 (EDT)
Date: Thu, 28 Sep 2000 08:46:15 -0400 (EDT)
From: Bosko Milekic <bmilekic@technokratis.com>
Subject: Re: spinlocks and acquire pseudo-priority
In-reply-to: <20000927230538.I7553@fw.wintelcom.net>
To: Alfred Perlstein <bright@wintelcom.net>
Cc: freebsd-arch@FreeBSD.ORG
Message-id: <Pine.BSF.4.21.0009280844130.3999-100000@jehovah.technokratis.com>
MIME-version: 1.0
Content-type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


On Wed, 27 Sep 2000, Alfred Perlstein wrote:

> It seems like a possibility, however a spinlock being that contested is
> most likely a problem and needs to be fixed.

	Not necessarily. It may occur in a big resource starvation where
  many threads just end up in msleep(), or similar, and many others call
  wakeup().

> It may be a good idea to examine the lock right before panicking to
> see if the lock state has changed.

	Yeah, I agree, but it may still happen.... although you make it less
  likely by doing that.

> It may also be a good idea to alternate between a hard spin and a
> DELAY loop rather than back off so much.
> 
> -- 
> -Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
> "I have the heart of a child; I keep it in a jar on my desk."

  Bosko Milekic
  bmilekic@technokratis.com




From owner-freebsd-arch  Thu Sep 28 11: 7:29 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP id 0D7C837B622
	for <arch@freebsd.org>; Thu, 28 Sep 2000 11:06:38 -0700 (PDT)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id e8SI6bI02119
	for arch@freebsd.org; Thu, 28 Sep 2000 11:06:37 -0700 (PDT)
Date: Thu, 28 Sep 2000 11:06:37 -0700
From: Alfred Perlstein <bright@wintelcom.net>
To: arch@freebsd.org
Subject: we need atomic_t
Message-ID: <20000928110637.U7553@fw.wintelcom.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.4i
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Linux has a datatype called "atomic_t", very useful for refcounts
and struct counters like tcpstat.  My impression is that it's the
largest type an arch can support atomic ops on without weird
gyrations and/or extremely expensive operations.

Example: atomic_t is 32bit on i386, and I think 24 on sparc32.

This would replace our atomic_op_type with just atomic_op and make
code easier to read and get right.  Linux also has the ability to
do an atomic_dec_and_test() which returns whether the operation
decremented the atomic_t down to 0 or not, which is very useful for making
sure _you_ were the one that made the refcount == 0 so that you
can free it.
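
To show the semantics I mean (not how Linux implements it), here is
a sketch using C11 atomics.  The refcnt_t type and the refcnt_*
names are made up for this example:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counter type; not Linux's atomic_t. */
typedef struct {
	atomic_int counter;
} refcnt_t;

static void
refcnt_init(refcnt_t *r, int n)
{
	atomic_init(&r->counter, n);
}

static void
refcnt_inc(refcnt_t *r)
{
	atomic_fetch_add_explicit(&r->counter, 1, memory_order_relaxed);
}

/* Returns true only for the caller whose decrement reached zero. */
static bool
refcnt_dec_and_test(refcnt_t *r)
{
	/* fetch_sub returns the old value, so 1 means "now zero". */
	return (atomic_fetch_sub_explicit(&r->counter, 1,
	    memory_order_acq_rel) == 1);
}
```

Because the decrement and the test are one atomic step, exactly one
caller sees "true", and that caller is the one that frees.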

I'm already seeing some pretty good examples of where this can be
applied:

1) struct ucred->cr_ref
2) struct uidinfo->ui_ref
3) tcpstats
4) other stats :)
5) mbuf external ref counts

I don't have the gcc-assembler-foo to do this optimally without
directly copying from Linux which isn't acceptable.

Can anyone snap this up?   I'd really appreciate it.

thanks,
-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."




From owner-freebsd-arch  Thu Sep 28 11:15: 9 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from mass.osd.bsdi.com (adsl-63-202-176-106.dsl.snfc21.pacbell.net [63.202.176.106])
	by hub.freebsd.org (Postfix) with ESMTP id 232D037B423
	for <arch@freebsd.org>; Thu, 28 Sep 2000 11:14:57 -0700 (PDT)
Received: from mass.osd.bsdi.com (localhost [127.0.0.1])
	by mass.osd.bsdi.com (8.11.0/8.9.3) with ESMTP id e8SIGLA01632;
	Thu, 28 Sep 2000 11:16:21 -0700 (PDT)
	(envelope-from msmith@mass.osd.bsdi.com)
Message-Id: <200009281816.e8SIGLA01632@mass.osd.bsdi.com>
X-Mailer: exmh version 2.1.1 10/15/1999
To: Alfred Perlstein <bright@wintelcom.net>
Cc: arch@freebsd.org
Subject: Re: we need atomic_t 
In-reply-to: Your message of "Thu, 28 Sep 2000 11:06:37 PDT."
             <20000928110637.U7553@fw.wintelcom.net> 
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Thu, 28 Sep 2000 11:16:21 -0700
From: Mike Smith <msmith@freebsd.org>
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> Linux has a datatype called "atomic_t", very useful for refcounts
> and struct counters like tcpstat.  My impression is that it's the
> largest type an arch can support atomic ops on without weird
> gyrations and/or extremely expensive operations.

sig_atomic_t.

> This would replace our atomic_op_type with just atomic_op and make
> code easier to read and get right.

I strongly disagree.  The atomic_<op>_<type> interface is useful and 
necessary and should remain.  I don't agree that this would make anything 
easier.  In particular, the explicit use of the atomic_* operations makes 
the atomicity constraints very clear.

> I'm already seeing some pretty good examples of where this can be
> applied:
> 
> 1) struct ucred->cr_ref
> 2) struct uidinfo->ui_ref
> 3) tcpstats
> 4) other stats :)
> 5) mbuf external ref counts
> 
> I don't have the gcc-assembler-foo to do this optimally without
> directly copying from Linux which isn't acceptable.

In most cases, you're manipulating the reference count under a mutex 
(since there's no other way to avoid the race where someone else frees 
your structure while you're in the process of dereferencing it), so this 
is largely unnecessary.

> Can anyone snap this up?   I'd really appreciate it.

Hold it right there, sunshine.  8)

-- 
... every activity meets with opposition, everyone who acts has his
rivals and unfortunately opponents also.  But not because people want
to be opponents, rather because the tasks and relationships force
people to take different points of view.  [Dr. Fritz Todt]
           V I C T O R Y   N O T   V E N G E A N C E




From owner-freebsd-arch  Thu Sep 28 11:39:41 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP
	id E506D37B422; Thu, 28 Sep 2000 11:39:09 -0700 (PDT)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id e8SId9D04813;
	Thu, 28 Sep 2000 11:39:09 -0700 (PDT)
Date: Thu, 28 Sep 2000 11:39:09 -0700
From: Alfred Perlstein <bright@wintelcom.net>
To: Mike Smith <msmith@freebsd.org>
Cc: arch@freebsd.org
Subject: Re: we need atomic_t
Message-ID: <20000928113907.V7553@fw.wintelcom.net>
References: <20000928110637.U7553@fw.wintelcom.net> <200009281816.e8SIGLA01632@mass.osd.bsdi.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.4i
In-Reply-To: <200009281816.e8SIGLA01632@mass.osd.bsdi.com>; from msmith@freebsd.org on Thu, Sep 28, 2000 at 11:16:21AM -0700
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

* Mike Smith <msmith@freebsd.org> [000928 11:14] wrote:
> > Linux has a datatype called "atomic_t", very useful for refcounts
> > and struct counters like tcpstat.  My impression is that it's the
> > largest type an arch can support atomic ops on without weird
> > gyrations and/or extremely expensive operations.
> 
> sig_atomic_t.

I'll look at that.

> 
> > This would replace our atomic_op_type with just atomic_op and make
> > code easier to read and get right.
> 
> I strongly disagree.  The atomic_<op>_<type> interface is useful and 
> necessary and should remain.  I don't agree that this would make anything 
> easier.  In particular, the explicit use of the atomic_* operations makes 
> the atomicity constraints very clear.

I really hate it; it slows down my programming.  Most of these counters
need to be the largest the platform will allow just to avoid overflows.
You remember the struct file refcount problem we had with apache, right?

What's the point of having counters and refcounts that easily overflow?

> > I'm already seeing some pretty good examples of where this can be
> > applied:
> > 
> > 1) struct ucred->cr_ref
> > 2) struct uidinfo->ui_ref
> > 3) tcpstats
> > 4) other stats :)
> > 5) mbuf external ref counts
> > 
> > I don't have the gcc-assembler-foo to do this optimally without
> > directly copying from Linux which isn't acceptable.
> 
> In most cases, you're manipulating the reference count under a mutex 
> (since there's no other way to avoid the race where someone else frees 
> your structure while you're in the process of dereferencing it), so this 
> is largely unnecessary.

It's not possible for this to happen, this is why struct ucred,
mbuf and uidinfo lend themselves to mpsafeness pretty easily with
atomic refcounts.

You do need to own the parent structure lock (struct proc/socket/etc)
so that two codepaths can not deref the same pointer at the same
time, but you need to do that anyway.

Basically you need to own a lock on whatever allows access to the
pointer to the ucred/uidinfo/mbuf.

The atomic ops are particularly useful when you have multiple
different structures that point to a single object (ucred and
mbuf particularly)

When you instantiate a ucred, it has a refcount of 1 and you can
be guaranteed that no one else is referencing it.  Right before it's
shallow copied (a pointer to it is in more than one place) the count
is at 2, which prevents free() from other codepaths.  You're expected
to "own" the parent structure, be it via a refcount of 1 or a lock
on the parent.
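
A userland model of that lifecycle, as a sketch only: struct cred,
struct holder and all of the function names are invented stand-ins
(not the kernel's ucred code), C11 atomics replace the kernel
primitives, and creds_freed exists purely so the example can be
checked.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct cred {
	atomic_uint ref;
	int uid;
};

/* Parent structure: its lock guards the pointer, not the count. */
struct holder {
	atomic_bool lk;
	struct cred *cred;
};

static unsigned creds_freed;	/* instrumentation for the example */

/* A fresh cred starts at 1 and is known only to its creator. */
static struct cred *
cred_get(int uid)
{
	struct cred *cr = malloc(sizeof(*cr));

	if (cr == NULL)
		return (NULL);
	atomic_init(&cr->ref, 1);
	cr->uid = uid;
	return (cr);
}

static void
cred_hold(struct cred *cr)
{
	atomic_fetch_add_explicit(&cr->ref, 1, memory_order_relaxed);
}

/* Whoever drops the count to zero owns the cred and frees it. */
static void
cred_free(struct cred *cr)
{
	if (atomic_fetch_sub_explicit(&cr->ref, 1,
	    memory_order_acq_rel) == 1) {
		free(cr);
		creds_freed++;
	}
}

/* Shallow copy: bump the count while the parent lock pins the pointer. */
static struct cred *
holder_cred_ref(struct holder *h)
{
	struct cred *cr;

	while (atomic_exchange_explicit(&h->lk, true, memory_order_acquire))
		;
	cr = h->cred;
	cred_hold(cr);
	atomic_store_explicit(&h->lk, false, memory_order_release);
	return (cr);
}
```

The parent lock is held only long enough to read the pointer and
take the reference; the count itself never needs a mutex.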

I'd much rather have:

void
crfree(cr)
	struct ucred *cr;
{
	if (atomic_dec_test(&cr->cr_ref, 1) == 0) {
		/*
		 * Some callers of crget(), such as nfs_statfs(),
		 * allocate a temporary credential, but don't
		 * allocate a uidinfo structure.
		 */
		if (cr->cr_uidinfo != NULL)
			uifree(cr->cr_uidinfo);
		FREE((caddr_t)cr, M_CRED);
	}
}

than:

void
crfree(cr)
	struct ucred *cr;
{
	mtx_enter(&cr->cr_mtx, MTX_DEF);
	if (--cr->cr_ref == 0) {
		mtx_exit(&cr->cr_mtx, MTX_DEF);
		/*
		 * Some callers of crget(), such as nfs_statfs(),
		 * allocate a temporary credential, but don't
		 * allocate a uidinfo structure.
		 */
		if (cr->cr_uidinfo != NULL)
			uifree(cr->cr_uidinfo);
		FREE((caddr_t)cr, M_CRED);
	} else {
		mtx_exit(&cr->cr_mtx, MTX_DEF);
	}
}

Note that there's an interesting cascade assertion going on:

	if (atomic_dec_test(&cr->cr_ref, 1) == 0) { 
	/*
	 * then i have exclusive ownership of this ucred
	 * so I can free the uidinfo it references without a lock
	 * because my parent is locked.
	 */

Is there a problem with this scheme?

> > Can anyone snap this up?   I'd really appreciate it.
> 
> Hold it right there, sunshine.  8)

pfft! :)

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."




From owner-freebsd-arch  Fri Sep 29 19:25:42 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from puck.firepipe.net (mcut-b-167.resnet.purdue.edu [128.211.209.167])
	by hub.freebsd.org (Postfix) with ESMTP id CB9BD37B503
	for <arch@FreeBSD.org>; Fri, 29 Sep 2000 19:25:40 -0700 (PDT)
Received: by puck.firepipe.net (Postfix, from userid 1000)
	id 842591908; Fri, 29 Sep 2000 21:26:36 -0500 (EST)
Date: Fri, 29 Sep 2000 21:26:36 -0500
From: Will Andrews <will@physics.purdue.edu>
To: Hubert Feyrer <hubertf@NetBSD.org>, op-tech@openpackages.org,
	arch@FreeBSD.org
Subject: :C/// regex for make(1)
Message-ID: <20000929212636.Q75085@puck.firepipe.net>
Reply-To: Will Andrews <will@physics.purdue.edu>
Mail-Followup-To: Will Andrews <will@physics.purdue.edu>,
	Hubert Feyrer <hubertf@NetBSD.org>, op-tech@openpackages.org,
	arch@FreeBSD.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
X-Operating-System: FreeBSD 4.1-STABLE i386
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Hi Hubert & others,

I've reviewed your PR#21605 and have only one objection: the #ifndef
NO_REGEX parts.  I don't see the point in having this (and others have
concurred).  Is there something I missed about when someone may not want
regex support in make(1)?

As soon as I can get -current working on my laptop again I'll test your
changes and do a closer review of the code.  Then commit, pending any
changes.

Thanks for your submission again,
-- 
Will Andrews <will@physics.purdue.edu> - Physics Computer Network wench
The Universal Answer to All Problems - "It has something to do with physics."
	-- Comic on door of Room 240, Physics Building, Purdue University




From owner-freebsd-arch  Sat Sep 30  6:13:18 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from rfhs8012.fh-regensburg.de (rfhs8012.fh-regensburg.de [194.95.108.29])
	by hub.freebsd.org (Postfix) with ESMTP id 81F6637B502
	for <arch@FreeBSD.org>; Sat, 30 Sep 2000 06:13:16 -0700 (PDT)
Received: from rfhpc8320.fh-regensburg.de (feyrer@rfhpc8320 [194.95.108.32])
	by rfhs8012.fh-regensburg.de (8.10.1/8.10.1) with ESMTP id e8UDCP013873;
	Sat, 30 Sep 2000 15:12:26 +0200 (MET DST)
Received: (from feyrer@localhost) by rfhpc8320.fh-regensburg.de (8.9.1/8.8.3) id PAA05519; Sat, 30 Sep 2000 15:15:15 +0200 (MET DST)
Date: Sat, 30 Sep 2000 15:15:15 +0200 (MET DST)
From: Hubert Feyrer <hubert.feyrer@informatik.fh-regensburg.de>
X-Sender: feyrer@rfhpc8320.fh-regensburg.de
To: Will Andrews <will@physics.purdue.edu>
Cc: op-tech@openpackages.org, arch@FreeBSD.org
Subject: Re: :C/// regex for make(1)
In-Reply-To: <20000929212636.Q75085@puck.firepipe.net>
Message-ID: <Pine.GSO.4.21.0009301508060.24129-100000@rfhpc8320.fh-regensburg.de>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Fri, 29 Sep 2000, Will Andrews wrote:
> I've reviewed your PR#21605 and have only one objection: the #ifndef
> NO_REGEX parts.  I don't see the point in having this (and others have
> concurred).  Is there something I missed about when someone may not want
> regex support in make(1)?

I left the #ifndef NO_REGEX guards in because they are in the NetBSD code.
I can only guess that they are there to make it easier to bootstrap make(1)
on systems that don't have regex routines.  Seeing that FreeBSD has dropped
all the bootstrapping code, it's probably best to just put the code in
unconditionally.
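
For anyone who hasn't looked at the patch, here is a rough sketch of the kind
of guard being discussed.  It is not the code from the PR: subst_first() is a
hypothetical helper made up for illustration, and the use of REG_EXTENDED is
an assumption.  Only regcomp(3), regexec(3) and regfree(3) are real library
calls.

```c
#include <stdio.h>
#include <string.h>

#ifndef NO_REGEX
#include <sys/types.h>
#include <regex.h>
#endif

/*
 * Hypothetical helper (not from the PR): replace the first match of
 * `pattern` in `word` with `repl`, writing the result to `out`.
 * Returns 0 on success, -1 on error, truncation, or when regex
 * support has been compiled out with -DNO_REGEX.
 */
int
subst_first(const char *word, const char *pattern, const char *repl,
    char *out, size_t outlen)
{
#ifndef NO_REGEX
	regex_t re;
	regmatch_t m[1];
	int n;

	/* REG_EXTENDED is an assumption made for this sketch. */
	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return (-1);
	if (regexec(&re, word, 1, m, 0) != 0) {
		/* No match: pass the word through unchanged. */
		n = snprintf(out, outlen, "%s", word);
	} else {
		/* Prefix before the match, replacement, then the tail. */
		n = snprintf(out, outlen, "%.*s%s%s",
		    (int)m[0].rm_so, word, repl, word + m[0].rm_eo);
	}
	regfree(&re);
	return ((n < 0 || (size_t)n >= outlen) ? -1 : 0);
#else
	/* Bootstrap build without regex routines: feature unavailable. */
	(void)word; (void)pattern; (void)repl; (void)out; (void)outlen;
	return (-1);
#endif
}
```

Calling subst_first("foo.c", "\\.c$", ".o", buf, sizeof(buf)) is roughly what
a :C/\.c$/.o/ modifier would do to a single word.  Dropping the guard just
means deleting the #ifndef/#else/#endif lines and the stub branch.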


> As soon as I can get -current working on my laptop again I'll test your
> changes and do a closer review of the code.  Then commit, pending any
> changes.

OK - maybe let me know when you're done. 
FYI, I've tested this with 4.0-STABLE.


 - Hubert

-- 
Hubert Feyrer <hubert.feyrer@informatik.fh-regensburg.de>




From owner-freebsd-arch  Sat Sep 30  6:58:54 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from puck.firepipe.net (mcut-b-167.resnet.purdue.edu [128.211.209.167])
	by hub.freebsd.org (Postfix) with ESMTP id 7472F37B503
	for <arch@FreeBSD.org>; Sat, 30 Sep 2000 06:58:52 -0700 (PDT)
Received: by puck.firepipe.net (Postfix, from userid 1000)
	id 1904E1908; Sat, 30 Sep 2000 08:59:54 -0500 (EST)
Date: Sat, 30 Sep 2000 08:59:54 -0500
From: Will Andrews <will@physics.purdue.edu>
To: Hubert Feyrer <hubert.feyrer@informatik.fh-regensburg.de>
Cc: Will Andrews <will@physics.purdue.edu>, op-tech@openpackages.org,
	arch@FreeBSD.org
Subject: Re: :C/// regex for make(1)
Message-ID: <20000930085954.W75085@puck.firepipe.net>
Reply-To: Will Andrews <will@physics.purdue.edu>
Mail-Followup-To: Will Andrews <will@physics.purdue.edu>,
	Hubert Feyrer <hubert.feyrer@informatik.fh-regensburg.de>,
	op-tech@openpackages.org, arch@FreeBSD.org
References: <20000929212636.Q75085@puck.firepipe.net> <Pine.GSO.4.21.0009301508060.24129-100000@rfhpc8320.fh-regensburg.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <Pine.GSO.4.21.0009301508060.24129-100000@rfhpc8320.fh-regensburg.de>; from hubert.feyrer@informatik.fh-regensburg.de on Sat, Sep 30, 2000 at 03:15:15PM +0200
X-Operating-System: FreeBSD 4.1-STABLE i386
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Sat, Sep 30, 2000 at 03:15:15PM +0200, Hubert Feyrer wrote:
> I've left the #ifdef in as it's in the NetBSD code.  I can only guess that
> the reason is to make it easier to bootstrap make(1) on systems that don't
> have regexp routines. Seeing that FreeBSD has dropped all the
> bootstrapping code, it's probably best to just put in the code
> unconditionally. 

Yeah.  I'm still waiting for objections.

> OK - maybe let me know when you're done. 
> FYI, I've tested this with 4.0-STABLE.

Yes, for the most part make(1) has stayed in sync between the -stable and
-current branches.

Thanks again,

-- 
Will Andrews <will@physics.purdue.edu> - Physics Computer Network wench
The Universal Answer to All Problems - "It has something to do with physics."
	-- Comic on door of Room 240, Physics Building, Purdue University

