Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 28 Feb 2007 08:41:13 +0000 (GMT)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        Greg 'groggy' Lehey <grog@FreeBSD.org>
Cc:        cvs-src@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org, Kris Kennaway <kris@obsecurity.org>
Subject:   Re: cvs commit: src/share/man/man9 sleep.9
Message-ID:  <20070228081340.U56223@fledge.watson.org>
In-Reply-To: <20070228075755.GL8399@wantadilla.lemis.com>
References:  <200702272309.l1RN9Xum011236@repoman.freebsd.org> <20070227235843.GA59138@xor.obsecurity.org> <20070228064334.GG8399@wantadilla.lemis.com> <20070228070904.GA63343@xor.obsecurity.org> <20070228075755.GL8399@wantadilla.lemis.com>

next in thread | previous in thread | raw e-mail | index | archive | help

On Wed, 28 Feb 2007, Greg 'groggy' Lehey wrote:

>> If there are parts of the FreeBSD kernel that are abusing a sleep channel 
>> to create this situation, we should fix them.
>
> See the rest of the thread.  A "sleep channel" is a memory address. It's 
> usually in the kernel, so you're talking about a 30 bit address space on 
> ia32.  That's really not very many.

Again, this is like saying "You can't expect to find the right data at a 
pointer, as there are only 2^30 possible values and anyone could write to it". 
Anyone could, but when they do, it's considered a bug if it leads the program 
to behave incorrectly -- the pointer mechanism itself is working just fine 
when you write to the wrong pointer, though, as does the sleep/wakeup 
mechanism when you wakeup the wrong pointer.  The sleep/wakeup mechanism 
relies on memory ownership, just like reading from and writing to pointers, in 
order to prevent collisions during use.  Whether you allocate it on the stack, 
allocate it with malloc(9), or are delegated use of an address as part of a 
complex data structure, memory ownership is how you prevent unexpected 
simultaneous use of an address.  If there is unexpected simultaneous use of an 
address by the same or a different consumer, then that is a bug in the 
consumer, and not the producer of the API.

>> If not, the most that should be done in the FreeBSD manpage is to clearly 
>> explain how not to introduce such a bug in a programmer's own code.
>
> Until the advent of wakeup_one, this wasn't a bug.  wakeup works fine under 
> these circumstances.

wakeup_one works fine, as far as we know, under all circumstances.  The 
mistake is in re-using a memory address for more than one type of wakeup event 
at a time.  Per previous requests for correction, we can and should document 
the semantics that are provided and the pitfalls programmers fall into when 
not aware of those semantics, but let's not document those semantics as 
incorrect.

>> As far as I'm aware, nowhere else in our manpages do we provide advice for 
>> the lazy programmer who cannot be bothered figuring out whether his code is 
>> correct and who just wants an expedient hack in case it's not.
>
> Maybe you should be a little less combative and consider that the paradigms 
> have changed.  The whole idea of sleeping on memory addresses is an 
> expedient hack.  The fact that people usually choose different addresses 
> means that even wakeup_one seldom has problems. But most people aren't even 
> aware of the issue.  As I say, how would you address the status quo?

It's not an accident that people don't choose the same address arbitrarily, 
it's a design feature.  We don't just randomly start using random memory 
addresses for storage, and we also don't just start using random memory 
addreses for wakeup channels.  When a programmer creates a new wakeup event 
class, it is their responsibility to decide whether it is appropriate to reuse 
an exist memory address or structure field for it.  They do this by looking at 
whether the existing address/field semantics with respect to wakeup are 
identical or not.  If not, they add a new address.  This is why some data 
structures, such as sockets, have several variables used for wakeups: the 
semantics of the wakeups differ, and so the same one cannot be used. 
Likewise, changing arbitrary instances of wakeup() to wakeup_one(), or vice 
versa, should be done only after analyzing other users of the same channel to 
make sure that the behavior is compatible.

You'll notice that a few years ago, we added cv(9) to the repertoire of kernel 
synchronization primitives.  This primitive represents a semantic improvement 
on the sleep(9) primitives for several reasons:

(1) It ties the notion of of a sleep/wakeup channel to a specific initialized
     memory type in order to encourage correct use as part of an init / sleep /
     wakeup / destroy life cycle.

(2) It enforces use of a mutex with the condition variable, which in turn
     discourages incorrect use without a mutex.  I can't remember if we KASSERT
     that it's the same mutex every time, but we should. :-)

The main downside I'm aware of is that this involves allocating slightly more 
memory explicitly, but as you point out, this may well discourage bugs in 
consumers.  What you could do is replace the text you've added with a coherent 
description of the memory ownership model (per previous e-mails), and add a 
comment that the cv(9) interface is now preferred.  Mind you, exactly the same 
consumer bugs can occur, as it supports both cv_signal (wakeup_one) and 
cv_broadcast (wakeup) semantics, so as with the sleep(9) API, has to be done 
with the awareness that naive use can lead to unexpected results.

Robert N M Watson
Computer Laboratory
University of Cambridge



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20070228081340.U56223>