From owner-cvs-all@FreeBSD.ORG Wed Feb 28 08:41:14 2007 Return-Path: X-Original-To: cvs-all@FreeBSD.org Delivered-To: cvs-all@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8E3BC16A400; Wed, 28 Feb 2007 08:41:14 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.freebsd.org (Postfix) with ESMTP id 3EF7213C491; Wed, 28 Feb 2007 08:41:14 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id D5F8C490C4; Wed, 28 Feb 2007 03:41:13 -0500 (EST) Date: Wed, 28 Feb 2007 08:41:13 +0000 (GMT) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: Greg 'groggy' Lehey In-Reply-To: <20070228075755.GL8399@wantadilla.lemis.com> Message-ID: <20070228081340.U56223@fledge.watson.org> References: <200702272309.l1RN9Xum011236@repoman.freebsd.org> <20070227235843.GA59138@xor.obsecurity.org> <20070228064334.GG8399@wantadilla.lemis.com> <20070228070904.GA63343@xor.obsecurity.org> <20070228075755.GL8399@wantadilla.lemis.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: cvs-src@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org, Kris Kennaway Subject: Re: cvs commit: src/share/man/man9 sleep.9 X-BeenThere: cvs-all@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: CVS commit messages for the entire tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Feb 2007 08:41:14 -0000 On Wed, 28 Feb 2007, Greg 'groggy' Lehey wrote: >> If there are parts of the FreeBSD kernel that are abusing a sleep channel >> to create this situation, we should fix them. > > See the rest of the thread. A "sleep channel" is a memory address. It's > usually in the kernel, so you're talking about a 30 bit address space on > ia32. That's really not very many. Again, this is like saying "You can't expect to find the right data at a pointer, as there are only 2^30 possible values and anyone could write to it". Anyone could, but when they do, it's considered a bug if it leads the program to behave incorrectly -- the pointer mechanism itself is working just fine when you write to the wrong pointer, though, as does the sleep/wakeup mechanism when you wakeup the wrong pointer. The sleep/wakeup mechanism relies on memory ownership, just like reading from and writing to pointers, in order to prevent collisions during use. Whether you allocate it on the stack, allocate it with malloc(9), or are delegated use of an address as part of a complex data structure, memory ownership is how you prevent unexpected simultaneous use of an address. If there is unexpected simultaneous use of an address by the same or a different consumer, then that is a bug in the consumer, and not the producer of the API. >> If not, the most that should be done in the FreeBSD manpage is to clearly >> explain how not to introduce such a bug in a programmer's own code. > > Until the advent of wakeup_one, this wasn't a bug. wakeup works fine under > these circumstances. wakeup_one works fine, as far as we know, under all circumstances. The mistake is in re-using a memory address for more than one type of wakeup event at a time. Per previous requests for correction, we can and should document the semantics that are provided and the pitfalls programmers fall into when not aware of those semantics, but let's not document those semantics as incorrect. >> As far as I'm aware, nowhere else in our manpages do we provide advice for >> the lazy programmer who cannot be bothered figuring out whether his code is >> correct and who just wants an expedient hack in case it's not. > > Maybe you should be a little less combative and consider that the paradigms > have changed. The whole idea of sleeping on memory addresses is an > expedient hack. The fact that people usually choose different addresses > means that even wakeup_one seldom has problems. But most people aren't even > aware of the issue. As I say, how would you address the status quo? It's not an accident that people don't choose the same address arbitrarily, it's a design feature. We don't just randomly start using random memory addresses for storage, and we also don't just start using random memory addreses for wakeup channels. When a programmer creates a new wakeup event class, it is their responsibility to decide whether it is appropriate to reuse an exist memory address or structure field for it. They do this by looking at whether the existing address/field semantics with respect to wakeup are identical or not. If not, they add a new address. This is why some data structures, such as sockets, have several variables used for wakeups: the semantics of the wakeups differ, and so the same one cannot be used. Likewise, changing arbitrary instances of wakeup() to wakeup_one(), or vice versa, should be done only after analyzing other users of the same channel to make sure that the behavior is compatible. You'll notice that a few years ago, we added cv(9) to the repertoire of kernel synchronization primitives. This primitive represents a semantic improvement on the sleep(9) primitives for several reasons: (1) It ties the notion of of a sleep/wakeup channel to a specific initialized memory type in order to encourage correct use as part of an init / sleep / wakeup / destroy life cycle. (2) It enforces use of a mutex with the condition variable, which in turn discourages incorrect use without a mutex. I can't remember if we KASSERT that it's the same mutex every time, but we should. :-) The main downside I'm aware of is that this involves allocating slightly more memory explicitly, but as you point out, this may well discourage bugs in consumers. What you could do is replace the text you've added with a coherent description of the memory ownership model (per previous e-mails), and add a comment that the cv(9) interface is now preferred. Mind you, exactly the same consumer bugs can occur, as it supports both cv_signal (wakeup_one) and cv_broadcast (wakeup) semantics, so as with the sleep(9) API, has to be done with the awareness that naive use can lead to unexpected results. Robert N M Watson Computer Laboratory University of Cambridge