From: Alan Cox
To: John Baldwin
cc: smp@freebsd.org
Subject: Re: atomicity of unlocked reads
Date: Wed, 17 Sep 2003 21:35:05 -0000
Message-ID: <20030917213502.GN12711@cs.rice.edu>
List-Id: FreeBSD SMP implementation group

On Wed, Sep 17, 2003 at 03:34:53PM -0400, John Baldwin wrote:
>
> No, that is not what an acquire load is for.  Memory barriers only
> affect the order of memory operations on the current processor, they
> have no bearing on other processors and don't provide any direct
> synchronization with other processors.
> Instead, if I have an acquire barrier, then the processor is not
> allowed to re-order any later reads or writes before the marked read.
> Note that earlier reads or writes can be re-ordered after the marked
> read.  A release barrier is the opposite in that prior reads/writes
> must be completed prior to the marked write, but later reads/writes
> may be re-ordered before the marked write.
>

Umm, this is not the way that the computer architecture community
defines acquire and release accesses.  They do, in fact, play a role
in defining the (partial) order in which memory accesses are seen by
the different processors within a system.  Specifically, I would
refer you to Condition 3.1 in Gharachorloo et al. ("Memory Consistency
and Event Ordering in Scalable Shared-Memory Multiprocessors" at
http://citeseer.nj.nec.com/gharachorloo90memory.html).  Note
particularly, the phrasing "... perform with respect to any other
processor."  This was the paper that introduced the Release
Consistency model, and the notion of acquire and release accesses.

The rest of what you say about the effects of acquire and release
accesses on re-ordering within a processor is basically correct.

> Bruce explicitly said that if he reads a stale value, that is ok,
> so the membars don't do anything for him.  If he is worried about
> stale data, then he needs a lock, not just atomic operations.

Not necessarily.  My previous message addressed this point.

> ...  The
> way that a lock works is that when we try to acquire a lock, we
> use an acquire barrier.  This means that later reads/writes in the
> instruction stream won't be re-ordered before the lock acquire.
> When we release the lock we use a release barrier to ensure that
> any modifications made while holding the lock will be visible
> before the write to release the lock is visible.  Thus, you can
> have CPU A acquire lock L, make a few writes, and then release
> lock L.
> If CPU B tries to acquire lock L after A has released
> it but before the write releasing the lock is visible to B, B will
> end up spinning (see the MTX_CONTESTED flag in the mutex code)
> until that write is visible (unless another thread has already
> blocked on this lock, in which case B will just block right away)
> until the write releasing L is visible to B.  B can then acquire
> lock L.  Since it had to wait for L's release write to be visible,
> this means that all the writes A performed are now visible to B,
> and thus B will not read stale data.
>
> Thus, memory barriers don't actually enforce any synchronization,
> they just give you a tool that can be used in conjunction with a
> memory location to construct a lock primitive that enforces
> synchronization.

I agree with this statement.  Atomicity, synchronization, and memory
ordering are three distinct concepts.  A few years back a former
colleague of mine, Sarita Adve, and Kourosh Gharachorloo wrote a
survey paper for IEEE Computer on this topic.  See
http://citeseer.nj.nec.com/adve95shared.html.  (And, yes, I'm the Cox
who appears in the "Related documents from co-citation" section
partway down that web page.  I used to work in a related area.)

Alan
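
To make the hand-off described in the quoted text concrete (CPU A
writes and then releases; CPU B acquires and then reads), here is a
minimal sketch.  It is not FreeBSD kernel code and does not use the
kernel's mutex or atomic primitives: it assumes C11 <stdatomic.h> and
POSIX threads, and the names payload, ready, producer, consumer, and
run_handoff are hypothetical, chosen only for illustration.  A single
flag stands in for the lock word.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int payload;        /* ordinary data, written before the release */
static atomic_int ready;   /* hypothetical flag standing in for the lock word */

/* Plays the role of CPU A: make a write, then do the release access. */
static void *producer(void *arg)
{
    (void)arg;
    payload = 42;          /* plain (non-atomic) write */
    /* Release store: the write to payload above may not be re-ordered
     * after this store, so it is visible to any thread that observes
     * ready == 1 through an acquire load. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

/* Plays the role of CPU B: spin on the acquire access, then read. */
static void *consumer(void *arg)
{
    int *seen = arg;
    /* Acquire load: the read of payload below may not be re-ordered
     * before this load.  Spin until the release store is visible. */
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        ;
    *seen = payload;       /* cannot observe the stale value 0 */
    return NULL;
}

/* Runs one hand-off and returns the value the consumer read. */
int run_handoff(void)
{
    pthread_t p, c;
    int seen = -1;
    int rc;

    payload = 0;
    atomic_store(&ready, 0);

    rc = pthread_create(&c, NULL, consumer, &seen);
    assert(rc == 0);
    rc = pthread_create(&p, NULL, producer, NULL);
    assert(rc == 0);

    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

Under the C11 memory model, run_handoff() returns 42: once the
consumer's acquire load observes the producer's release store, the
earlier write to payload is visible to it as well.  The release and
acquire accesses order the memory operations; the spin loop on the
flag is what supplies the synchronization.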