From: John Baldwin
To: Ian Lepore
Cc: Adrian Chadd, Mateusz Guzik, Alan Cox, Andrew Turner, attilio@freebsd.org,
    Konstantin Belousov, freebsd-arch@freebsd.org
Subject: Re: atomic ops
Date: Wed, 29 Oct 2014 13:35:57 -0400
Message-Id: <201410291335.57919.jhb@freebsd.org>
In-Reply-To: <1414601895.17308.89.camel@revolution.hippie.lan>

On Wednesday, October 29, 2014 12:58:15 pm Ian Lepore wrote:
> On Wed, 2014-10-29 at 10:59 -0400, John Baldwin wrote:
> > Eh, that isn't broken.  It is subtle however.  The reason it isn't broken
> > is that if any access to P occurs after the 'load P', then the store will
> > fail and the load-acquire will be retried; if A was accessed during the
> > atomic op, the load-acquire during the retry will discard that and force A
> > to be re-accessed.  If P is not accessed during the atomic op, then it is
> > safe to access A during the atomic op itself.
> >
>
> I'm not sure I completely agree with all of this.
>
> First, for
>
>         if any access to P occurs after the 'load P', then the store will
>         fail and the load-acquire will be retried
>
> The term 'access' needs to be changed to 'store'.  Other read accesses
> to P will not cause the store-exclusive to fail.

Correct, though for the places where acquire is used I believe that is ok.
Certainly for lock cookies it is ok.  It's writes to the lock cookie that
would invalidate 'A'.

> Next, when we consider 'Access A' I'm not sure it's true that the access
> will replay if the store-exclusive fails and the operation loops.  The
> access to A may have been a prefetch, even a prefetch for data on a
> predicted upcoming execution branch which may or may not end up being
> taken.
>
> I think the only thing that makes an ldrex/strex sequence safe for use
> in implementing synchronization primitives is to insert a 'dmb' after
> the acquire loop (after the strex succeeds), and 'dsb' before the
> release loop (dsb is required for SMP, dmb might be good enough on UP).
>
> Looking into this has made me realize our current armv6/7 atomics are
> incorrect in this regard.  Guess I'll see about fixing them up Real Soon
> Now.  :)

I'm not actually sure either, but it would be surprising to me otherwise.
Presumably there is nothing magic about a branch.  Either the load-acquire
is an acquire barrier or it isn't.
Namely, suppose you had this sequence:

        load-acquire P
        access A (prefetch)
        load-acquire Q
        load A

Would you expect the prefetch to satisfy the load or should the load-acquire
on Q discard that?  Having a branch after a failing conditional store back
to the load-acquire should work similarly.  It has to discard anything that
was prefetched or it isn't an actual load-acquire.

That is, consider:

1:
        load-acquire P
        access A (prefetch)
        conditional-store P
        branch-if-fail 1b
        load A

In the case that the conditional store fails and the branch is taken, the
sequence of operations is:

        load-acquire P
        access A (prefetch)
        conditional-store P
        branch
        load-acquire P

That should be equivalent to the first sequence above unless the branch
instruction has the magical property of disabling memory barriers on the
instruction after a branch (which would be insane).

-- 
John Baldwin
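
For readers who want to map the pseudo-ops above onto real code, here is a
minimal sketch of the same retry loop written with portable C11 atomics
rather than ARM assembly.  It is not FreeBSD's atomic(9) implementation:
struct sketch_lock and both function names are made up for illustration,
with cookie standing in for P and protected_data for A.  The weak
compare-exchange is allowed to fail spuriously, which is how C11 models a
failed store-exclusive, and each retry performs a fresh acquire operation
on the cookie.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical lock cookie: 0 means unowned, otherwise the owner's id. */
struct sketch_lock {
	_Atomic uintptr_t cookie;	/* "P" in the discussion */
	int protected_data;		/* "A" in the discussion */
};

/*
 * Acquire loop.  The weak CAS may fail spuriously (like a failed
 * conditional store), in which case the loop goes around again.
 * Spins forever if another owner never releases the lock.
 */
static void
sketch_lock_acquire(struct sketch_lock *l, uintptr_t self)
{
	uintptr_t expected;

	do {
		expected = 0;	/* the CAS overwrites this on failure */
	} while (!atomic_compare_exchange_weak_explicit(&l->cookie,
	    &expected, self, memory_order_acquire, memory_order_relaxed));
	/*
	 * Accesses to l->protected_data that follow in program order
	 * cannot be reordered before the successful CAS.
	 */
}

/* Release: earlier accesses to protected_data complete before the store. */
static void
sketch_lock_release(struct sketch_lock *l, uintptr_t self)
{
	assert(atomic_load_explicit(&l->cookie, memory_order_relaxed) == self);
	atomic_store_explicit(&l->cookie, 0, memory_order_release);
}
```

On ARMv7, compilers typically lower the acquire compare-exchange to an
ldrex/strex loop followed by a dmb, and the release store to a dmb followed
by a plain store, though the exact code depends on the compiler and target
flags.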