From owner-freebsd-arch@FreeBSD.ORG  Wed Oct 29 18:03:54 2014
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id A28CA86F;
 Wed, 29 Oct 2014 18:03:54 +0000 (UTC)
Received: from mho-02-ewr.mailhop.org (mho-02-ewr.mailhop.org [204.13.248.72])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 60E91E9E;
 Wed, 29 Oct 2014 18:03:54 +0000 (UTC)
Received: from [73.34.117.227] (helo=ilsoft.org)
 by mho-02-ewr.mailhop.org with esmtpsa (TLSv1:AES256-SHA:256)
 (Exim 4.72) (envelope-from <ian@FreeBSD.org>)
 id 1XjXb6-000ADo-SF; Wed, 29 Oct 2014 18:03:53 +0000
Received: from [172.22.42.240] (revolution.hippie.lan [172.22.42.240])
 by ilsoft.org (8.14.9/8.14.9) with ESMTP id s9TI3osa081247;
 Wed, 29 Oct 2014 12:03:50 -0600 (MDT) (envelope-from ian@FreeBSD.org)
X-Mail-Handler: Dyn Standard SMTP by Dyn
X-Originating-IP: 73.34.117.227
X-Report-Abuse-To: abuse@dyndns.com (see
 http://www.dyndns.com/services/sendlabs/outbound_abuse.html for abuse
 reporting information)
X-MHO-User: U2FsdGVkX1+2a/JXp6EOyczZ1i5aJJC0
X-Authentication-Warning: paranoia.hippie.lan: Host revolution.hippie.lan
 [172.22.42.240] claimed to be [172.22.42.240]
Subject: Re: atomic ops
From: Ian Lepore <ian@FreeBSD.org>
To: John Baldwin <jhb@freebsd.org>
In-Reply-To: <201410291335.57919.jhb@freebsd.org>
References: <20141028025222.GA19223@dft-labs.eu>
 <201410291059.16829.jhb@freebsd.org>
 <1414601895.17308.89.camel@revolution.hippie.lan>
 <201410291335.57919.jhb@freebsd.org>
Content-Type: text/plain; charset="us-ascii"
Date: Wed, 29 Oct 2014 12:03:50 -0600
Message-ID: <1414605830.17308.100.camel@revolution.hippie.lan>
Mime-Version: 1.0
X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port 
Content-Transfer-Encoding: 7bit
Cc: Adrian Chadd <adrian@freebsd.org>, Mateusz Guzik <mjguzik@gmail.com>,
 Alan Cox <alc@rice.edu>, Andrew Turner <andrew@fubar.geek.nz>,
 attilio@freebsd.org, Konstantin Belousov <kib@freebsd.org>,
 freebsd-arch@freebsd.org
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Discussion related to FreeBSD architecture <freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-arch>,
 <mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch/>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
 <mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 29 Oct 2014 18:03:54 -0000

On Wed, 2014-10-29 at 13:35 -0400, John Baldwin wrote:
> On Wednesday, October 29, 2014 12:58:15 pm Ian Lepore wrote:
> > On Wed, 2014-10-29 at 10:59 -0400, John Baldwin wrote:
> > > Eh, that isn't broken.  It is subtle however.  The reason it isn't broken
> > > is that if any access to P occurs after the 'load P', then the store will
> > > fail and the load-acquire will be retried.  If A was accessed during the
> > > atomic op, the load-acquire during the retry will discard that and force A
> > > to be re-accessed.  If P is not accessed during the atomic op, then it is
> > > safe to access A during the atomic op itself.
> > > 
> > 
> > I'm not sure I completely agree with all of this. 
> > 
> > First, for 
> > 
> >         if any access to P occurs after the 'load P', then the store will
> >         fail and the load-acquire will be retried
> > 
> > The term 'access' needs to be changed to 'store'.  Other read accesses
> > to P will not cause the store-exclusive to fail.
> 
> Correct, though for the places where acquire is used I believe that is ok.
> Certainly for lock cookies it is ok.  It's writes to the lock cookie that
> would invalidate 'A'.
> 
> > Next, when we consider 'Access A' I'm not sure it's true that the access
> > will replay if the store-exclusive fails and the operation loops.  The
> > access to A may have been a prefetch, even a prefetch for data on a
> > predicted upcoming execution branch which may or may not end up being
> > taken.
> > 
> > I think the only thing that makes an ldrex/strex sequence safe for use
> > in implementing synchronization primitives is to insert a 'dmb' after
> > the acquire loop (after the strex succeeds), and 'dsb' before the
> > release loop (dsb is required for SMP, dmb might be good enough on UP).
> > 
> > Looking into this has made me realize our current armv6/7 atomics are
> > incorrect in this regard.  Guess I'll see about fixing them up Real Soon
> > Now.  :)
> 
> I'm not actually sure either, but it would be surprising to me otherwise.
> Presumably there is nothing magic about a branch.  Either the load-acquire
> is an acquire barrier or it isn't.  Namely, suppose you had this sequence:
> 
> 	load-acquire P
> 	access A (prefetch)
> 	load-acquire Q
> 	load A
> 
> Would you expect the prefetch to satisfy the load or should the load-acquire
> on Q discard that?  Having a branch after a failing conditional store back
> to the load acquire should work similarly.  It has to discard anything that
> was prefetched or it isn't an actual load-acquire.
> 
> That is consider:
> 
> 1:
> 	load-acquire P
> 	access A (prefetch)
> 	conditional-store P
> 	branch-if-fail 1b
> 	load A
> 
> In the case that the conditional store fails and the branch is taken, the
> sequence of operations is:
> 
> 	load-acquire P
> 	access A (prefetch)
> 	conditional-store P
> 	branch
> 	load-acquire P
> 
> That should be equivalent to the first sequence above unless the branch
> instruction has the magical property of disabling memory barriers on the
> instruction after a branch (which would be insane).
> 

I hadn't realized it when I wrote that, but Andy was speaking in the
context of armv8, which has a true load-acquire instruction.  In our
current code (armv6 and 7) we need the explicit dmb/dsb barriers to get
the same effect.  (It turns out we do have barriers, I misspoke earlier,
but some of our dmb need to be dsb.)
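
To make the barrier placement concrete, here is a minimal sketch using
C11 atomics.  It is not our atomic.h code.  The names sketch_lock,
sketch_trylock, sketch_lock_acquire and sketch_lock_release are made up
for illustration, and the cookie encoding (0 = unlocked, 1 = locked) is
an assumption.  The compare-exchange loop is relaxed and plays the role
of the ldrex/strex loop; the explicit fences stand in for the barrier
after the acquire loop and the barrier before the release store.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock cookie: 0 = unlocked, 1 = locked. */
typedef struct {
	atomic_uint cookie;
} sketch_lock;

/* Single attempt; returns true if the lock was taken. */
static bool
sketch_trylock(sketch_lock *l)
{
	unsigned expected = 0;

	/* Acquire ordering applies only when the exchange succeeds. */
	return (atomic_compare_exchange_strong_explicit(&l->cookie,
	    &expected, 1, memory_order_acquire, memory_order_relaxed));
}

static void
sketch_lock_acquire(sketch_lock *l)
{
	unsigned expected = 0;

	/* Relaxed retry loop, standing in for the ldrex/strex loop. */
	while (!atomic_compare_exchange_weak_explicit(&l->cookie,
	    &expected, 1, memory_order_relaxed, memory_order_relaxed))
		expected = 0;	/* a failed exchange overwrites expected */

	/* Barrier after the loop succeeds, before touching protected data. */
	atomic_thread_fence(memory_order_acquire);
}

static void
sketch_lock_release(sketch_lock *l)
{
	/* Barrier before the store that makes the lock available again. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&l->cookie, 0, memory_order_relaxed);
}
```

As far as I know, compilers targeting armv7 typically emit a dmb for
each of those fences.  The C11 memory model has no notion of dsb, so
this sketch says nothing about where a dsb is required instead.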

-- Ian