From: John Baldwin <jhb@freebsd.org>
To: Ian Lepore <ian@freebsd.org>
Subject: Re: atomic ops
Date: Wed, 29 Oct 2014 13:35:57 -0400
User-Agent: KMail/1.13.5 (FreeBSD/8.4-CBSD-20140415; KDE/4.5.5; amd64; ; )
References: <20141028025222.GA19223@dft-labs.eu>
 <201410291059.16829.jhb@freebsd.org>
 <1414601895.17308.89.camel@revolution.hippie.lan>
In-Reply-To: <1414601895.17308.89.camel@revolution.hippie.lan>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201410291335.57919.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
 (bigwig.baldwin.cx); Wed, 29 Oct 2014 13:36:37 -0400 (EDT)
Cc: Adrian Chadd <adrian@freebsd.org>, Mateusz Guzik <mjguzik@gmail.com>,
 Alan Cox <alc@rice.edu>, Andrew Turner <andrew@fubar.geek.nz>,
 attilio@freebsd.org, Konstantin Belousov <kib@freebsd.org>,
 freebsd-arch@freebsd.org

On Wednesday, October 29, 2014 12:58:15 pm Ian Lepore wrote:
> On Wed, 2014-10-29 at 10:59 -0400, John Baldwin wrote:
> > Eh, that isn't broken.  It is subtle however.  The reason it isn't broken
> > is that if any access to P occurs after the 'load P', then the store will
> > fail and the load-acquire will be retried.  If A was accessed during the
> > atomic op, the load-acquire during the retry will discard that and force A
> > to be re-accessed.  If P is not accessed during the atomic op, then it is
> > safe to access A during the atomic op itself.
> > 
> 
> I'm not sure I completely agree with all of this. 
> 
> First, for 
> 
>         if any access to P occurs after the 'load P', then the store will
>         fail and the load-acquire will be retried
> 
> The term 'access' needs to be changed to 'store'.  Other read accesses
> to P will not cause the store-exclusive to fail.

Correct, though for the places where acquire is used I believe that is ok.
Certainly for lock cookies it is ok.  It's writes to the lock cookie that
would invalidate 'A'.
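
To make the lock cookie case concrete, here is a minimal sketch in C11
atomics.  It is not the FreeBSD implementation; demo_lock,
demo_try_acquire, demo_release and LOCK_FREE are hypothetical names used
only for illustration.  The cookie plays the role of P and 'data' plays
the role of A.  Ownership only changes through a write to the cookie, so
other readers of the cookie do not affect whether the owner may use 'data'.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define LOCK_FREE ((uintptr_t)0)	/* hypothetical "unowned" cookie value */

struct demo_lock {
	_Atomic uintptr_t cookie;	/* P: the lock cookie */
	int data;			/* A: data protected by the lock */
};

/* One attempt to take the lock; succeeds only if the cookie was unowned. */
static bool
demo_try_acquire(struct demo_lock *l, uintptr_t self)
{
	uintptr_t expected = LOCK_FREE;

	/* Acquire ordering on success, relaxed on failure. */
	return (atomic_compare_exchange_strong_explicit(&l->cookie, &expected,
	    self, memory_order_acquire, memory_order_relaxed));
}

/* Hand the lock back; the release store publishes earlier writes to data. */
static void
demo_release(struct demo_lock *l, uintptr_t self)
{
	assert(atomic_load_explicit(&l->cookie, memory_order_relaxed) == self);
	atomic_store_explicit(&l->cookie, LOCK_FREE, memory_order_release);
}
```

A second demo_try_acquire() while the lock is held simply fails, since the
cookie no longer holds LOCK_FREE.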

> Next, when we consider 'Access A' I'm not sure it's true that the access
> will replay if the store-exclusive fails and the operation loops.  The
> access to A may have been a prefetch, even a prefetch for data on a
> predicted upcoming execution branch which may or may not end up being
> taken.
> 
> I think the only thing that makes an ldrex/strex sequence safe for use
> in implementing synchronization primitives is to insert a 'dmb' after
> the acquire loop (after the strex succeeds), and 'dsb' before the
> release loop (dsb is required for SMP, dmb might be good enough on UP).
> 
> Looking into this has made me realize our current armv6/7 atomics are
> incorrect in this regard.  Guess I'll see about fixing them up Real Soon
> Now.  :)

I'm not actually sure either, but it would be surprising to me otherwise.
Presumably there is nothing magic about a branch.  Either the load-acquire
is an acquire barrier or it isn't.  Namely, suppose you had this sequence:

	load-acquire P
	access A (prefetch)
	load-acquire Q
	load A

Would you expect the prefetch to satisfy the load, or should the load-acquire
on Q discard it?  A branch back to the load-acquire after a failed conditional
store should work similarly.  It has to discard anything that was prefetched,
or it isn't an actual load-acquire.

That is, consider:

1:
	load-acquire P
	access A (prefetch)
	conditional-store P
	branch-if-fail 1b
	load A

In the case that the conditional store fails and the branch is taken, the
sequence of operations is:

	load-acquire P
	access A (prefetch)
	conditional-store P
	branch
	load-acquire P

That should be equivalent to the first sequence above unless the branch
instruction has the magical property of disabling memory barriers on the
instruction after a branch (which would be insane).
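
For reference, the same loop shape can be sketched in C11 atomics.  This
is only an illustration at the language level: acquire_then_load and its
parameters are hypothetical names, and how a given CPU treats prefetched
data is up to the hardware and the compiler's mapping of these
operations, not something this code demonstrates.  A weak
compare-exchange is allowed to fail spuriously, much like a conditional
store, so a failure loops back and re-executes the load-acquire.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names: p is the lock word (P), a is the protected data (A). */
static bool
acquire_then_load(_Atomic int *p, const int *a, int *out)
{
	int old;

	for (;;) {
		/* 1: load-acquire P */
		old = atomic_load_explicit(p, memory_order_acquire);
		if (old != 0)
			return (false);	/* already held; give up */
		/* conditional-store P (may fail spuriously) */
		if (atomic_compare_exchange_weak_explicit(p, &old, 1,
		    memory_order_acquire, memory_order_relaxed))
			break;
		/* branch-if-fail 1b: redo the load-acquire */
	}
	*out = *a;	/* load A, after the acquire that succeeded */
	return (true);
}
```

The function returns false without touching *out when P is already held,
so the load of A only happens on the path where the store succeeded.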

-- 
John Baldwin