From owner-freebsd-hackers@FreeBSD.ORG  Wed Sep 30 13:07:28 2009
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A41DC106568D;
	Wed, 30 Sep 2009 13:07:28 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 629408FC29;
	Wed, 30 Sep 2009 13:07:28 +0000 (UTC)
Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net
	[66.111.2.69])
	by cyrus.watson.org (Postfix) with ESMTPSA id F170746B0C;
	Wed, 30 Sep 2009 09:07:27 -0400 (EDT)
Received: from jhbbsd.hudson-trading.com (unknown [209.249.190.8])
	by bigwig.baldwin.cx (Postfix) with ESMTPA id E8E848A01B;
	Wed, 30 Sep 2009 09:07:26 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: Attilio Rao <attilio@freebsd.org>
Date: Wed, 30 Sep 2009 07:59:28 -0400
User-Agent: KMail/1.9.7
References: <20090924224935.GW473@gandalf.sssup.it>
	<200909291731.32394.jhb@freebsd.org>
	<3bbf2fe10909291439x21f53e34n60d63554b1dea0de@mail.gmail.com>
In-Reply-To: <3bbf2fe10909291439x21f53e34n60d63554b1dea0de@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="utf-8"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200909300759.29141.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1
	(bigwig.baldwin.cx); Wed, 30 Sep 2009 09:07:26 -0400 (EDT)
X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.5 required=4.2 tests=AWL,BAYES_00,RDNS_NONE
	autolearn=no version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx
Cc: Ed Schouten <ed@80386.nl>, Max Laier <max@love2party.net>,
	Roman Divacky <rdivacky@freebsd.org>,
	Fabio Checconi <fabio@freebsd.org>, freebsd-hackers@freebsd.org
Subject: Re: sx locks and memory barriers
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 30 Sep 2009 13:07:28 -0000

On Tuesday 29 September 2009 5:39:43 pm Attilio Rao wrote:
> 2009/9/29 John Baldwin <jhb@freebsd.org>:
> > On Tuesday 29 September 2009 4:42:13 pm Attilio Rao wrote:
> >> 2009/9/29 Max Laier <max@love2party.net>:
> >> > On Tuesday 29 September 2009 17:39:37 Attilio Rao wrote:
> >> >> 2009/9/25 Fabio Checconi <fabio@freebsd.org>:
> >> >> > Hi all,
> >> >> >  looking at sys/sx.h I have some troubles understanding this 
comment:
> >> >> >
> >> >> >  * A note about memory barriers.  Exclusive locks need to use the 
same
> >> >> >  * memory barriers as mutexes: _acq when acquiring an exclusive lock
> >> >> >  * and _rel when releasing an exclusive lock.  On the other side,
> >> >> >  * shared lock needs to use an _acq barrier when acquiring the lock
> >> >> >  * but, since they don't update any locked data, no memory barrier 
is
> >> >> >  * needed when releasing a shared lock.
> >> >> >
> >> >> > In particular, I'm not understanding what prevents the following 
sequence
> >> >> > from happening:
> >> >> >
> >> >> > CPU A                                   CPU B
> >> >> >
> >> >> > sx_slock(&data->lock);
> >> >> >
> >> >> > sx_sunlock(&data->lock);
> >> >> >
> >> >> > /* reordered after the unlock
> >> >> >   by the cpu */
> >> >> > if (data->buffer)
> >> >> >                                        sx_xlock(&data->lock);
> >> >> >                                        free(data->buffer);
> >> >> >                                        data->buffer = NULL;
> >> >> >                                        sx_xunlock(&data->lock);
> >> >> >
> >> >> >        a = *data->buffer;
> >> >> >
> >> >> > IOW, even if readers do not modify the data protected by the lock,
> >> >> > without a release barrier a memory access may leak past the unlock 
(as
> >> >> > the cpu won't notice any dependency between the unlock and the 
fetch,
> >> >> > feeling free to reorder them), thus potentially racing with an 
exclusive
> >> >> > writer accessing the data.
> >> >> >
> >> >> > On architectures where atomic ops serialize memory accesses this 
would
> >> >> > never happen, otherwise the sequence above seems possible; am I 
missing
> >> >> > something?
> >> >>
> >> >> I think your concerns are right, possibly we need this patch:
> >> >> http://www.freebsd.org/~attilio/sxrw_unlockb.diff
> >> >>
> >> >> However speaking with John we agreed possibly there is a more serious
> >> >>  breakage. Possibly, memory barriers would also require to ensure the
> >> >>  compiler to not reorder the operation, while right now, in FreeBSD, 
they
> >> >>  just take care of the reordering from the architecture perspective.
> >> >> The only way I'm aware of GCC offers that is to clobber memory.
> >> >> I will provide a patch that address this soon, hoping that GCC will be
> >> >> smart enough to not overhead too much the memory clobbering but just
> >> >> try to understand what's our purpose and servers it (I will try to
> >> >> compare code generated before and after the patch at least for tier-1
> >> >> architectures).
> >> >
> >> > Does GCC really reorder accesses to volatile objects? The C Standard 
seems to
> >> > object:
> >> >
> >> > 5.1.2.3 - 2
> >> > Accessing a volatile object, modifying an object, modifying a file, or 
calling
> >> > a function that does any of those operations are all side effects,11) 
which
> >> > are changes in the state of the execution environment. Evaluation of an
> >> > expression may produce side effects. At certain specified points in the
> >> > execution sequence called sequence points, all side effects of previous
> >> > evaluations shall be complete and no side effects of subsequent 
evaluations
> >> > shall have taken place. (A summary of the sequence points is given in 
annex
> >> > C.)
> >>
> >> Very interesting.
> >> I was thinking about the other operating systems which basically do
> >> 'memory clobbering' for ensuring a compiler barrier, but actually they
> >> often forsee such a barrier without the conjuction of a memory
> >> operand.
> >>
> >> I think I will need to speak a bit with a GCC engineer in order to see
> >> what do they implement in regard of volatile operands.
> >
> > GCC can be quite aggressive with reordering even in the face of volatile.  
I
> > was recently doing a hack to export some data from the kernel to userland
> > that used a spin loop to grab a snapshot of the contents of a structure
> > similar to the method used in the kernel with the timehands structures.  
It
> > used a volatile structure exposed from the kernel that looked something
> > like:
> >
> > struct foo {
> >        volatile int gen;
> >        /* other stuff */
> > };
> >
> > volatile struct foo *p;
> >
> > do {
> >        x = p->gen;
> >        /* read other stuff */
> >        y = p->gen;
> > } while (x != y && x != 0);
> >
> > GCC moved the 'y = ' up into the middle of the '/* read other stuff */'.
> > I eventually had to add explicit "memory" clobbers to force GCC to not
> > move the reads of 'gen' around but do them "around" all the other
> > operations, so that the working code is:
> >
> > do {
> >        x = p->gen;
> >        asm volatile("" ::: "memory");
> >        /* read other stuff */
> >        asm volatile("" ::: "memory");
> >        y = p->gen;
> > } while (x != y && x != 0);
> >
> 
> I see.
> So probabilly clobbering memory is the only choice we have right now.
> I will try to make a patch which also keeps into account the
> possibility to skip it (or define by hand alternative approaches) for
> different compilers.
> I wonder, specifically, how llvm/clang relies with it.

We already allow for different compilers to defined different versions of 
atomic_*().  I think all you need to do for now is just add "memory" clobbers 
to all of the atomic operations that have either an _acq or _rel memory 
barrier.

-- 
John Baldwin