From owner-freebsd-hackers@FreeBSD.ORG Wed Sep 30 13:07:28 2009 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A41DC106568D; Wed, 30 Sep 2009 13:07:28 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 629408FC29; Wed, 30 Sep 2009 13:07:28 +0000 (UTC) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id F170746B0C; Wed, 30 Sep 2009 09:07:27 -0400 (EDT) Received: from jhbbsd.hudson-trading.com (unknown [209.249.190.8]) by bigwig.baldwin.cx (Postfix) with ESMTPA id E8E848A01B; Wed, 30 Sep 2009 09:07:26 -0400 (EDT) From: John Baldwin To: Attilio Rao Date: Wed, 30 Sep 2009 07:59:28 -0400 User-Agent: KMail/1.9.7 References: <20090924224935.GW473@gandalf.sssup.it> <200909291731.32394.jhb@freebsd.org> <3bbf2fe10909291439x21f53e34n60d63554b1dea0de@mail.gmail.com> In-Reply-To: <3bbf2fe10909291439x21f53e34n60d63554b1dea0de@mail.gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200909300759.29141.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1 (bigwig.baldwin.cx); Wed, 30 Sep 2009 09:07:26 -0400 (EDT) X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-2.5 required=4.2 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx Cc: Ed Schouten , Max Laier , Roman Divacky , Fabio Checconi , freebsd-hackers@freebsd.org Subject: Re: sx locks and memory barriers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Sep 2009 13:07:28 -0000 On Tuesday 29 September 2009 5:39:43 pm Attilio Rao wrote: > 2009/9/29 John Baldwin : > > On Tuesday 29 September 2009 4:42:13 pm Attilio Rao wrote: > >> 2009/9/29 Max Laier : > >> > On Tuesday 29 September 2009 17:39:37 Attilio Rao wrote: > >> >> 2009/9/25 Fabio Checconi : > >> >> > Hi all, > >> >> > looking at sys/sx.h I have some troubles understanding this comment: > >> >> > > >> >> > * A note about memory barriers. Exclusive locks need to use the same > >> >> > * memory barriers as mutexes: _acq when acquiring an exclusive lock > >> >> > * and _rel when releasing an exclusive lock. On the other side, > >> >> > * shared lock needs to use an _acq barrier when acquiring the lock > >> >> > * but, since they don't update any locked data, no memory barrier is > >> >> > * needed when releasing a shared lock. > >> >> > > >> >> > In particular, I'm not understanding what prevents the following sequence > >> >> > from happening: > >> >> > > >> >> > CPU A CPU B > >> >> > > >> >> > sx_slock(&data->lock); > >> >> > > >> >> > sx_sunlock(&data->lock); > >> >> > > >> >> > /* reordered after the unlock > >> >> > by the cpu */ > >> >> > if (data->buffer) > >> >> > sx_xlock(&data->lock); > >> >> > free(data->buffer); > >> >> > data->buffer = NULL; > >> >> > sx_xunlock(&data->lock); > >> >> > > >> >> > a = *data->buffer; > >> >> > > >> >> > IOW, even if readers do not modify the data protected by the lock, > >> >> > without a release barrier a memory access may leak past the unlock (as > >> >> > the cpu won't notice any dependency between the unlock and the fetch, > >> >> > feeling free to reorder them), thus potentially racing with an exclusive > >> >> > writer accessing the data. > >> >> > > >> >> > On architectures where atomic ops serialize memory accesses this would > >> >> > never happen, otherwise the sequence above seems possible; am I missing > >> >> > something? > >> >> > >> >> I think your concerns are right, possibly we need this patch: > >> >> http://www.freebsd.org/~attilio/sxrw_unlockb.diff > >> >> > >> >> However speaking with John we agreed possibly there is a more serious > >> >> breakage. Possibly, memory barriers would also require to ensure the > >> >> compiler to not reorder the operation, while right now, in FreeBSD, they > >> >> just take care of the reordering from the architecture perspective. > >> >> The only way I'm aware of GCC offers that is to clobber memory. > >> >> I will provide a patch that address this soon, hoping that GCC will be > >> >> smart enough to not overhead too much the memory clobbering but just > >> >> try to understand what's our purpose and servers it (I will try to > >> >> compare code generated before and after the patch at least for tier-1 > >> >> architectures). > >> > > >> > Does GCC really reorder accesses to volatile objects? The C Standard seems to > >> > object: > >> > > >> > 5.1.2.3 - 2 > >> > Accessing a volatile object, modifying an object, modifying a file, or calling > >> > a function that does any of those operations are all side effects,11) which > >> > are changes in the state of the execution environment. Evaluation of an > >> > expression may produce side effects. At certain specified points in the > >> > execution sequence called sequence points, all side effects of previous > >> > evaluations shall be complete and no side effects of subsequent evaluations > >> > shall have taken place. (A summary of the sequence points is given in annex > >> > C.) > >> > >> Very interesting. > >> I was thinking about the other operating systems which basically do > >> 'memory clobbering' for ensuring a compiler barrier, but actually they > >> often forsee such a barrier without the conjuction of a memory > >> operand. > >> > >> I think I will need to speak a bit with a GCC engineer in order to see > >> what do they implement in regard of volatile operands. > > > > GCC can be quite aggressive with reordering even in the face of volatile. I > > was recently doing a hack to export some data from the kernel to userland > > that used a spin loop to grab a snapshot of the contents of a structure > > similar to the method used in the kernel with the timehands structures. It > > used a volatile structure exposed from the kernel that looked something > > like: > > > > struct foo { > > volatile int gen; > > /* other stuff */ > > }; > > > > volatile struct foo *p; > > > > do { > > x = p->gen; > > /* read other stuff */ > > y = p->gen; > > } while (x != y && x != 0); > > > > GCC moved the 'y = ' up into the middle of the '/* read other stuff */'. > > I eventually had to add explicit "memory" clobbers to force GCC to not > > move the reads of 'gen' around but do them "around" all the other > > operations, so that the working code is: > > > > do { > > x = p->gen; > > asm volatile("" ::: "memory"); > > /* read other stuff */ > > asm volatile("" ::: "memory"); > > y = p->gen; > > } while (x != y && x != 0); > > > > I see. > So probabilly clobbering memory is the only choice we have right now. > I will try to make a patch which also keeps into account the > possibility to skip it (or define by hand alternative approaches) for > different compilers. > I wonder, specifically, how llvm/clang relies with it. We already allow for different compilers to defined different versions of atomic_*(). I think all you need to do for now is just add "memory" clobbers to all of the atomic operations that have either an _acq or _rel memory barrier. -- John Baldwin