Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 29 Sep 2009 22:39:43 +0200
From:      Attilio Rao <attilio@freebsd.org>
To:        John Baldwin <jhb@freebsd.org>
Cc:        hackers@freebsd.org, Fabio Checconi <fabio@freebsd.org>
Subject:   Re: sx locks and memory barriers
Message-ID:  <3bbf2fe10909291339s5705a9bendb4c9331293b45a4@mail.gmail.com>
In-Reply-To: <200909291628.15516.jhb@freebsd.org>
References:  <20090924224935.GW473@gandalf.sssup.it> <200909291425.46134.jhb@freebsd.org> <3bbf2fe10909291215i2bdd73aj13c1ac433152cab4@mail.gmail.com> <200909291628.15516.jhb@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
2009/9/29 John Baldwin <jhb@freebsd.org>:
> On Tuesday 29 September 2009 3:15:40 pm Attilio Rao wrote:
>> 2009/9/29 John Baldwin <jhb@freebsd.org>:
>> > On Tuesday 29 September 2009 11:39:37 am Attilio Rao wrote:
>> >> 2009/9/25 Fabio Checconi <fabio@freebsd.org>:
>> >> > Hi all,
>> >> >  looking at sys/sx.h I have some troubles understanding this comment:
>> >> >
>> >> >  * A note about memory barriers.  Exclusive locks need to use the same
>> >> >  * memory barriers as mutexes: _acq when acquiring an exclusive lock
>> >> >  * and _rel when releasing an exclusive lock.  On the other side,
>> >> >  * shared lock needs to use an _acq barrier when acquiring the lock
>> >> >  * but, since they don't update any locked data, no memory barrier is
>> >> >  * needed when releasing a shared lock.
>> >> >
>> >> > In particular, I'm not understanding what prevents the following sequence
>> >> > from happening:
>> >> >
>> >> > CPU A                                   CPU B
>> >> >
>> >> > sx_slock(&data->lock);
>> >> >
>> >> > sx_sunlock(&data->lock);
>> >> >
>> >> > /* reordered after the unlock
>> >> >   by the cpu */
>> >> > if (data->buffer)
>> >> >                                        sx_xlock(&data->lock);
>> >> >                                        free(data->buffer);
>> >> >                                        data->buffer = NULL;
>> >> >                                        sx_xunlock(&data->lock);
>> >> >
>> >> >        a = *data->buffer;
>> >> >
>> >> > IOW, even if readers do not modify the data protected by the lock,
>> >> > without a release barrier a memory access may leak past the unlock (as
>> >> > the cpu won't notice any dependency between the unlock and the fetch,
>> >> > feeling free to reorder them), thus potentially racing with an exclusive
>> >> > writer accessing the data.
>> >> >
>> >> > On architectures where atomic ops serialize memory accesses this would
>> >> > never happen, otherwise the sequence above seems possible; am I missing
>> >> > something?
>> >>
>> >> I think your concerns are right, possibly we need this patch:
>> >> http://www.freebsd.org/~attilio/sxrw_unlockb.diff
>> >
>> > Actually, since you are only worried about reads, I think this should be
>> > an "acq" barrier rather than a "rel".  In some cases "acq" is cheaper, so we
>> > should prefer the cheapest barrier that provides what we need.  You would
>> > still need to keep some language about the memory barriers since using "acq"
>> > for shared unlocking is different from exclusive unlocking.
>>
>> Actually, I don't think that an acq barrier ensures enough protection
>> against the reordering of 'earlier' operation thus not fixing the
>> architecture ordering problem reported by Fabio. Also, I don't think
>> we just have to care about reads (or  I don't understand what you mean
>> here).
>
> Hmmm, it might not on certain archs.  It does on x86 (i.e. an lfence would
> work here on x86), but probably not on ia64 and sparc64.  Also, we certainly
> only care about reads.  A read/share lock cannot resolve any races where the
> lock holder is writing data, it can only ensure that the lock holder can
> safely read shared data without the data changing out from underneath it.
>
>> However, I'm not even sure that we have faster read barriers than the
>> write one. As long as it should be true in theory I don't think that's
>> what happen in practice.
>
> bde@ found that sfence was generally much more expensive than lfence on x86.
> However, since x86 guarantees the order of all stores we don't actually need
> to use sfence at all on x86 anyway.

Yes, x86 guarantees that the stores are strong ordered so I don't
think acq to be faster than rel.
Can I assume the patch I already sent as reviewed by you and commit then, right?

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?3bbf2fe10909291339s5705a9bendb4c9331293b45a4>