Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 23 Apr 2012 21:41:41 +0800
From:      Fengwei yin <yfw.bsd@gmail.com>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        freebsd-threads@freebsd.org, jack.ren@intel.com
Subject:   Re: About the memory barrier in BSD libc
Message-ID:  <CAPHpMu=LAqmZJwKO65N0VOTVsoFpJYAvYmxndFNt4S_y5P=NQg@mail.gmail.com>
In-Reply-To: <20120423130343.GT2358@deviant.kiev.zoral.com.ua>
References:  <CAPHpMu=DOGQ=TuFeYH7bH8hVwteT4Q3k67-mvoOFob6P3Y506w@mail.gmail.com> <20120423084120.GD76983@zxy.spb.ru> <CAPHpMu=kCwhf1RV_sYBDWDPL8368YTMLXge4L_g_F4AkTX1H5g@mail.gmail.com> <20120423094043.GS32749@zxy.spb.ru> <CAPHpMukLUeetSKpH2oiKJQ3ML_PFHEi6a0hK3_Ery=LX1YEd3g@mail.gmail.com> <20120423113838.GT32749@zxy.spb.ru> <CAPHpMumWu_aaZ4Sj5Athro6441Y%2B3_phbD2jxkKE-CdBf-Fd8g@mail.gmail.com> <20120423120720.GS2358@deviant.kiev.zoral.com.ua> <CAPHpMumh3YpB3RDD-7g5tU6thiuNA6HTuVxmt-9_OzUiEdEXzA@mail.gmail.com> <20120423130343.GT2358@deviant.kiev.zoral.com.ua>

next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, Apr 23, 2012 at 9:03 PM, Konstantin Belousov
<kostikbel@gmail.com> wrote:
> On Mon, Apr 23, 2012 at 08:33:05PM +0800, Fengwei yin wrote:
>> On Mon, Apr 23, 2012 at 8:07 PM, Konstantin Belousov
>> <kostikbel@gmail.com> wrote:
>> > On Mon, Apr 23, 2012 at 07:44:34PM +0800, Fengwei yin wrote:
>> >> On Mon, Apr 23, 2012 at 7:38 PM, Slawa Olhovchenkov <slw@zxy.spb.ru> =
wrote:
>> >> > On Mon, Apr 23, 2012 at 07:26:54PM +0800, Fengwei yin wrote:
>> >> >
>> >> >> On Mon, Apr 23, 2012 at 5:40 PM, Slawa Olhovchenkov <slw@zxy.spb.r=
u> wrote:
>> >> >> > On Mon, Apr 23, 2012 at 05:32:24PM +0800, Fengwei yin wrote:
>> >> >> >
>> >> >> >> On Mon, Apr 23, 2012 at 4:41 PM, Slawa Olhovchenkov <slw@zxy.sp=
b.ru> wrote:
>> >> >> >> > On Mon, Apr 23, 2012 at 02:56:03PM +0800, Fengwei yin wrote:
>> >> >> >> >
>> >> >> >> >> Hi list,
>> >> >> >> >> If this is not correct question on the list, please let me k=
now and
>> >> >> >> >> sorry for noise.
>> >> >> >> >>
>> >> >> >> >> I have a question regarding the BSD libc for SMP arch. I did=
n't see
>> >> >> >> >> memory barrier used in libc.
>> >> >> >> >> How can we make sure it's safe on SMP arch?
>> >> >> >> >
>> >> >> >> > /usr/include/machine/atomic.h:
>> >> >> >> >
>> >> >> >> > #define mb() =A0 =A0__asm __volatile("lock; addl $0,(%%esp)" =
: : : "memory")
>> >> >> >> > #define wmb() =A0 __asm __volatile("lock; addl $0,(%%esp)" : =
: : "memory")
>> >> >> >> > #define rmb() =A0 __asm __volatile("lock; addl $0,(%%esp)" : =
: : "memory")
>> >> >> >> >
>> >> >> >>
>> >> >> >> Thanks for the information. But it looks no body use it in libc=
.
>> >> >> >
>> >> >> > I think no body in libc need memory barrier: libc don't work wit=
h
>> >> >> > peripheral, for atomic opertions used different macros.
>> >> >>
>> >> >> If we check the usage of __sinit(), it is a typical singleton patt=
ern which
>> >> >> needs memory barrier to make sure no potential SMP issue.
>> >> >>
>> >> >> Or did I miss something here?
>> >> >
>> >> > What architecture with cache incoherency and FreeBSD support?
>> >>
>> >> I suppose it's not related with cache inchoherency (I could be wrong)=
.
>> >> It's related
>> >> with reorder of instruction by CPU.
>> >>
>> >> Here is the link talking about why need memory barrier for singleton:
>> >> http://www.oaklib.org/docs/oak/singleton.html
>> >>
>> >> x86 has strict memory model and may not suffer this kind of issue. Bu=
t
>> >> ARM need to
>> >> take care of it IMHO.
>> >
>> > Please note that __sinit is idempotent, so double-initialization is no=
t
>> > an issue there. The only possible problematic case would be other thre=
ad
>> > executing exit and not noticing non-NULL value for __cleanup while cur=
rent
>> > thread just set it.
>> >
>> > I am not sure how much real this race is. Each call to _sinit() is imm=
ediately
>> > followed by a lock acquire, typically FLOCKFILE(), which enforces full=
 barrier
>> > semantic due to pthread_mutex_lock call. The exit() performs __cxa_fin=
alize()
>> > call before checking __cleanup value, and __cxa_finalize() itself lock=
s
>> > atexit_mutex. So the race is tiny and probably possible only for somew=
hat
>> > buggy applications which perform exit() while there are stdio operatio=
ns
>> > in progress.
>> >
>> > Also note that some functions assign to __cleanup unconditionally.
>> >
>> > Do you see any real issue due to non-synchronized access to __cleanup =
?
>>
>> No. I didn't see real issue. I am just reviewing the code.
>>
>> If you don't think __sinit has issue, let's check another code:
>> =A0 =A0 =A0line 68 in libc/stdio/fclose.c
>> =A0 =A0 =A0line 133 in libc/stdio/findfp.c (function __sfp())
>>
>> Which is trying to free a fp slot by assign 0 to fp->_flags. But if
>> the instrucation
>> could be re-ordered, another CPU could see fp->_flags is assigned to 0
>> before the
>> cleanup from line 57 to 67.
>>
>> Let's say, if another CPU is in line 133 of __sfp(), it could see
>> fp->_flags become
>> 0 before it's aware of the cleanup (Line 57 to line 67 in
>> libc/stdio/fclose.c) happen.
>>
>> Note: the mutex of FUNLOCKFILE(fp) in line 69 of libc/stdio/fclose.c
>> just could make sure
>> line 70 happen after line 68. It can't impact the re-order of line 57
>> ~ line 68 by CPU.
>
> Yes, FUNLOCKFILE() there would have no effect on the potential CPU reorde=
ring
> of the writes. =A0But does the order of these writes matter at all ?
>
> Please note that __sfp() reinitializes all fields written by fclose().
> Only if CPU executing fclose() is allowed to reorder operations so that
> the external effect of _flags =3D 0 assignment can be observed before tha=
t
> CPU executes other operations from fclose(), there could be a problem.
>
> This is definitely impossible on Intel, and I indeed do not know about
> other architectures enough to reject such possibility. The _flags member
> is short, so atomics cannot be used there. The easier solution, if this
> is indeed an issue, is to lock thread_lock around _flags =3D 0 assignment
> in fclose().

For Intel x86, I suppose it's not a problem because of its memory model.
For IA64, ARM, it could be an issue in theory.

For the solution: I suppose memory barrier is enough for it.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAPHpMu=LAqmZJwKO65N0VOTVsoFpJYAvYmxndFNt4S_y5P=NQg>