Date: Tue, 24 Apr 2012 17:03:48 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: Martin Simmons <martin@lispworks.com>
Cc: freebsd-threads@freebsd.org, jack.ren@intel.com
Subject: Re: About the memory barrier in BSD libc
Message-ID: <20120424140348.GY2358@deviant.kiev.zoral.com.ua>
In-Reply-To: <201204241343.q3ODhe2C032683@higson.cam.lispworks.com>
References: <20120423084120.GD76983@zxy.spb.ru>
 <CAPHpMu=kCwhf1RV_sYBDWDPL8368YTMLXge4L_g_F4AkTX1H5g@mail.gmail.com>
 <20120423094043.GS32749@zxy.spb.ru>
 <CAPHpMukLUeetSKpH2oiKJQ3ML_PFHEi6a0hK3_Ery=LX1YEd3g@mail.gmail.com>
 <20120423113838.GT32749@zxy.spb.ru>
 <CAPHpMumWu_aaZ4Sj5Athro6441Y%2B3_phbD2jxkKE-CdBf-Fd8g@mail.gmail.com>
 <20120423120720.GS2358@deviant.kiev.zoral.com.ua>
 <CAPHpMumh3YpB3RDD-7g5tU6thiuNA6HTuVxmt-9_OzUiEdEXzA@mail.gmail.com>
 <20120423130343.GT2358@deviant.kiev.zoral.com.ua>
 <201204241343.q3ODhe2C032683@higson.cam.lispworks.com>

On Tue, Apr 24, 2012 at 02:43:40PM +0100, Martin Simmons wrote:
> >>>>> On Mon, 23 Apr 2012 16:03:43 +0300, Konstantin Belousov said:
> >
> > On Mon, Apr 23, 2012 at 08:33:05PM +0800, Fengwei yin wrote:
> > > On Mon, Apr 23, 2012 at 8:07 PM, Konstantin Belousov
> > > <kostikbel@gmail.com> wrote:
> > > > On Mon, Apr 23, 2012 at 07:44:34PM +0800, Fengwei yin wrote:
> > > >> On Mon, Apr 23, 2012 at 7:38 PM, Slawa Olhovchenkov <slw@zxy.spb.ru> wrote:
> > > >> > On Mon, Apr 23, 2012 at 07:26:54PM +0800, Fengwei yin wrote:
> > > >> >
> > > >> >> On Mon, Apr 23, 2012 at 5:40 PM, Slawa Olhovchenkov <slw@zxy.spb.ru> wrote:
> > > >> >> > On Mon, Apr 23, 2012 at 05:32:24PM +0800, Fengwei yin wrote:
> > > >> >> >
> > > >> >> >> On Mon, Apr 23, 2012 at 4:41 PM, Slawa Olhovchenkov <slw@zxy.spb.ru> wrote:
> > > >> >> >> > On Mon, Apr 23, 2012 at 02:56:03PM +0800, Fengwei yin wrote:
> > > >> >> >> >
> > > >> >> >> >> Hi list,
> > > >> >> >> >> If this is not the correct question for the list, please let me know and
> > > >> >> >> >> sorry for the noise.
> > > >> >> >> >>
> > > >> >> >> >> I have a question regarding the BSD libc for SMP arch. I didn't see
> > > >> >> >> >> memory barriers used in libc.
> > > >> >> >> >> How can we make sure it's safe on SMP arch?
> > > >> >> >> >
> > > >> >> >> > /usr/include/machine/atomic.h:
> > > >> >> >> >
> > > >> >> >> > #define mb()    __asm __volatile("lock; addl $0,(%%esp)" : : : "memory")
> > > >> >> >> > #define wmb()   __asm __volatile("lock; addl $0,(%%esp)" : : : "memory")
> > > >> >> >> > #define rmb()   __asm __volatile("lock; addl $0,(%%esp)" : : : "memory")
> > > >> >> >> >
> > > >> >> >>
> > > >> >> >> Thanks for the information. But it looks like nobody uses it in libc.
> > > >> >> >
> > > >> >> > I think nobody in libc needs a memory barrier: libc doesn't work with
> > > >> >> > peripherals, and for atomic operations different macros are used.
> > > >> >>
> > > >> >> If we check the usage of __sinit(), it is a typical singleton pattern which
> > > >> >> needs a memory barrier to make sure there is no potential SMP issue.
> > > >> >>
> > > >> >> Or did I miss something here?
> > > >> >
> > > >> > What architecture with cache incoherency and FreeBSD support?
> > > >>
> > > >> I suppose it's not related to cache incoherency (I could be wrong).
> > > >> It's related to reordering of instructions by the CPU.
> > > >>
> > > >> Here is the link talking about why a memory barrier is needed for a singleton:
> > > >> http://www.oaklib.org/docs/oak/singleton.html
> > > >>
> > > >> x86 has a strict memory model and may not suffer from this kind of issue. But
> > > >> ARM needs to take care of it IMHO.
> > > >
> > > > Please note that __sinit is idempotent, so double-initialization is not
> > > > an issue there. The only possible problematic case would be another thread
> > > > executing exit and not noticing the non-NULL value of __cleanup while the
> > > > current thread has just set it.
> > > >
> > > > I am not sure how real this race is. Each call to __sinit() is immediately
> > > > followed by a lock acquire, typically FLOCKFILE(), which enforces full barrier
> > > > semantics due to the pthread_mutex_lock call. exit() performs the
> > > > __cxa_finalize() call before checking the __cleanup value, and
> > > > __cxa_finalize() itself locks atexit_mutex. So the race is tiny and probably
> > > > possible only for somewhat buggy applications which perform exit() while
> > > > there are stdio operations in progress.
> > > >
> > > > Also note that some functions assign to __cleanup unconditionally.
> > > >
> > > > Do you see any real issue due to non-synchronized access to __cleanup ?
> > >
> > > No. I didn't see a real issue. I am just reviewing the code.
> > >
> > > If you don't think __sinit has an issue, let's check another piece of code:
> > > line 68 in libc/stdio/fclose.c
> > > line 133 in libc/stdio/findfp.c (function __sfp())
> > >
> > > This code frees a fp slot by assigning 0 to fp->_flags. But if the
> > > instructions could be re-ordered, another CPU could see fp->_flags
> > > assigned 0 before the cleanup from line 57 to 67.
> > >
> > > Let's say another CPU is at line 133 of __sfp(): it could see fp->_flags
> > > become 0 before it is aware that the cleanup (line 57 to line 67 in
> > > libc/stdio/fclose.c) has happened.
> > >
> > > Note: the mutex of FUNLOCKFILE(fp) in line 69 of libc/stdio/fclose.c
> > > only makes sure that line 70 happens after line 68. It can't affect the
> > > re-ordering of line 57 ~ line 68 by the CPU.
> >
> > Yes, FUNLOCKFILE() there would have no effect on the potential CPU reordering
> > of the writes. But does the order of these writes matter at all ?
> >
> > Please note that __sfp() reinitializes all fields written by fclose().
> > Only if the CPU executing fclose() is allowed to reorder operations so that
> > the external effect of the _flags = 0 assignment can be observed before that
> > CPU executes other operations from fclose(), there could be a problem.
> >
> > This is definitely impossible on Intel, and I indeed do not know about
> > other architectures enough to reject such possibility. The _flags member
> > is short, so atomics cannot be used there. The easier solution, if this
> > is indeed an issue, is to lock thread_lock around the _flags = 0 assignment
> > in fclose().
>
> This can be a problem, even on Intel, because the compiler can reorder the
> stores. E.g.
> if I compile the following with gcc -O4 on amd64:
>
> struct foo { int x, y; };
>
> int foo(struct foo *p)
> {
>   int x = bar();
>   p->y = baz();
>   p->x = x;
> }
>
> then I get the following assembly language, which sets p->x before p->y:
>
>         movq    %rdi, %rbx
>         call    bar
>         movl    %eax, %ebp
>         xorl    %eax, %eax
>         call    baz
>         movl    %ebp, (%rbx)
>         movl    %eax, 4(%rbx)
>
> __Martin

Ok, as I already said, I think that the reordering is safe there.
Anyway, the change below should remove all concerns.

diff --git a/lib/libc/stdio/fclose.c b/lib/libc/stdio/fclose.c
index f0629e8..383040e 100644
--- a/lib/libc/stdio/fclose.c
+++ b/lib/libc/stdio/fclose.c
@@ -41,9 +41,12 @@ __FBSDID("$FreeBSD$");
 #include <stdio.h>
 #include <stdlib.h>
 #include "un-namespace.h"
+#include <spinlock.h>
 #include "libc_private.h"
 #include "local.h"
 
+extern spinlock_t __stdio_thread_lock;
+
 int
 fclose(FILE *fp)
 {
@@ -65,7 +68,11 @@ fclose(FILE *fp)
 	FREELB(fp);
 	fp->_file = -1;
 	fp->_r = fp->_w = 0;	/* Mess up if reaccessed. */
+	if (__isthreaded)
+		_SPINLOCK(&__stdio_thread_lock);
 	fp->_flags = 0;		/* Release this FILE for reuse. */
+	if (__isthreaded)
+		_SPINUNLOCK(&__stdio_thread_lock);
 	FUNLOCKFILE(fp);
 	return (r);
 }
diff --git a/lib/libc/stdio/findfp.c b/lib/libc/stdio/findfp.c
index 89c0536..bcd6f62 100644
--- a/lib/libc/stdio/findfp.c
+++ b/lib/libc/stdio/findfp.c
@@ -82,9 +82,9 @@ static struct glue *lastglue = &uglue;
 
 static struct glue *	moreglue(int);
 
-static spinlock_t thread_lock = _SPINLOCK_INITIALIZER;
-#define THREAD_LOCK()	if (__isthreaded) _SPINLOCK(&thread_lock)
-#define THREAD_UNLOCK()	if (__isthreaded) _SPINUNLOCK(&thread_lock)
+spinlock_t __stdio_thread_lock = _SPINLOCK_INITIALIZER;
+#define THREAD_LOCK()	if (__isthreaded) _SPINLOCK(&__stdio_thread_lock)
+#define THREAD_UNLOCK()	if (__isthreaded) _SPINUNLOCK(&__stdio_thread_lock)
 
 #if NOT_YET
 #define SET_GLUE_PTR(ptr, val)	atomic_set_rel_ptr(&(ptr), (uintptr_t)(val))
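
For comparison with the spinlock change, the ordering concern raised in the thread (the "slot is free" store becoming visible before the cleanup stores) can also be expressed with release/acquire atomics. What follows is only a minimal sketch under stated assumptions: it uses C11 <stdatomic.h> rather than the FreeBSD machine/atomic.h macros quoted earlier, and struct slot, slot_release() and slot_try_claim() are hypothetical names invented for illustration. It is not the libc code and not the change proposed in this message.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a stdio FILE slot; not the real struct __sFILE. */
struct slot {
	int		fd;	/* plays the role of fp->_file */
	int		r, w;	/* play the role of fp->_r and fp->_w */
	_Atomic short	flags;	/* 0 means the slot is free for reuse */
};

/*
 * Release a slot, in the spirit of the tail of fclose().
 * The final store uses release ordering, so neither the compiler nor the
 * CPU may move the earlier plain stores after it.
 */
static void
slot_release(struct slot *s)
{
	/* Releasing a slot that is already free would be a caller bug. */
	assert(atomic_load_explicit(&s->flags, memory_order_relaxed) != 0);

	s->fd = -1;
	s->r = s->w = 0;
	atomic_store_explicit(&s->flags, 0, memory_order_release);
}

/*
 * Try to claim a free slot, in the spirit of the scan in __sfp().
 * A successful compare-and-swap with acquire ordering that reads the 0
 * written by slot_release() synchronizes with that store, so the claiming
 * thread also observes the cleanup stores made before it.
 * Returns 1 if the slot was claimed, 0 if it was in use.
 */
static int
slot_try_claim(struct slot *s, short newflags)
{
	short expected = 0;

	assert(newflags != 0);	/* 0 is reserved for "free" */
	return (atomic_compare_exchange_strong_explicit(&s->flags, &expected,
	    newflags, memory_order_acquire, memory_order_relaxed));
}
```

A caller would invoke slot_release() once it has finished with a slot and loop over candidate slots with slot_try_claim() to find a free one. The spinlock in the diff above gets the same ordering from the lock and unlock operations without changing the type of _flags, which is presumably why it is the less intrusive change for the existing FILE layout.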