From owner-freebsd-threads@FreeBSD.ORG Mon Apr 23 13:41:47 2012 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A453C106566B for ; Mon, 23 Apr 2012 13:41:47 +0000 (UTC) (envelope-from yfw.bsd@gmail.com) Received: from mail-gy0-f182.google.com (mail-gy0-f182.google.com [209.85.160.182]) by mx1.freebsd.org (Postfix) with ESMTP id 558178FC1F for ; Mon, 23 Apr 2012 13:41:47 +0000 (UTC) Received: by ghrr20 with SMTP id r20so7289280ghr.13 for ; Mon, 23 Apr 2012 06:41:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=1xxJUgwwNzSkFN43sdEZvfXGgv9aDVx6CKg8S1A0YYs=; b=K97fVtZd2fAebutljLPNtQLqnmVq1ZLJfcHpSlRQODqD/LVu3vzM5ZVQWzeByYGyOd IW3A3+TraqulWpHSiH2RY6dk6vwMcffFIjkZoaAh3T5Rl/jzUm9+Xi5TDzZSqB8oiH23 fqjwCAHYx7MZodcWsWRVmrj6iyhGsWIQLVyZ+aqXb9l1kQY3VfvDOW/asrWzBO/9Wn8d Y6ovV25VCNOdk6Nqk9YZpHmtGvtPtiiJ1KMR8G2KhBapV6KyjNP+Vzz05I6sRTjdfUTT Ino34AOKYsqcSKXdrNFTw8MfnvBDmabbapLhnH2li+dXU/nJlb3SS7njdAblnBS7CRjA hgKQ== MIME-Version: 1.0 Received: by 10.60.169.146 with SMTP id ae18mr18544067oec.36.1335188501619; Mon, 23 Apr 2012 06:41:41 -0700 (PDT) Received: by 10.60.125.135 with HTTP; Mon, 23 Apr 2012 06:41:41 -0700 (PDT) In-Reply-To: <20120423130343.GT2358@deviant.kiev.zoral.com.ua> References: <20120423084120.GD76983@zxy.spb.ru> <20120423094043.GS32749@zxy.spb.ru> <20120423113838.GT32749@zxy.spb.ru> <20120423120720.GS2358@deviant.kiev.zoral.com.ua> <20120423130343.GT2358@deviant.kiev.zoral.com.ua> Date: Mon, 23 Apr 2012 21:41:41 +0800 Message-ID: From: Fengwei yin To: Konstantin Belousov Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-threads@freebsd.org, jack.ren@intel.com Subject: Re: About the memory barrier in BSD libc X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Apr 2012 13:41:47 -0000 On Mon, Apr 23, 2012 at 9:03 PM, Konstantin Belousov wrote: > On Mon, Apr 23, 2012 at 08:33:05PM +0800, Fengwei yin wrote: >> On Mon, Apr 23, 2012 at 8:07 PM, Konstantin Belousov >> wrote: >> > On Mon, Apr 23, 2012 at 07:44:34PM +0800, Fengwei yin wrote: >> >> On Mon, Apr 23, 2012 at 7:38 PM, Slawa Olhovchenkov = wrote: >> >> > On Mon, Apr 23, 2012 at 07:26:54PM +0800, Fengwei yin wrote: >> >> > >> >> >> On Mon, Apr 23, 2012 at 5:40 PM, Slawa Olhovchenkov wrote: >> >> >> > On Mon, Apr 23, 2012 at 05:32:24PM +0800, Fengwei yin wrote: >> >> >> > >> >> >> >> On Mon, Apr 23, 2012 at 4:41 PM, Slawa Olhovchenkov wrote: >> >> >> >> > On Mon, Apr 23, 2012 at 02:56:03PM +0800, Fengwei yin wrote: >> >> >> >> > >> >> >> >> >> Hi list, >> >> >> >> >> If this is not correct question on the list, please let me k= now and >> >> >> >> >> sorry for noise. >> >> >> >> >> >> >> >> >> >> I have a question regarding the BSD libc for SMP arch. I did= n't see >> >> >> >> >> memory barrier used in libc. >> >> >> >> >> How can we make sure it's safe on SMP arch? >> >> >> >> > >> >> >> >> > /usr/include/machine/atomic.h: >> >> >> >> > >> >> >> >> > #define mb() =A0 =A0__asm __volatile("lock; addl $0,(%%esp)" = : : : "memory") >> >> >> >> > #define wmb() =A0 __asm __volatile("lock; addl $0,(%%esp)" : = : : "memory") >> >> >> >> > #define rmb() =A0 __asm __volatile("lock; addl $0,(%%esp)" : = : : "memory") >> >> >> >> > >> >> >> >> >> >> >> >> Thanks for the information. But it looks no body use it in libc= . >> >> >> > >> >> >> > I think no body in libc need memory barrier: libc don't work wit= h >> >> >> > peripheral, for atomic opertions used different macros. >> >> >> >> >> >> If we check the usage of __sinit(), it is a typical singleton patt= ern which >> >> >> needs memory barrier to make sure no potential SMP issue. >> >> >> >> >> >> Or did I miss something here? >> >> > >> >> > What architecture with cache incoherency and FreeBSD support? >> >> >> >> I suppose it's not related with cache inchoherency (I could be wrong)= . >> >> It's related >> >> with reorder of instruction by CPU. >> >> >> >> Here is the link talking about why need memory barrier for singleton: >> >> http://www.oaklib.org/docs/oak/singleton.html >> >> >> >> x86 has strict memory model and may not suffer this kind of issue. Bu= t >> >> ARM need to >> >> take care of it IMHO. >> > >> > Please note that __sinit is idempotent, so double-initialization is no= t >> > an issue there. The only possible problematic case would be other thre= ad >> > executing exit and not noticing non-NULL value for __cleanup while cur= rent >> > thread just set it. >> > >> > I am not sure how much real this race is. Each call to _sinit() is imm= ediately >> > followed by a lock acquire, typically FLOCKFILE(), which enforces full= barrier >> > semantic due to pthread_mutex_lock call. The exit() performs __cxa_fin= alize() >> > call before checking __cleanup value, and __cxa_finalize() itself lock= s >> > atexit_mutex. So the race is tiny and probably possible only for somew= hat >> > buggy applications which perform exit() while there are stdio operatio= ns >> > in progress. >> > >> > Also note that some functions assign to __cleanup unconditionally. >> > >> > Do you see any real issue due to non-synchronized access to __cleanup = ? >> >> No. I didn't see real issue. I am just reviewing the code. >> >> If you don't think __sinit has issue, let's check another code: >> =A0 =A0 =A0line 68 in libc/stdio/fclose.c >> =A0 =A0 =A0line 133 in libc/stdio/findfp.c (function __sfp()) >> >> Which is trying to free a fp slot by assign 0 to fp->_flags. But if >> the instrucation >> could be re-ordered, another CPU could see fp->_flags is assigned to 0 >> before the >> cleanup from line 57 to 67. >> >> Let's say, if another CPU is in line 133 of __sfp(), it could see >> fp->_flags become >> 0 before it's aware of the cleanup (Line 57 to line 67 in >> libc/stdio/fclose.c) happen. >> >> Note: the mutex of FUNLOCKFILE(fp) in line 69 of libc/stdio/fclose.c >> just could make sure >> line 70 happen after line 68. It can't impact the re-order of line 57 >> ~ line 68 by CPU. > > Yes, FUNLOCKFILE() there would have no effect on the potential CPU reorde= ring > of the writes. =A0But does the order of these writes matter at all ? > > Please note that __sfp() reinitializes all fields written by fclose(). > Only if CPU executing fclose() is allowed to reorder operations so that > the external effect of _flags =3D 0 assignment can be observed before tha= t > CPU executes other operations from fclose(), there could be a problem. > > This is definitely impossible on Intel, and I indeed do not know about > other architectures enough to reject such possibility. The _flags member > is short, so atomics cannot be used there. The easier solution, if this > is indeed an issue, is to lock thread_lock around _flags =3D 0 assignment > in fclose(). For Intel x86, I suppose it's not a problem because of its memory model. For IA64, ARM, it could be an issue in theory. For the solution: I suppose memory barrier is enough for it.