From owner-freebsd-current@FreeBSD.ORG Tue May 17 13:58:18 2011 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 790731065672; Tue, 17 May 2011 13:58:18 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 35D058FC0A; Tue, 17 May 2011 13:58:18 +0000 (UTC) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id C2BD646B06; Tue, 17 May 2011 09:58:17 -0400 (EDT) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 557528A04F; Tue, 17 May 2011 09:58:17 -0400 (EDT) From: John Baldwin To: Andriy Gapon Date: Tue, 17 May 2011 09:58:16 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110325; KDE/4.5.5; amd64; ; ) References: <4DCD357D.6000109@FreeBSD.org> <4DD26256.2070008@FreeBSD.org> <4DD27C3A.3040509@FreeBSD.org> In-Reply-To: <4DD27C3A.3040509@FreeBSD.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201105170958.16847.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.6 (bigwig.baldwin.cx); Tue, 17 May 2011 09:58:17 -0400 (EDT) Cc: Max Laier , FreeBSD current , neel@freebsd.org, Peter Grehan Subject: Re: proposed smp_rendezvous change X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 May 2011 13:58:18 -0000 On Tuesday, May 17, 2011 9:46:34 am Andriy Gapon wrote: > on 17/05/2011 14:56 John Baldwin said the following: > > On 5/17/11 4:03 AM, Andriy Gapon wrote: > >> Couldn't [Shouldn't] the whole: > >> > >>>>> /* Ensure we have up-to-date values. */ > >>>>> atomic_add_acq_int(&smp_rv_waiters[0], 1); > >>>>> while (smp_rv_waiters[0]< smp_rv_ncpus) > >>>>> cpu_spinwait(); > >> > >> be just replaced with: > >> > >> rmb(); > >> > >> Or a proper MI function that does just a read memory barrier, if rmb() is not that. > > > > No, you could replace it with: > > > > atomic_add_acq_int(&smp_rv_waiters[0], 1); > > What about > (void)atomic_load_acq(&smp_rv_waiters[0]); > > In my opinion that should ensure that the hardware must post the latest value from > a master CPU to memory of smp_rv_waiters[0] and a slave CPU gets it from there. > And also, because of memory barriers inserted by store_rel on the master CPU and > load_acq on the slave CPU, the latest values of all other smp_rv_* fields should > become visible to the slave CPU. No, it doesn't quite work that way. It wouldn't work on Alpha for example. All load_acq is a load with a memory barrier to order other loads after it. It is still free to load stale data. Only a read-modify-write operation would actually block until it could access an up-to-date value. > > > The key being that atomic_add_acq_int() will block (either in hardware or > > software) until it can safely perform the atomic operation. That means waiting > > until the write to set smp_rv_waiters[0] to 0 by the rendezvous initiator is > > visible to the current CPU. > > > > On some platforms a write by one CPU may not post instantly to other CPUs (e.g. it > > may sit in a store buffer). That is fine so long as an attempt to update that > > value atomically (using cas or a conditional-store, etc.) fails. For those > > platforms, the atomic(9) API is required to spin until it succeeds. > > > > This is why the mtx code spins if it can't set MTX_CONTESTED for example. > > > > Thank you for the great explanation! > Taking sparc64 as an example, I think that atomic_load_acq uses a degenerate cas > call, which should take care of hardware synchronization. sparc64's load_acq() is stronger than the MI effect of load_acq(). On ia64 which uses ld.acq or Alpha (originally) which uses a membar and simple load, the guarantees are only what I stated above (and would not be sufficient). Note that Alpha borrowed heavily from MIPS, and the MIPS atomic implementation is mostly identical to the old Alpha one (using conditional stores, etc.). The MIPS atomic_load_acq(): #define ATOMIC_STORE_LOAD(WIDTH) \ static __inline uint##WIDTH##_t \ atomic_load_acq_##WIDTH(__volatile uint##WIDTH##_t *p) \ { \ uint##WIDTH##_t v; \ \ v = *p; \ mips_sync(); \ return (v); \ } \ -- John Baldwin