From owner-freebsd-current@FreeBSD.ORG Sun May 15 15:16:05 2011
Message-ID: <4DCFEE33.5090808@FreeBSD.org>
Date: Sun, 15 May 2011 11:16:03 -0400
From: John Baldwin
To: Andriy Gapon
References: <4DCD357D.6000109@FreeBSD.org> <4DCE9EF0.3050803@FreeBSD.org> <4DCF7CF0.1080508@FreeBSD.org> <4DCFE8FA.6080005@FreeBSD.org>
In-Reply-To: <4DCFE8FA.6080005@FreeBSD.org>
Cc: Max Laier, FreeBSD current, neel@FreeBSD.org, Peter Grehan
Subject: Re: proposed smp_rendezvous change
List-Id: Discussions about the use of FreeBSD-current

On 5/15/11 10:53 AM, Andriy Gapon wrote:
> on 15/05/2011 10:12 Andriy Gapon said the following:
>> on 14/05/2011 18:25 John Baldwin said
>> the following:
>>> Hmmm, so this is not actually sufficient. NetApp ran into a very similar race
>>> with virtual CPUs in BHyVe. In their case because virtual CPUs are threads that
>>> can be preempted, they have a chance at a longer race.
>>>
>>> The problem that they see is that even though the values have been updated, the
>>> next CPU to start a rendezvous can clear smp_rv_waiters[2] to zero before one of
>>> the other CPUs notices that it has finished.
>>
>> As a follow up to my previous question. Have you noticed that in my patch no
>> slave CPU actually waits/spins on smp_rv_waiters[2]? It's always only master
>> CPU (and under smp_ipi_mtx).
>>
>
> Here's a cleaner version of my approach to the fix.
> This one does not remove the initial wait on smp_rv_waiters[0] in
> smp_rendezvous_action() and thus does not renumber all smp_rv_waiters[] members
> and thus hopefully should be clearer.
>
> Index: sys/kern/subr_smp.c
> ===================================================================
> --- sys/kern/subr_smp.c	(revision 221943)
> +++ sys/kern/subr_smp.c	(working copy)
> @@ -110,7 +110,7 @@ static void (*volatile smp_rv_setup_func)(void *ar
>  static void (*volatile smp_rv_action_func)(void *arg);
>  static void (*volatile smp_rv_teardown_func)(void *arg);
>  static void *volatile smp_rv_func_arg;
> -static volatile int smp_rv_waiters[3];
> +static volatile int smp_rv_waiters[4];
>
>  /*
>   * Shared mutex to restrict busywaits between smp_rendezvous() and
> @@ -338,11 +338,15 @@ smp_rendezvous_action(void)
>
>  	/* spin on exit rendezvous */
>  	atomic_add_int(&smp_rv_waiters[2], 1);
> -	if (local_teardown_func == smp_no_rendevous_barrier)
> +	if (local_teardown_func == smp_no_rendevous_barrier) {
> +		atomic_add_int(&smp_rv_waiters[3], 1);
>  		return;
> +	}
>  	while (smp_rv_waiters[2] < smp_rv_ncpus)
>  		cpu_spinwait();
>
> +	atomic_add_int(&smp_rv_waiters[3], 1);
> +
>  	/* teardown function */
>  	if (local_teardown_func != NULL)
>  		local_teardown_func(local_func_arg);
> @@ -377,6 +381,9 @@ smp_rendezvous_cpus(cpumask_t map,
>  	/* obtain rendezvous lock */
>  	mtx_lock_spin(&smp_ipi_mtx);
>
> +	while (smp_rv_waiters[3] < smp_rv_ncpus)
> +		cpu_spinwait();
> +
>  	/* set static function pointers */
>  	smp_rv_ncpus = ncpus;
>  	smp_rv_setup_func = setup_func;
> @@ -385,6 +392,7 @@ smp_rendezvous_cpus(cpumask_t map,
>  	smp_rv_func_arg = arg;
>  	smp_rv_waiters[1] = 0;
>  	smp_rv_waiters[2] = 0;
> +	smp_rv_waiters[3] = 0;
>  	atomic_store_rel_int(&smp_rv_waiters[0], 0);
>
>  	/* signal other processors, which will enter the IPI with interrupts off */

Ahh, so the bump is after the change. I do think this will still be ok
and I probably just didn't explain it well to Neel. I wonder though
if the bump shouldn't happen until after the call of the local teardown
function?

-- 
John Baldwin
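
The race and the fix discussed in this thread can be replayed in a small
single-threaded model. This is a minimal sketch, not the kernel code: the
names struct rv, arrive, poll_exit, try_start and slave_gets_stuck are
hypothetical, exit_count stands in for smp_rv_waiters[2], and drained stands
in for smp_rv_waiters[3]. It assumes two participants, a teardown barrier in
use, and one fixed interleaving in which the next initiator runs before the
second CPU re-reads the exit counter. The no-barrier early return and the
question of bumping after the teardown call are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

enum { NCPU = 2 };	/* assumption: two participants per rendezvous */

/* Hypothetical model of the shared rendezvous state. */
struct rv {
	int ncpus;	/* participants in the current rendezvous */
	int exit_count;	/* models smp_rv_waiters[2] */
	int drained;	/* models smp_rv_waiters[3] */
};

/* A participant announces that it reached the exit barrier. */
static void
arrive(struct rv *r)
{
	r->exit_count++;
}

/*
 * One iteration of the exit spin loop.  Returns true if the participant
 * got past the barrier; in that case it also bumps the drain counter.
 */
static bool
poll_exit(struct rv *r)
{
	if (r->exit_count < r->ncpus)
		return (false);
	r->drained++;
	return (true);
}

/*
 * The next initiator tries to set up a new rendezvous.  With use_drain
 * set it refuses to reset the counters until every participant of the
 * previous rendezvous has left the exit barrier.
 */
static bool
try_start(struct rv *r, int ncpus, bool use_drain)
{
	if (use_drain && r->drained < r->ncpus)
		return (false);
	r->ncpus = ncpus;
	r->exit_count = 0;
	r->drained = 0;
	return (true);
}

/*
 * Replay the interleaving: both CPUs arrive, CPU 0 passes the barrier,
 * then the next initiator runs before CPU 1 has re-read the counter.
 * Returns true if CPU 1 can no longer pass the exit barrier.
 */
static bool
slave_gets_stuck(bool use_drain)
{
	struct rv r = { 0, 0, 0 };
	bool ok;

	ok = try_start(&r, NCPU, use_drain);	/* first call always starts */
	assert(ok);
	arrive(&r);				/* CPU 0 */
	arrive(&r);				/* CPU 1 */
	ok = poll_exit(&r);			/* CPU 0 sees the full count */
	assert(ok);
	(void)try_start(&r, NCPU, use_drain);	/* next initiator races in */
	return (!poll_exit(&r));		/* CPU 1 polls late */
}
```

In this model slave_gets_stuck(false) reports that CPU 1 is stuck, because the
counter it spins on was reset under it, while slave_gets_stuck(true) reports
that CPU 1 gets through, because the initiator declined to reset the counters
until the drain count reached the number of participants.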