From: Andriy Gapon
Date: Sun, 15 May 2011 17:53:46 +0300
To: John Baldwin, Peter Grehan, Max Laier
Cc: FreeBSD current
Subject: Re: proposed smp_rendezvous change
Message-ID: <4DCFE8FA.6080005@FreeBSD.org>
In-Reply-To: <4DCF7CF0.1080508@FreeBSD.org>
References: <4DCD357D.6000109@FreeBSD.org> <4DCE9EF0.3050803@FreeBSD.org> <4DCF7CF0.1080508@FreeBSD.org>

on 15/05/2011 10:12 Andriy Gapon said the following:
> on 14/05/2011 18:25 John Baldwin said the following:
>> Hmmm, so this is not actually sufficient.  NetApp ran into a very similar race
>> with virtual CPUs in BHyVe.  In their case because virtual CPUs are threads that
>> can be preempted, they have a chance at a longer race.
>>
>> The problem that they see is that even though the values have been updated, the
>> next CPU to start a rendezvous can clear smp_rv_waiters[2] to zero before one of
>> the other CPUs notices that it has finished.
>
> As a follow up to my previous question.  Have you noticed that in my patch no
> slave CPU actually waits/spins on smp_rv_waiters[2]?  It's always only master
> CPU (and under smp_ipi_mtx).
>

Here's a cleaner version of my approach to the fix.
This one does not remove the initial wait on smp_rv_waiters[0] in
smp_rendezvous_action() and thus does not renumber all smp_rv_waiters[] members
and thus hopefully should be clearer.

Index: sys/kern/subr_smp.c
===================================================================
--- sys/kern/subr_smp.c	(revision 221943)
+++ sys/kern/subr_smp.c	(working copy)
@@ -110,7 +110,7 @@ static void (*volatile smp_rv_setup_func)(void *ar
 static void (*volatile smp_rv_action_func)(void *arg);
 static void (*volatile smp_rv_teardown_func)(void *arg);
 static void *volatile smp_rv_func_arg;
-static volatile int smp_rv_waiters[3];
+static volatile int smp_rv_waiters[4];
 
 /*
  * Shared mutex to restrict busywaits between smp_rendezvous() and
@@ -338,11 +338,15 @@ smp_rendezvous_action(void)
 
 	/* spin on exit rendezvous */
 	atomic_add_int(&smp_rv_waiters[2], 1);
-	if (local_teardown_func == smp_no_rendevous_barrier)
+	if (local_teardown_func == smp_no_rendevous_barrier) {
+		atomic_add_int(&smp_rv_waiters[3], 1);
 		return;
+	}
 	while (smp_rv_waiters[2] < smp_rv_ncpus)
 		cpu_spinwait();
 
+	atomic_add_int(&smp_rv_waiters[3], 1);
+
 	/* teardown function */
 	if (local_teardown_func != NULL)
 		local_teardown_func(local_func_arg);
@@ -377,6 +381,9 @@ smp_rendezvous_cpus(cpumask_t map,
 	/* obtain rendezvous lock */
 	mtx_lock_spin(&smp_ipi_mtx);
 
+	while (smp_rv_waiters[3] < smp_rv_ncpus)
+		cpu_spinwait();
+
 	/* set static function pointers */
 	smp_rv_ncpus = ncpus;
 	smp_rv_setup_func = setup_func;
@@ -385,6 +392,7 @@ smp_rendezvous_cpus(cpumask_t map,
 	smp_rv_func_arg = arg;
 	smp_rv_waiters[1] = 0;
 	smp_rv_waiters[2] = 0;
+	smp_rv_waiters[3] = 0;
 	atomic_store_rel_int(&smp_rv_waiters[0], 0);
 
 	/* signal other processors, which will enter the IPI with interrupts off */

-- 
Andriy Gapon
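
For readers who want to experiment with the idea outside the kernel, the hand-off in the patch can be modelled roughly in user space. This is only a sketch under stated assumptions, not the kernel code: pthreads stand in for CPUs, a generation counter stands in for the IPI, C11 atomics replace atomic_add_int(), and there is a single master thread, so nothing plays the role of smp_ipi_mtx. All names (rv_entered, rv_done, rv_demo and so on) are hypothetical. rv_entered corresponds to smp_rv_waiters[2] and rv_done to the new smp_rv_waiters[3].

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define NWORKERS 3                 /* hypothetical number of "slave CPUs" */

static atomic_int  rv_generation;  /* bumped by the master; stands in for the IPI */
static atomic_int  rv_ncpus;       /* plays the role of smp_rv_ncpus */
static atomic_int  rv_entered;     /* plays the role of smp_rv_waiters[2] */
static atomic_int  rv_done;        /* plays the role of smp_rv_waiters[3] */
static atomic_int  rv_stop;        /* tells the workers to exit */
static atomic_long rv_actions;     /* number of times the action body ran */

/* Runs on every participant, master included. */
static void rendezvous_action(void)
{
	atomic_fetch_add(&rv_actions, 1);

	/* Exit rendezvous: wait until every participant got this far. */
	atomic_fetch_add(&rv_entered, 1);
	while (atomic_load(&rv_entered) < atomic_load(&rv_ncpus))
		sched_yield();

	/* From here on this thread no longer reads the shared state. */
	atomic_fetch_add(&rv_done, 1);
}

/* Master side: drain the previous round before reusing the shared state. */
static void rendezvous(int ncpus)
{
	while (atomic_load(&rv_done) < atomic_load(&rv_ncpus))
		sched_yield();

	atomic_store(&rv_ncpus, ncpus);
	atomic_store(&rv_entered, 0);
	atomic_store(&rv_done, 0);
	atomic_fetch_add(&rv_generation, 1);   /* "send the IPI" */

	rendezvous_action();
}

static void *worker(void *arg)
{
	int seen = 0;                  /* last generation this worker handled */

	(void)arg;
	for (;;) {
		while (atomic_load(&rv_generation) == seen) {
			if (atomic_load(&rv_stop))
				return NULL;
			sched_yield();
		}
		seen++;
		rendezvous_action();
	}
}

/* Runs `rounds` back-to-back rendezvous; returns how often the action ran. */
long rv_demo(int rounds)
{
	pthread_t tid[NWORKERS];
	int i;

	/* Reset the shared state so the demo can be called more than once. */
	atomic_store(&rv_generation, 0);
	atomic_store(&rv_ncpus, 0);
	atomic_store(&rv_entered, 0);
	atomic_store(&rv_done, 0);
	atomic_store(&rv_stop, 0);
	atomic_store(&rv_actions, 0);

	for (i = 0; i < NWORKERS; i++)
		pthread_create(&tid[i], NULL, worker, NULL);

	for (i = 0; i < rounds; i++)
		rendezvous(NWORKERS + 1);

	/* Drain the last round, then let the workers exit. */
	while (atomic_load(&rv_done) < atomic_load(&rv_ncpus))
		sched_yield();
	atomic_store(&rv_stop, 1);
	for (i = 0; i < NWORKERS; i++)
		pthread_join(tid[i], NULL);

	return atomic_load(&rv_actions);
}
```

Calling rv_demo(200) from a small main() and building with -pthread should report rounds * (NWORKERS + 1) action calls. Removing the wait on rv_done at the top of rendezvous() reintroduces the window described in the quoted text: the master can zero rv_entered while a preempted worker is still spinning on it, and in this model the run may then hang.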