From owner-freebsd-current@FreeBSD.ORG Mon Jul 18 11:34:15 2011 Return-Path: Delivered-To: freebsd-current@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D91B5106564A; Mon, 18 Jul 2011 11:34:15 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id DCA518FC12; Mon, 18 Jul 2011 11:34:14 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id OAA21726; Mon, 18 Jul 2011 14:34:11 +0300 (EEST) (envelope-from avg@FreeBSD.org) Message-ID: <4E241A33.2000009@FreeBSD.org> Date: Mon, 18 Jul 2011 14:34:11 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:5.0) Gecko/20110705 Thunderbird/5.0 MIME-Version: 1.0 To: Max Laier , John Baldwin References: <4DCD357D.6000109@FreeBSD.org> <4DCFE8FA.6080005@FreeBSD.org> <4DCFEE33.5090808@FreeBSD.org> <201105151209.13846.max@love2party.net> <4DCFFE52.1090002@FreeBSD.org> In-Reply-To: <4DCFFE52.1090002@FreeBSD.org> X-Enigmail-Version: 1.2pre Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Cc: FreeBSD current Subject: Re: proposed smp_rendezvous change X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jul 2011 11:34:15 -0000 on 15/05/2011 19:24 Andriy Gapon said the following: > on 15/05/2011 19:09 Max Laier said the following: >> >> I don't think we ever intended to synchronize the local teardown part, and I >> believe that is the correct behavior for this API. >> >> This version is sufficiently close to what I have, so I am resonably sure that >> it will work for us. It seems, however, that if we move to check to after >> picking up the lock anyway, the generation approach has even less impact and I >> am starting to prefer that solution. >> >> Andriy, is there any reason why you'd prefer your approach over the generation >> version? > > No reason. And I even haven't said that I prefer it :-) > I just wanted to show and explain it as apparently there was some > misunderstanding about it. I think that generation count approach could even > have a little bit better performance while perhaps being a tiny bit less obvious. Resurrecting this conversation. Here's a link to the thread on gmane for your convenience: http://article.gmane.org/gmane.os.freebsd.current/132682 It seems that we haven't fully accounted for the special case of master CPU behavior in neither my proposed fix that added a new wait count nor in the committed fix that has added a generation count. This has been pointed out by kib at the recent mini-DevSummit in Kyiv. The problem can be illustrated by the following example: - the teardown function is a "real" function; that is, not NULL and not smp_rendevous_no_ barrier (sic) - the arg parameter is a non-NULL pointer - the teardown function actually accesses memory pointed to by the arg - the master CPU deallocates memory pointed by the arg right after smp_rendezvous() call returns (either via free(9) or by stack management, etc) Quoting the code: MPASS(generation == smp_rv_generation); atomic_add_int(&smp_rv_waiters[2], 1); if (local_teardown_func != smp_no_rendevous_barrier) { while (smp_rv_waiters[2] < smp_rv_ncpus && generation == smp_rv_generation) cpu_spinwait(); if (local_teardown_func != NULL) local_teardown_func(local_func_arg); } Generation count helps slave CPUs to not get stuck at the while loop, but it doesn't prevent stale/different arg being passed to the teardown function. So here is a patch, which is a variation of my earlier proposal, that should fix this issue: http://people.freebsd.org/~avg/smp_rendezvous.diff The patch is not tested yet. The patch undoes the smp_rv_generation change and introduces a new smp_rv_waiters[] element. Perhaps the code could be optimized by making waiting on smp_rv_waiters[3] conditional on some combination of values for the teardown and the arg, but I decided to follow the KISS principle here to make the code a little bit more robust. -- Andriy Gapon