Date: Mon, 18 Jul 2011 12:20:32 -0400 From: John Baldwin <jhb@freebsd.org> To: Andriy Gapon <avg@freebsd.org> Cc: Max Laier <max@love2party.net>, FreeBSD current <freebsd-current@freebsd.org>, neel@freebsd.org Subject: Re: proposed smp_rendezvous change Message-ID: <201107181220.33052.jhb@freebsd.org> In-Reply-To: <4E241A33.2000009@FreeBSD.org> References: <4DCD357D.6000109@FreeBSD.org> <4DCFFE52.1090002@FreeBSD.org> <4E241A33.2000009@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Monday, July 18, 2011 7:34:11 am Andriy Gapon wrote: > on 15/05/2011 19:24 Andriy Gapon said the following: > > on 15/05/2011 19:09 Max Laier said the following: > >> > >> I don't think we ever intended to synchronize the local teardown part, and I > >> believe that is the correct behavior for this API. > >> > >> This version is sufficiently close to what I have, so I am resonably sure that > >> it will work for us. It seems, however, that if we move to check to after > >> picking up the lock anyway, the generation approach has even less impact and I > >> am starting to prefer that solution. > >> > >> Andriy, is there any reason why you'd prefer your approach over the generation > >> version? > > > > No reason. And I even haven't said that I prefer it :-) > > I just wanted to show and explain it as apparently there was some > > misunderstanding about it. I think that generation count approach could even > > have a little bit better performance while perhaps being a tiny bit less obvious. > > Resurrecting this conversation. > Here's a link to the thread on gmane for your convenience: > http://article.gmane.org/gmane.os.freebsd.current/132682 > > It seems that we haven't fully accounted for the special case of master CPU > behavior in neither my proposed fix that added a new wait count nor in the > committed fix that has added a generation count. > This has been pointed out by kib at the recent mini-DevSummit in Kyiv. > > The problem can be illustrated by the following example: > - the teardown function is a "real" function; that is, not NULL and not > smp_rendevous_no_ barrier (sic) > - the arg parameter is a non-NULL pointer > - the teardown function actually accesses memory pointed to by the arg > - the master CPU deallocates memory pointed by the arg right after > smp_rendezvous() call returns (either via free(9) or by stack management, etc) > > Quoting the code: > MPASS(generation == smp_rv_generation); > atomic_add_int(&smp_rv_waiters[2], 1); > if (local_teardown_func != smp_no_rendevous_barrier) { > while (smp_rv_waiters[2] < smp_rv_ncpus && > generation == smp_rv_generation) > cpu_spinwait(); > > if (local_teardown_func != NULL) > local_teardown_func(local_func_arg); > } > > Generation count helps slave CPUs to not get stuck at the while loop, but it > doesn't prevent stale/different arg being passed to the teardown function. > > So here is a patch, which is a variation of my earlier proposal, that should fix > this issue: > http://people.freebsd.org/~avg/smp_rendezvous.diff > The patch is not tested yet. > > The patch undoes the smp_rv_generation change and introduces a new > smp_rv_waiters[] element. Perhaps the code could be optimized by making waiting > on smp_rv_waiters[3] conditional on some combination of values for the teardown > and the arg, but I decided to follow the KISS principle here to make the code a > little bit more robust. Yes, I prefer the more robust case actually as it is clearer. This is fine with me. It would be good to get Max's feedback and possibly Neel's as well. -- John Baldwin
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201107181220.33052.jhb>