Date: Sun, 15 May 2011 10:10:13 +0300
From: Andriy Gapon <avg@FreeBSD.org>
To: John Baldwin <jhb@FreeBSD.org>
Cc: FreeBSD current <freebsd-current@FreeBSD.org>, Peter Grehan <grehan@FreeBSD.org>
Subject: Re: proposed smp_rendezvous change
Message-ID: <4DCF7C55.3030404@FreeBSD.org>
In-Reply-To: <4DCE9EF0.3050803@FreeBSD.org>
References: <4DCD357D.6000109@FreeBSD.org> <4DCE9EF0.3050803@FreeBSD.org>

on 14/05/2011 18:25 John Baldwin said the following:
> On 5/13/11 9:43 AM, Andriy Gapon wrote:
>>
>> This is a change in vein of what I've been doing in the xcpu branch and it's
>> supposed to fix the issue by the recent commit that (probably unintentionally)
>> stress-tests smp_rendezvous in TSC code.
>>
>> Non-essential changes:
>> - ditch initial, and in my opinion useless, pre-setup rendezvous in
>> smp_rendezvous_action()
>
> As long as IPIs ensure all data is up to date (I think this is certainly true on
> x86) that is fine. Presumably sending an IPI has an implicit store barrier on
> all other platforms as well?

Well, one certainly can use IPIs as a memory barrier, but my point was that we
have other ways to get a memory barrier, and using an IPI for that was not
necessary (and a little bit harmful to performance) in this case.

>> Essential changes (the fix):
>> - re-use freed smp_rv_waiters[2] to indicate that a slave/target is really fully
>> done with rendezvous (i.e. it's not going to access any members of smp_rv_*
>> pseudo-structure)
>> - spin on smp_rv_waiters[2] upon _entry_ to smp_rendezvous_cpus() to not re-use
>> the smp_rv_* pseudo-structure too early
>
> Hmmm, so this is not actually sufficient. NetApp ran into a very similar race
> with virtual CPUs in BHyVe. In their case because virtual CPUs are threads that
> can be preempted, they have a chance at a longer race.

Just a quick question - have you noticed that, because of the change above, the
smp_rv_waiters[2] of which I spoke is not the same smp_rv_waiters[2] as in the
original code? Because I "removed" smp_rv_waiters[0], smp_rv_waiters[2] is
actually some new smp_rv_waiters[3].

And well, I think I described exactly the same scenario as you did in my email
on the svn mailing list. So of course I had it in mind:
http://www.mail-archive.com/svn-src-all@freebsd.org/msg38637.html

My problem: I should not have mixed different changes into the same patch, for
clarity. I should have provided two patches: one that adds smp_rv_waiters[3] and
its handling, and one that "removes" smp_rv_waiters[0].

I would like to see my proposed patch actually tested, if possible, before it's
dismissed :-)

-- 
Andriy Gapon
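
The "spin on entry" idea discussed in this message can be modeled in user space.
What follows is only a minimal sketch under stated assumptions, not the proposed
kernel patch: threads stand in for CPUs, a generation counter stands in for the
IPI, and the names rv_action, rv_ncpus, rv_generation, rv_done, rendezvous() and
run_demo() are all hypothetical. It also assumes a single initiating thread,
whereas the kernel serializes initiators by other means. The point it tries to
show is that the initiator waits, on entry, until every target of the previous
call has made its last access to the shared state before that state is reused.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define NSLAVES 4    /* hypothetical number of target "CPUs" (threads) */
#define ROUNDS  200  /* number of back-to-back rendezvous calls */

/* Stand-ins for the shared smp_rv_* pseudo-structure (names are assumptions). */
static void (*rv_action)(void);
static int rv_ncpus;              /* 0 until the first rendezvous is posted */
static atomic_int rv_generation;  /* models sending the IPI */
static atomic_int rv_done;        /* models the extra "fully done" waiters slot */

static atomic_int counter;        /* visible side effect of the action */

static void bump(void) { atomic_fetch_add(&counter, 1); }

/* Wait until every target of the previous call has let go of rv_*. */
static void rv_wait_done(void)
{
    while (atomic_load_explicit(&rv_done, memory_order_acquire) < rv_ncpus)
        sched_yield();
}

/* Single initiator assumed; it does not wait for completion on exit. */
static void rendezvous(void (*action)(void))
{
    rv_wait_done();  /* spin on entry, before any field is reused */
    rv_action = action;
    rv_ncpus = NSLAVES;
    atomic_store_explicit(&rv_done, 0, memory_order_relaxed);
    /* The release increment publishes the fields above to the targets. */
    atomic_fetch_add_explicit(&rv_generation, 1, memory_order_release);
}

static void *slave(void *arg)
{
    (void)arg;
    for (int gen = 1; gen <= ROUNDS; gen++) {
        while (atomic_load_explicit(&rv_generation, memory_order_acquire) < gen)
            sched_yield();
        rv_action();
        /* Last touch of rv_*: after this the initiator may overwrite it. */
        atomic_fetch_add_explicit(&rv_done, 1, memory_order_release);
    }
    return NULL;
}

/* Runs ROUNDS rendezvous calls; returns the total number of actions run,
 * or -1 if a thread could not be created. Intended to be called once. */
static int run_demo(void)
{
    pthread_t t[NSLAVES];

    for (int i = 0; i < NSLAVES; i++)
        if (pthread_create(&t[i], NULL, slave, NULL) != 0)
            return -1;
    for (int r = 0; r < ROUNDS; r++)
        rendezvous(bump);
    rv_wait_done();  /* drain the final round */
    for (int i = 0; i < NSLAVES; i++)
        pthread_join(t[i], NULL);
    return atomic_load(&counter);
}
```

If the entry spin in rendezvous() were removed, the initiator could overwrite
rv_action and reset rv_done while a slow target from the previous round was still
using them, which is the kind of premature reuse the entry spin is meant to rule
out. Build with -pthread and call run_demo() from a main() of your own.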