Date: Thu, 12 May 2011 06:55:58 -0400
From: John Baldwin <jhb@FreeBSD.org>
To: Stanislav Sedov <stas@FreeBSD.org>, neel@FreeBSD.org
Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org, Jung-uk Kim <jkim@FreeBSD.org>, Andriy Gapon <avg@FreeBSD.org>
Subject: Re: svn commit: r221703 - in head/sys: amd64/include i386/include x86/isa x86/x86
Message-ID: <4DCBBCBE.5020004@FreeBSD.org>
In-Reply-To: <20110512035522.e42b379c.stas@FreeBSD.org>
References: <201105091734.p49HY0P3006180@svn.freebsd.org> <20110512024956.996cd973.stas@FreeBSD.org> <4DCBB9EE.8070809@FreeBSD.org> <20110512035522.e42b379c.stas@FreeBSD.org>

On 5/12/11 6:55 AM, Stanislav Sedov wrote:
> On Thu, 12 May 2011 13:43:58 +0300
> Andriy Gapon<avg@FreeBSD.org> mentioned:
>
>>
>> Theory:
>> - smp_rv_waiters[2] becomes equal to smp_rv_ncpus
>> - [at least] one slave CPU is still in the last call to cpu_spinwait() in
>> smp_rendezvous_action()
>> - master CPU notices that the condition is true, exits smp_rendezvous_cpus() and
>> calls it again
>> - the slave CPU is still in spinwait
>> - the master CPU resets smp_rv_waiters[2] to zero
>> - the slave CPU exits spinwait, see smp_rv_waiters[2] with zero value
>> - endless loop
>>
>
> That might explain it.
> Do you have a patch for me to try?
>
> Thanks!
>

The NetApp folks working on BHyVe also ran into this.  They have a fix
that I think sounds reasonable which is to add a generation count to
the smp rendezvous "structure" and have waiting CPUs stop waiting if the
generation count changes.

-- 
John Baldwin
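
The generation-count idea can be sketched as a single-threaded replay of the
interleaving from the quoted theory.  This is only an illustration under
assumptions: struct rendezvous, rv_start(), rv_poll_plain(), rv_poll_gen() and
rv_replay() are hypothetical names made up for this sketch, not the kernel
code and not the NetApp patch.  Only smp_rv_waiters[2] and smp_rv_ncpus come
from the thread, and they appear here as plain struct fields.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the rendezvous state. */
struct rendezvous {
	int ncpus;		/* plays the role of smp_rv_ncpus */
	int exit_waiters;	/* plays the role of smp_rv_waiters[2] */
	unsigned generation;	/* assumed new field: bumped per rendezvous */
};

/* Master side: start a new rendezvous, resetting the exit counter. */
static void
rv_start(struct rendezvous *rv, int ncpus)
{
	assert(ncpus > 0);
	rv->ncpus = ncpus;
	rv->exit_waiters = 0;
	rv->generation++;
}

/* One poll of the exit loop as in the theory: counter check only. */
static bool
rv_poll_plain(const struct rendezvous *rv)
{
	return (rv->exit_waiters >= rv->ncpus);
}

/* One poll with the proposed extra condition: also leave if the
 * generation differs from the one seen on entry. */
static bool
rv_poll_gen(const struct rendezvous *rv, unsigned my_gen)
{
	return (rv->exit_waiters >= rv->ncpus || rv->generation != my_gen);
}

/* Replay the problematic interleaving for one slow slave CPU and report
 * whether each variant of the poll would let it leave the loop. */
static void
rv_replay(bool *plain_leaves, bool *gen_leaves)
{
	struct rendezvous rv = { 0, 0, 0 };
	unsigned my_gen;

	rv_start(&rv, 2);		/* first rendezvous begins */
	my_gen = rv.generation;		/* slave snapshots the generation */
	rv.exit_waiters = 2;		/* counter reaches ncpus while the
					 * slave is still in spinwait */
	rv_start(&rv, 2);		/* master returns and starts again,
					 * resetting the counter to zero */

	*plain_leaves = rv_poll_plain(&rv);	/* slave finally re-reads */
	*gen_leaves = rv_poll_gen(&rv, my_gen);
}
```

Calling rv_replay(&plain, &gen) leaves plain false (the slave sees a zero
counter and keeps spinning) and gen true (the changed generation lets it
stop waiting).  A real implementation would also need atomic accesses and
memory barriers, which this sketch leaves out.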