Date:      Thu, 17 May 2018 12:19:57 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Andriy Gapon <avg@FreeBSD.org>
Cc:        Johannes Lundberg <johalun0@gmail.com>, freebsd-current <freebsd-current@freebsd.org>
Subject:   Re: Lag after resume culprit found
Message-ID:  <20180517091957.GF6887@kib.kiev.ua>
In-Reply-To: <4d69b9f6-9406-74ba-1780-ac783adcf107@FreeBSD.org>
References:  <CAECmPwtULDe9GGK0PhnUa7_n=zxripJj9nh5m0RTF9XqKhXKYQ@mail.gmail.com> <acaa419d-891e-96b1-7c1f-3203857c07ec@FreeBSD.org> <CAECmPwsgQhMM6zu=EfV=DQ4VHzEMuQUjD%2B45O-TP=A2U9mM8Qg@mail.gmail.com> <CAECmPwuKoQaD0M-wJagns_YCDMLy_qMnuy%2BceLF5UZtfE_1ehg@mail.gmail.com> <4d69b9f6-9406-74ba-1780-ac783adcf107@FreeBSD.org>

On Thu, May 17, 2018 at 11:06:42AM +0300, Andriy Gapon wrote:
> On 17/05/2018 10:56, Johannes Lundberg wrote:
> > 
> > 
> > On Thu, May 17, 2018 at 8:46 AM, Johannes Lundberg <johalun0@gmail.com> wrote:
> > 
> > 
> > 
> >     On Thu, May 17, 2018 at 7:43 AM, Andriy Gapon <avg@freebsd.org> wrote:
> > 
> >         On 17/05/2018 02:07, Johannes Lundberg wrote:
> >         > https://github.com/freebsd/freebsd/commit/66f063557f257baa9c8aeab9f933171eaa6e1cfa
> >         > x86 cpususpend_handler: call wbinvd after setting suspend state bits
> > 
> >         That's very interesting and surprising.
> >         That commit changes something that happens before suspend, it should
> >         not have any effect on the system state after resume.
> > 
> >         Does anyone have a theory of what could be wrong?
> > 
> > 
> >     Nope but moving
> >         CPU_CLR_ATOMIC(cpu, &suspended_cpus);
> >     back to the end of that scope fixes it.
> >
> > 
> > 
> > I did some further testing.
> > Calling
> > CPU_CLR_ATOMIC(cpu, &suspended_cpus);
> > before
> > pmap_init_pat();
> > is what "breaks" resume.
> > 
> > Is this Intel only or does it happen on AMD as well (which this patch was
> > intended for)?
> 
> Not sure about the PAT part, but fpuresume/npxresume would affect all platforms.
> It's a bit puzzling that doing PAT manipulations on one AP while another AP is
> being brought up is problematic.  Probably there is something that I am missing.

Manipulating PAT might affect cache consistency, since conflicting
caching attributes are then applied to the already-cached line that holds
the suspended_cpus variable.  It might not be the variable itself that
causes the final misbehavior, but some other data sharing the line.
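
To illustrate only the "other data sharing the line" part, here is a
minimal userland sketch.  It does not reproduce the PAT issue itself.
Everything in it is an assumption made for illustration: the 64-byte line
size, the struct and field names, and the same_cache_line() helper are
hypothetical and are not taken from the kernel sources.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, which is typical for current x86 CPUs. */
#define CACHE_LINE_SIZE 64

/* Hypothetical layout: a CPU mask packed right next to unrelated data. */
struct packed_layout {
	uint64_t suspended_mask;	/* stand-in for suspended_cpus */
	uint64_t unrelated_state;	/* neighbour within the same 64 bytes */
};

/* Same fields, but each one starts on its own cache line. */
struct padded_layout {
	alignas(CACHE_LINE_SIZE) uint64_t suspended_mask;
	alignas(CACHE_LINE_SIZE) uint64_t unrelated_state;
};

/* The padded variant pushes the neighbour exactly one line further. */
static_assert(offsetof(struct padded_layout, unrelated_state) ==
    CACHE_LINE_SIZE, "neighbour should start on the next cache line");

/* Return true if both addresses fall into the same cache line. */
static bool
same_cache_line(const void *a, const void *b)
{
	return ((uintptr_t)a / CACHE_LINE_SIZE ==
	    (uintptr_t)b / CACHE_LINE_SIZE);
}
```

With the packed layout placed on a line boundary, same_cache_line()
reports that both fields live in one line, so anything that disturbs the
line of the mask also disturbs its neighbour.  With the padded layout the
two fields land in different lines.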
