Date:      Wed, 2 Nov 2016 09:18:15 -0700
From:      Jason Harmening <jason.harmening@gmail.com>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: huge nanosleep variance on 11-stable
Message-ID:  <3620f62e-0f4c-2d62-dcf8-e2fdff459250@gmail.com>
In-Reply-To: <20161102075509.GF54029@kib.kiev.ua>
References:  <c88341e2-4c52-ed3c-a469-6446da4415f4@gmail.com> <6167392c-c37a-6e39-aa22-ca45435d6088@gmail.com> <20161102075509.GF54029@kib.kiev.ua>

On 11/02/16 00:55, Konstantin Belousov wrote:
> On Tue, Nov 01, 2016 at 02:29:13PM -0700, Jason Harmening wrote:
>> repro code is at http://pastebin.com/B68N4AFY if anyone's interested.
>>
>> On 11/01/16 13:58, Jason Harmening wrote:
>>> Hi everyone,
>>>
>>> I recently upgraded my main amd64 server from 10.3-stable (r302011) to
>>> 11.0-stable (r308099).  It went smoothly except for one big issue:
>>> certain applications (but not the system as a whole) respond very
>>> sluggishly, and video playback of any kind is extremely choppy.
>>>
>>> The system is under very light load, and I see no evidence of abnormal
>>> interrupt latency or interrupt load.  More interestingly, if I place the
>>> system under full load (~0.0% idle) the problem *disappears* and
>>> playback/responsiveness are smooth and quick.
>>>
>>> Running ktrace on some of the affected apps points me at the problem:
>>> huge variance in the amount of time spent in the nanosleep system call.
>>> A sleep of, say, 5ms might take anywhere from 5ms to ~500ms from entry
>>> to return of the syscall.  OTOH, anything CPU-bound or that waits on
>>> condvars or I/O interrupts seems to work fine, so this doesn't seem to
>>> be an issue with overall system latency.
>>>
>>> I can repro this with a simple program that just does a 3ms usleep in a
>>> tight loop (i.e. roughly the amount of time a video player would sleep
>>> between frames @ 30fps).  At light load ktrace will show the huge
>>> nanosleep variance; under heavy load every nanosleep will complete in
>>> almost exactly 3ms.
>>>
>>> FWIW, I don't see this on -current, although right now all my -current
>>> images are VMs on different HW so that might not mean anything.  I'm not
>>> aware of any recent timer- or scheduler- specific changes, so I'm
>>> wondering if perhaps the recent IPI or taskqueue changes might be
>>> somehow to blame.
>>>
>>> I'm not especially familiar w/ the relevant parts of the kernel, so any
>>> guidance on where I should focus my debugging efforts would be much
>>> appreciated.
>>>
>
> I am confident, with a very high degree of certainty, that the issue is a
> CPU bug in interaction between deep sleep states (C6) and LAPIC timer.
> Check what hardware is used for the eventtimers,
> 	sysctl kern.eventtimer.timer
> It should report LAPIC, and you should get rid of jitter with setting
> the sysctl to HPET.  Also please show the first 50 lines of the verbose
> boot dmesg.
>
> I know that the Nehalem cores are affected; I do not know whether the bug
> was fixed for Westmere.  I asked an Intel contact about the problem,
> but got no response.  It is not unreasonable, given that the CPUs are
> beyond their support time.  I intended to automatically bump HPET quality
> on Nehalem and maybe Westmere, but I was not able to check Westmere,
> and waited for more information, so this was forgotten.
> BTW, using the latest CPU microcode did not help.
>
> After I discovered this, I specifically looked at my Sandy and Haswell
> test systems, but they do not exhibit such problem.
>
> In the Intel document 320836-036US 'Intel(R) Core(TM) i7-900 Desktop
> Processor Extreme Edition Series and Intel(R) Core(TM) i7-900 Desktop
> Processor Series Specification Update', there are two errata which
> might be relevant and show the LAPIC bugs: AAJ47 (but default is to
> not use periodic mode), and AAJ121.  The 121 might be the real cause,
> but Intel does not provide enough details to understand.  And of
> course, the suggested workaround is not feasible.
>
> Googling for 'Windows LAPIC Nehalem' shows very interesting results,
> in particular,
> https://support.microsoft.com/en-us/kb/2000977 (which I think is the bug
> you see) and
> https://hardware.slashdot.org/story/09/11/28/1723257/microsoft-advice-against-nehalem-xeons-snuffed-out
> for amusement.
>

I think you are probably right.  Hacking out the Intel-specific
additions to C-state parsing in acpi_cpu_cx_cst() from r282678 (thus
going back to sti;hlt instead of monitor+mwait at C1) fixed the problem
for me.  But r282678 also had the effect of enabling C2 and C3 on my
system, because ACPI only presents MWAIT entries for those states and
not p_lvlx.

I will try switching to HPET when I have more time to test; it may be a
few days.


