From owner-freebsd-stable@freebsd.org Wed Nov 2 16:11:21 2016
Subject: Re: huge nanosleep variance on 11-stable
To: Konstantin Belousov
Cc: freebsd-stable@freebsd.org
From: Jason Harmening
Message-ID: <3620f62e-0f4c-2d62-dcf8-e2fdff459250@gmail.com>
Date: Wed, 2 Nov 2016 09:18:15 -0700
In-Reply-To: <20161102075509.GF54029@kib.kiev.ua>
References: <6167392c-c37a-6e39-aa22-ca45435d6088@gmail.com>
 <20161102075509.GF54029@kib.kiev.ua>
List-Id: Production branch of FreeBSD source code

On 11/02/16 00:55, Konstantin Belousov wrote:
> On Tue, Nov 01, 2016 at 02:29:13PM -0700, Jason Harmening wrote:
>> repro code is at http://pastebin.com/B68N4AFY if anyone's interested.
>>
>> On 11/01/16 13:58, Jason Harmening wrote:
>>> Hi everyone,
>>>
>>> I recently upgraded my main amd64 server from 10.3-stable (r302011) to
>>> 11.0-stable (r308099).  It went smoothly except for one big issue:
>>> certain applications (but not the system as a whole) respond very
>>> sluggishly, and video playback of any kind is extremely choppy.
>>>
>>> The system is under very light load, and I see no evidence of abnormal
>>> interrupt latency or interrupt load.  More interestingly, if I place the
>>> system under full load (~0.0% idle) the problem *disappears* and
>>> playback/responsiveness are smooth and quick.
>>>
>>> Running ktrace on some of the affected apps points me at the problem:
>>> huge variance in the amount of time spent in the nanosleep system call.
>>> A sleep of, say, 5ms might take anywhere from 5ms to ~500ms from entry
>>> to return of the syscall.  OTOH, anything CPU-bound or that waits on
>>> condvars or I/O interrupts seems to work fine, so this doesn't seem to
>>> be an issue with overall system latency.
>>>
>>> I can repro this with a simple program that just does a 3ms usleep in a
>>> tight loop (i.e. roughly the amount of time a video player would sleep
>>> between frames @ 30fps).  At light load ktrace will show the huge
>>> nanosleep variance; under heavy load every nanosleep will complete in
>>> almost exactly 3ms.
>>>
>>> FWIW, I don't see this on -current, although right now all my -current
>>> images are VMs on different HW so that might not mean anything.  I'm not
>>> aware of any recent timer- or scheduler-specific changes, so I'm
>>> wondering if perhaps the recent IPI or taskqueue changes might be
>>> somehow to blame.
>>>
>>> I'm not especially familiar w/ the relevant parts of the kernel, so any
>>> guidance on where I should focus my debugging efforts would be much
>>> appreciated.
>>>
>
> I am confident, with a very high degree of certainty, that the issue is a
> CPU bug in the interaction between deep sleep states (C6) and the LAPIC
> timer.  Check what hardware is used for the eventtimers:
>     sysctl kern.eventtimer.timer
> It should report LAPIC, and you should get rid of the jitter by setting
> the sysctl to HPET.  Also, please show the first 50 lines of the verbose
> boot dmesg.
>
> I know that the Nehalem cores are affected; I do not know whether the bug
> was fixed for Westmere or not.  I asked an Intel contact about the problem,
> but got no response.  That is not unreasonable, given that the CPUs are
> beyond their support time.  I intended to automatically bump the HPET
> quality on Nehalem and maybe Westmere, but I was not able to check
> Westmere, and waited for more information, so this was forgotten.
> BTW, using the latest CPU microcode did not help.
>
> After I discovered this, I specifically looked at my Sandy and Haswell
> test systems, but they do not exhibit this problem.
>
> In the Intel document 320836-036US 'Intel(R) Core(TM) i7-900 Desktop
> Processor Extreme Edition Series and Intel(R) Core(TM) i7-900 Desktop
> Processor Series Specification Update', there are two errata which
> might be relevant and show the LAPIC bugs: AAJ47 (but the default is to
> not use periodic mode), and AAJ121.  AAJ121 might be the real cause,
> but Intel does not provide enough detail to understand it.  And of
> course, the suggested workaround is not feasible.
>
> Googling for 'Windows LAPIC Nehalem' shows very interesting results,
> in particular,
> https://support.microsoft.com/en-us/kb/2000977 (which I think is the bug
> you see) and
> https://hardware.slashdot.org/story/09/11/28/1723257/microsoft-advice-against-nehalem-xeons-snuffed-out
> for amusement.
>

I think you are probably right.  Hacking out the Intel-specific additions
to C-state parsing in acpi_cpu_cx_cst() from r282678 (thus going back to
sti;hlt instead of monitor+mwait at C1) fixed the problem for me.  But
r282678 also had the effect of enabling C2 and C3 on my system, because
ACPI only presents MWAIT entries for those states and not p_lvlx.

I will try switching to HPET when I have more time to test; may be a few
days.