From owner-freebsd-hackers@freebsd.org  Wed Oct 28 16:50:13 2015
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id A38D1A1FFE0
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Wed, 28 Oct 2015 16:50:13 +0000 (UTC) (envelope-from avg@FreeBSD.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
 by mx1.freebsd.org (Postfix) with ESMTP id 3E8A11B5E;
 Wed, 28 Oct 2015 16:50:12 +0000 (UTC) (envelope-from avg@FreeBSD.org)
Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua
 [212.40.38.100])
 by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id SAA10967;
 Wed, 28 Oct 2015 18:50:04 +0200 (EET) (envelope-from avg@FreeBSD.org)
Received: from localhost ([127.0.0.1])
 by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
 id 1ZrTvI-000IvY-18; Wed, 28 Oct 2015 18:50:04 +0200
Subject: Re: instability of timekeeping
To: Konstantin Belousov <kostikbel@gmail.com>,
 Alexander Motin <mav@FreeBSD.org>
References: <56261398.60102@FreeBSD.org> <56261FE6.90302@FreeBSD.org>
 <56274FFC.2000608@FreeBSD.org> <20151021184850.GX2257@kib.kiev.ua>
 <562F3E2F.2010100@FreeBSD.org> <20151027115810.GU2257@kib.kiev.ua>
 <562F8109.4050203@FreeBSD.org> <20151027140403.GB2257@kib.kiev.ua>
Cc: freebsd-hackers <freebsd-hackers@FreeBSD.org>,
 Poul-Henning Kamp <phk@phk.freebsd.dk>, Jung-uk Kim <jkim@FreeBSD.org>
From: Andriy Gapon <avg@FreeBSD.org>
X-Enigmail-Draft-Status: N1110
Message-ID: <5630FC3B.2070908@FreeBSD.org>
Date: Wed, 28 Oct 2015 18:47:55 +0200
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:38.0) Gecko/20100101
 Thunderbird/38.3.0
MIME-Version: 1.0
In-Reply-To: <20151027140403.GB2257@kib.kiev.ua>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 28 Oct 2015 16:50:13 -0000

On 27/10/2015 16:04, Konstantin Belousov wrote:
> On Tue, Oct 27, 2015 at 03:50:01PM +0200, Andriy Gapon wrote:
>> On 27/10/2015 13:58, Konstantin Belousov wrote:
>>> On Tue, Oct 27, 2015 at 11:04:47AM +0200, Andriy Gapon wrote:
>>>> And now another observation. I have C1E option enabled in BIOS. It
>>>> means that if all cores enter C1 state, then the whole processor
>>>> is magically placed into a deep C-state (C3, I think). LAPIC timer
>>>> on this CPU model does not run in the deep C-state. So, I had to
>>>> disable C1E option to test the LAPIC timer in a useful way. But
>>>> before actually testing it I first tried to reproduce the problem. As
>>>> you might have already guessed the problem is gone with that option
>>>> disabled. Scratching my head to understand the implications of this
>>>> observation.
>>>
>>> Most obvious explanation would be that the latency of wakeup is very large.
>>> What is the HPET frequency when the jitter occur ?
>>>
>>
>> kern.timecounter.tc.TSC-low.frequency: 1607351869
>> kern.eventtimer.et.HPET.frequency: 14318180
>>
>> Or did you mean the actual rate of timer interrupts?
> 
> Actual rate, and I used the wrong word.  Since most likely your eventtimer
> is not periodic, I mean the next timer interrupt deadline.
> 

I wasn't sure how to answer your question, so I went ahead and obtained a little
bit of debugging information using KTR.
For a start, here is a patch that adds couple more trace points:
https://people.freebsd.org/~avg/timekeeping-ktr.patch
Please note that the patch has a small bug in hardclock_cnt():
+			CTR1(KTR_SPARE2, "hardclock_cnt at %d: newticks=%d",
+			    newticks);
Two format specifiers but only one argument.  I mention this, so that the trace
is not misinterpreted.

Then, here is a snippet of the trace that seems to be of interest:
https://people.freebsd.org/~avg/timekeeping.ktrdump.txt
I must admit that I have a hard time interpreting what that trace says.
As I understand the KTR timestamps are raw TSC values. So, there is a huge gap
between entries #999543 and #999544.  It's larger than 2 * 2^32, so even after
applying tsc_shift it's still larger that 2^32.

I am not sure how the HPET timer could get programmed for such a long delay.
Or maybe there is some lower level problem that caused the interrupt to be
delayed by (16450821032024 - 16442184278422) / 2 / 1607351869 ≈ 2.687 seconds.
That's quite a long time.

One thing that I can tell is that my earlier hypothesis seems to be wrong as all
raw TSC values are increasing.

Grepping for lines where the timer gets reprogrammed I was surprised to see the
following:
999543   1   16442184278422 load at 1:    next 5042.d25692b3 eq 0
999507   3   16442183843960 load at 3:    next 5042.d23e2146 eq 0
999503   2   16442183825488 load at 2:    next 5042.d25692b3 eq 0
999472   0   16442183792870 load at 0:    next 5042.d23e2146 eq 0
This looks like the next event time has flipped between those two values somehow.

Also, the following seemed a little bit strange:
999508   3   16442183860624 ipi  at 3:    now  5042.d2301033
999467   3   16442183787851 active at 3:  now  5042.d2306108
The later even reports a value of 'now' that is smaller.
I think that that can be explained by the fact that in one case the value is
obtained via sbinuptime() while in the other case it's taken from state->now,
which is set when an actual timer interrupt is received (before sending a
hardclock IPI).

In either case I am going to add a few more trace points in et_start and the
HPET timer code and see if I can catch anything interesting there.

-- 
Andriy Gapon