Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 11 Oct 2012 15:54:43 +0300
From:      Alexander Motin <mav@FreeBSD.org>
To:        freebsd-stable@FreeBSD.org
Subject:   Re: time keeps on slipping... slipping...
Message-ID:  <5076C193.70405@FreeBSD.org>
In-Reply-To: <20121011063030.GK1967@funkthat.com>
References:  <20121008040239.GE1967@funkthat.com> <5075F9F7.1040007@FreeBSD.org> <20121011063030.GK1967@funkthat.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On 11.10.2012 09:30, John-Mark Gurney wrote:
> Alexander Motin wrote this message on Thu, Oct 11, 2012 at 01:43 +0300:
>> On 08.10.2012 07:02, John-Mark Gurney wrote:
>>> I recently put together a new machine w/ a SuperMicro H8SCM and an
>>> AMD Opteron 4228 HE...  I've having an issue where the clock on the
>>> machine skips around...  The wierd part is that it's very sudden when
>>> it happens...  ntp sometimes brings it back, but it can't when the clock
>>> gets too far ahread (1000 seconds), ntp dies...
>>>
>>> In order to catch it happening, I ran a sleep 60 loop fetching time
>> >from another server that keeps time correctly via:
>>> while sleep 60; do echo -n h2:; nc h2 13; date; ntpdate h2.funkthat.com;
>>> done
>>>
>>> here are some snippits:
>>> h2:Sun Oct  7 17:12:54 2012^M
>>> Sun Oct  7 17:12:54 PDT 2012
>>>   7 Oct 17:12:54 ntpdate[31036]: the NTP socket is in use, exiting
>>> h2:Sun Oct  7 17:13:48 2012^M
>>> Sun Oct  7 17:20:21 PDT 2012
>>>   7 Oct 17:20:21 ntpdate[31045]: the NTP socket is in use, exiting
>>>
>>> but then ntp brings it back in sync:
>>> h2:Sun Oct  7 17:28:49 2012^M
>>> Sun Oct  7 17:35:21 PDT 2012
>>>   7 Oct 17:35:21 ntpdate[31164]: the NTP socket is in use, exiting
>>> h2:Sun Oct  7 17:29:49 2012^M
>>> Sun Oct  7 17:29:49 PDT 2012
>>>   7 Oct 17:29:49 ntpdate[31170]: the NTP socket is in use, exiting
>>>
>>> It happens pretty often:
>>> Oct  7 00:19:13 gold ntpd[3721]: time reset -785.347912 s
>>> Oct  7 00:46:37 gold ntpd[3721]: time reset -392.673256 s
>>> Oct  7 01:04:24 gold ntpd[3721]: time reset -785.346533 s
>>> Oct  7 15:00:59 gold ntpd[3721]: time reset -392.681720 s
>>> Oct  7 16:32:11 gold ntpd[3721]: time reset -392.671268 s
>>> Oct  7 17:29:29 gold ntpd[3721]: time reset -392.671752 s
>>> Oct  7 18:04:37 gold ntpd[3721]: time reset -785.346987 s
>>>
>>> but as you can see above, the time slip happens abruptly.. looks like
>>> a rounding error or something...
>>>
>>> I'm now reducing the sleep to 5 seconds... but as you can see the sleep
>>> ends a few seconds early and local time suddenly jumped forward 6
>>> minutes 33 seconds...
>>>
>>> $ sysctl kern.timecounter
>>> kern.timecounter.fast_gettime: 1
>>> kern.timecounter.tick: 1
>>> kern.timecounter.choice: TSC-low(1000) ACPI-safe(850) HPET(950) i8254(0)
>>> dummy(-1000000)
>>> kern.timecounter.hardware: TSC-low
>>> kern.timecounter.stepwarnings: 0
>>> kern.timecounter.tc.i8254.mask: 65535
>>> kern.timecounter.tc.i8254.counter: 11598
>>> kern.timecounter.tc.i8254.frequency: 1193182
>>> kern.timecounter.tc.i8254.quality: 0
>>> kern.timecounter.tc.HPET.mask: 4294967295
>>> kern.timecounter.tc.HPET.counter: 3257069245
>>> kern.timecounter.tc.HPET.frequency: 14318180
>>> kern.timecounter.tc.HPET.quality: 950
>>> kern.timecounter.tc.ACPI-safe.mask: 16777215
>>> kern.timecounter.tc.ACPI-safe.counter: 4219134510
>>> kern.timecounter.tc.ACPI-safe.frequency: 3579545
>>> kern.timecounter.tc.ACPI-safe.quality: 850
>>> kern.timecounter.tc.TSC-low.mask: 4294967295
>>> kern.timecounter.tc.TSC-low.counter: 2854866610
>>> kern.timecounter.tc.TSC-low.frequency: 10937740
>>> kern.timecounter.tc.TSC-low.quality: 1000
>>> kern.timecounter.smp_tsc: 1
>>> kern.timecounter.invariant_tsc: 1
>>> $ sysctl kern.eventtimer
>>> kern.eventtimer.choice: LAPIC(400) i8254(100) RTC(0)
>>> kern.eventtimer.et.LAPIC.flags: 15
>>> kern.eventtimer.et.LAPIC.frequency: 100002217
>>> kern.eventtimer.et.LAPIC.quality: 400
>>> kern.eventtimer.et.i8254.flags: 1
>>> kern.eventtimer.et.i8254.frequency: 1193182
>>> kern.eventtimer.et.i8254.quality: 100
>>> kern.eventtimer.et.RTC.flags: 17
>>> kern.eventtimer.et.RTC.frequency: 32768
>>> kern.eventtimer.et.RTC.quality: 0
>>> kern.eventtimer.periodic: 0
>>> kern.eventtimer.timer: LAPIC
>>> kern.eventtimer.activetick: 1
>>> kern.eventtimer.idletick: 0
>>> kern.eventtimer.singlemul: 2
>>>
>>> I have switched my timecounter to HPET to see if things are different...
>>>
>>> Any clues?
>>
>> Mentioned switching to HPET could tell a lot about the problem.
>> Switching event timer also may be interesting.
>
> Since I switch to HPET, it hasn't happened at all in the last 3 days..

That is probably tells about some problems with TSC timecounter. What is 
strange to me is time jump size of 5 minutes. TSC timecounter should 
overflow each few seconds, so single jump should be just that big.

> Should I try switching back to TSC and switching event timer? do you
> need any other info, or want me to try anything else?

You may try to do it to be sure eventtimers are not related to the case.

> Oh, forgot to include the specific processor info in my previous
> email:
> CPU: AMD Opteron(tm) Processor 4228 HE               (2800.05-MHz K8-class CPU)
>    Origin = "AuthenticAMD"  Id = 0x600f12  Family = 0x15  Model = 0x1  Stepping = 2
>    Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
>    Features2=0x1e98220b<SSE3,PCLMULQDQ,MON,SSSE3,CX16,SSE4.1,SSE4.2,POPCNT,AESNI,XSAVE,OSXSAVE,AVX>
>    AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
>    AMD Features2=0x1c9bfff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS,XOP,SKINIT,WDT,LWP,FMA4,NodeId,Topology,<b23>,<b24>>
>    TSC: P-state invariant, performance statistics

Unfortunately, I don't know AMD processors specifics. May be jkim@ or 
avg@ may remember something. As far as I know, kernel should block enter 
sleep states on AMD CPUs when LAPIC eventtimer is used (by default). In 
such case I guess TSC should also work fine. But I don't know what other 
possible sources of asynchronicity may be there.

-- 
Alexander Motin



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?5076C193.70405>