Message-ID: <5076C193.70405@FreeBSD.org>
Date: Thu, 11 Oct 2012 15:54:43 +0300
From: Alexander Motin
To: freebsd-stable@FreeBSD.org
Subject: Re: time keeps on slipping... slipping...
References: <20121008040239.GE1967@funkthat.com> <5075F9F7.1040007@FreeBSD.org> <20121011063030.GK1967@funkthat.com>
In-Reply-To: <20121011063030.GK1967@funkthat.com>

On 11.10.2012 09:30, John-Mark Gurney wrote:
> Alexander Motin wrote this message on Thu, Oct 11, 2012 at 01:43 +0300:
>> On 08.10.2012 07:02, John-Mark Gurney wrote:
>>> I recently put together a new machine w/ a SuperMicro H8SCM and an
>>> AMD Opteron 4228 HE... I'm having an issue where the clock on the
>>> machine skips around... The weird part is that it's very sudden when
>>> it happens... ntp sometimes brings it back, but it can't when the clock
>>> gets too far ahead (1000 seconds), ntp dies...
>>>
>>> In order to catch it happening, I ran a sleep 60 loop fetching time
>>> from another server that keeps time correctly via:
>>> while sleep 60; do echo -n h2:; nc h2 13; date; ntpdate h2.funkthat.com;
>>> done
>>>
>>> here are some snippets:
>>> h2:Sun Oct 7 17:12:54 2012^M
>>> Sun Oct 7 17:12:54 PDT 2012
>>> 7 Oct 17:12:54 ntpdate[31036]: the NTP socket is in use, exiting
>>> h2:Sun Oct 7 17:13:48 2012^M
>>> Sun Oct 7 17:20:21 PDT 2012
>>> 7 Oct 17:20:21 ntpdate[31045]: the NTP socket is in use, exiting
>>>
>>> but then ntp brings it back in sync:
>>> h2:Sun Oct 7 17:28:49 2012^M
>>> Sun Oct 7 17:35:21 PDT 2012
>>> 7 Oct 17:35:21 ntpdate[31164]: the NTP socket is in use, exiting
>>> h2:Sun Oct 7 17:29:49 2012^M
>>> Sun Oct 7 17:29:49 PDT 2012
>>> 7 Oct 17:29:49 ntpdate[31170]: the NTP socket is in use, exiting
>>>
>>> It happens pretty often:
>>> Oct 7 00:19:13 gold ntpd[3721]: time reset -785.347912 s
>>> Oct 7 00:46:37 gold ntpd[3721]: time reset -392.673256 s
>>> Oct 7 01:04:24 gold ntpd[3721]: time reset -785.346533 s
>>> Oct 7 15:00:59 gold ntpd[3721]: time reset -392.681720 s
>>> Oct 7 16:32:11 gold ntpd[3721]: time reset -392.671268 s
>>> Oct 7 17:29:29 gold ntpd[3721]: time reset -392.671752 s
>>> Oct 7 18:04:37 gold ntpd[3721]: time reset -785.346987 s
>>>
>>> but as you can see above, the time slip happens abruptly.. looks like
>>> a rounding error or something...
>>>
>>> I'm now reducing the sleep to 5 seconds... but as you can see the sleep
>>> ends a few seconds early and local time suddenly jumped forward 6
>>> minutes 33 seconds...
>>>
>>> $ sysctl kern.timecounter
>>> kern.timecounter.fast_gettime: 1
>>> kern.timecounter.tick: 1
>>> kern.timecounter.choice: TSC-low(1000) ACPI-safe(850) HPET(950) i8254(0)
>>> dummy(-1000000)
>>> kern.timecounter.hardware: TSC-low
>>> kern.timecounter.stepwarnings: 0
>>> kern.timecounter.tc.i8254.mask: 65535
>>> kern.timecounter.tc.i8254.counter: 11598
>>> kern.timecounter.tc.i8254.frequency: 1193182
>>> kern.timecounter.tc.i8254.quality: 0
>>> kern.timecounter.tc.HPET.mask: 4294967295
>>> kern.timecounter.tc.HPET.counter: 3257069245
>>> kern.timecounter.tc.HPET.frequency: 14318180
>>> kern.timecounter.tc.HPET.quality: 950
>>> kern.timecounter.tc.ACPI-safe.mask: 16777215
>>> kern.timecounter.tc.ACPI-safe.counter: 4219134510
>>> kern.timecounter.tc.ACPI-safe.frequency: 3579545
>>> kern.timecounter.tc.ACPI-safe.quality: 850
>>> kern.timecounter.tc.TSC-low.mask: 4294967295
>>> kern.timecounter.tc.TSC-low.counter: 2854866610
>>> kern.timecounter.tc.TSC-low.frequency: 10937740
>>> kern.timecounter.tc.TSC-low.quality: 1000
>>> kern.timecounter.smp_tsc: 1
>>> kern.timecounter.invariant_tsc: 1
>>> $ sysctl kern.eventtimer
>>> kern.eventtimer.choice: LAPIC(400) i8254(100) RTC(0)
>>> kern.eventtimer.et.LAPIC.flags: 15
>>> kern.eventtimer.et.LAPIC.frequency: 100002217
>>> kern.eventtimer.et.LAPIC.quality: 400
>>> kern.eventtimer.et.i8254.flags: 1
>>> kern.eventtimer.et.i8254.frequency: 1193182
>>> kern.eventtimer.et.i8254.quality: 100
>>> kern.eventtimer.et.RTC.flags: 17
>>> kern.eventtimer.et.RTC.frequency: 32768
>>> kern.eventtimer.et.RTC.quality: 0
>>> kern.eventtimer.periodic: 0
>>> kern.eventtimer.timer: LAPIC
>>> kern.eventtimer.activetick: 1
>>> kern.eventtimer.idletick: 0
>>> kern.eventtimer.singlemul: 2
>>>
>>> I have switched my timecounter to HPET to see if things are different...
>>>
>>> Any clues?
>>
>> The switch to HPET that you mentioned could tell a lot about the problem.
>> Switching the event timer may also be interesting.
>
> Since I switched to HPET, it hasn't happened at all in the last 3 days..

That probably points to some problem with the TSC timecounter. What is
strange to me is the jump size of 5 minutes. The TSC timecounter should
overflow every few seconds, so a single jump should be only about that big.

> Should I try switching back to TSC and switching event timer? do you
> need any other info, or want me to try anything else?

You may try that, to be sure the event timers are not related to the
problem.

> Oh, forgot to include the specific processor info in my previous
> email:
> CPU: AMD Opteron(tm) Processor 4228 HE (2800.05-MHz K8-class CPU)
> Origin = "AuthenticAMD" Id = 0x600f12 Family = 0x15 Model = 0x1 Stepping = 2
> Features=0x178bfbff
> Features2=0x1e98220b
> AMD Features=0x2e500800
> AMD Features2=0x1c9bfff,>
> TSC: P-state invariant, performance statistics

Unfortunately, I don't know the specifics of AMD processors. Maybe jkim@
or avg@ will remember something. As far as I know, the kernel should block
entry into sleep states on AMD CPUs when the LAPIC event timer is used
(which is the default). In that case I would expect the TSC to work fine
as well. But I don't know what other sources of asynchronicity there may
be.

-- 
Alexander Motin
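
As a rough cross-check of the overflow interval discussed above, the wrap period of a timecounter can be estimated as (mask + 1) / frequency. The Python sketch below uses only the mask, frequency, and ntpd reset values quoted in this thread. The function and variable names are made up for illustration, and it assumes a counter wraps after exactly mask + 1 ticks. For TSC-low as reported here this works out to roughly 392.67 seconds per wrap, and the logged resets land very close to one or two such periods. That is suggestive only, not a diagnosis.

```python
# Back-of-the-envelope check: how long does each timecounter run before its
# masked counter wraps, and how do the logged ntpd resets compare to that?
# Assumption: a counter wraps after exactly (mask + 1) ticks.

def wrap_period_seconds(mask: int, frequency_hz: int) -> float:
    """Seconds between wraps of a counter counting 0..mask at frequency_hz."""
    return (mask + 1) / frequency_hz

# (mask, frequency) pairs copied from the quoted `sysctl kern.timecounter`.
TIMECOUNTERS = {
    "TSC-low": (4294967295, 10937740),
    "HPET": (4294967295, 14318180),
    "ACPI-safe": (16777215, 3579545),
    "i8254": (65535, 1193182),
}

# Reset sizes in seconds copied from the quoted ntpd log (sign dropped).
NTPD_RESETS = [785.347912, 392.673256, 785.346533, 392.681720,
               392.671268, 392.671752, 785.346987]

def resets_in_wraps(resets, period):
    """Express each reset as a multiple of the given wrap period."""
    return [r / period for r in resets]

if __name__ == "__main__":
    for name, (mask, freq) in TIMECOUNTERS.items():
        period = wrap_period_seconds(mask, freq)
        print(f"{name:10s} wraps about every {period:.3f} s")

    tsc_period = wrap_period_seconds(*TIMECOUNTERS["TSC-low"])
    for reset, wraps in zip(NTPD_RESETS, resets_in_wraps(NTPD_RESETS, tsc_period)):
        print(f"reset {reset:.3f} s = {wraps:.4f} TSC-low wrap periods")
```

If the multiples come out near whole numbers, that would be one more reason to compare behaviour with the TSC-low and HPET timecounters as suggested above.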