Date: Thu, 17 Jan 2013 13:32:15 +0200
From: Daniel Braniss <danny@cs.huji.ac.il>
To: "Ronald Klop" <ronald-freebsd8@klop.yi.org>
Cc: freebsd-stable@freebsd.org
Subject: Re: time issues and some more
Message-ID: <E1Tvnhf-00064g-8K@kabab.cs.huji.ac.il>
In-Reply-To: <op.wq1vjkyv8527sy@ronaldradial.versatec.local>
References: <E1TvPZ7-000NC7-5C@kabab.cs.huji.ac.il> <op.wq0mrtuy8527sy@212-182-167-131.ip.telfort.nl> <E1TvlIV-00013s-Rz@kabab.cs.huji.ac.il> <op.wq1vjkyv8527sy@ronaldradial.versatec.local>

> On Thu, 17 Jan 2013 09:58:07 +0100, Daniel Braniss <danny@cs.huji.ac.il>
> wrote:
>
> >> On Wed, 16 Jan 2013 10:45:49 +0100, Daniel Braniss <danny@cs.huji.ac.il>
> >> wrote:
> >>
> >> > I recently upgraded a Dell PowerEdge R710 to 9.1-stable. We mainly
> >> > use it as a backup to several zfs servers (doing send|receive),
> >> > without major issues till the upgrade; it was running 8.2-stable.
> >> >
> >> > now, we see that sometimes the time drifts, and today I noticed that
> >> > it was hung, and once I got onto the ipmi console this is what I got:
> >> > [SOL Session operational. Use ~? for help]
> >> > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288
> >> >
> >> > and things started moving again,
> >> >
> >> > in /var/log/messages:
> >> > Jan 16 03:27:35 store-02 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288
> >> >
> >> > but the REAL time is 7 hours ahead!, so time stood still?
> >> > and now, of course, we get:
> >> > Jan 16 03:54:19 store-02 ntpd[38163]: time correction of 25216 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
> >> >
> >> > I will now reboot, and try a newer kernel and check, but any insight
> >> > will be very helpful,
> >> >
> >> > thanks,
> >> > danny
> >>
> >> Does BSD 9 choose another timer source than BSD 8?
> >> Use sysctl to check these values on your system.
> >> kern.eventtimer.choice: LAPIC(400) i8254(100) RTC(0)
> >> kern.eventtimer.timer: LAPIC
> >>
> >> Or these ones. I always confuse these.
> >> kern.timecounter.choice: TSC-low(1000) ACPI-fast(900) i8254(0) dummy(-1000000)
> >> kern.timecounter.hardware: TSC-low
> >>
> >
> > under 8.3 it's kern.timecounter, so this is what I get:
> >
> >> sysctl kern.timecounter
> > kern.timecounter.tick: 1
> > kern.timecounter.choice: TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) dummy(-1000000)
> > kern.timecounter.hardware: ACPI-fast
> > kern.timecounter.stepwarnings: 0
> > kern.timecounter.tc.i8254.mask: 65535
> > kern.timecounter.tc.i8254.counter: 52515
> > kern.timecounter.tc.i8254.frequency: 1193182
> > kern.timecounter.tc.i8254.quality: 0
> > kern.timecounter.tc.ACPI-fast.mask: 16777215
> > kern.timecounter.tc.ACPI-fast.counter: 925448
> > kern.timecounter.tc.ACPI-fast.frequency: 3579545
> > kern.timecounter.tc.ACPI-fast.quality: 1000
> > kern.timecounter.tc.HPET.mask: 4294967295
> > kern.timecounter.tc.HPET.counter: 1472869277
> > kern.timecounter.tc.HPET.frequency: 14318180
> > kern.timecounter.tc.HPET.quality: 900
> > kern.timecounter.tc.TSC.mask: 4294967295
> > kern.timecounter.tc.TSC.counter: 4125922088
> > kern.timecounter.tc.TSC.frequency: 2329838875
> > kern.timecounter.tc.TSC.quality: -100
> > kern.timecounter.smp_tsc: 0
> > kern.timecounter.invariant_tsc: 1
> >
> > so I assume the choice is HPET, under 9.1:
> > kern.eventtimer.timer: HPET
>
> Your server uses:
> > kern.timecounter.hardware: ACPI-fast
>
> Please check that value on 9.1 and 8.3.
>

they both choose the same, ACPI-fast

>
> >
> > so it seems to be the same.
> >
> > btw, this morning I see that it's behind more than 1 hour, and no signs
> > of ntpd!
> >
> > the logs show:
> > ...
> > Jan 17 00:40:52 store-02 kernel: usb_dev_suspend_peer: Setting device remote wakeup failed
> > Jan 17 01:05:46 store-02 ntpd[1845]: time correction of 7854 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
> > ...
> >
> > it seems to me that the 7854 seconds is exactly the time diff:
> > date on this host says:
> > Thu Jan 17 08:46:18 IST 2013
> >
> >
> > adding the 7854 sec gives the current (almost) real date:
> > Thu Jan 17 10:57:13 IST 2013
> >
> > something is very fishy here.
> >
>
> Are you doing suspend/resume stuff on your machine? Or does
> usb_dev_suspend_peer mean suspend in another way?

not that I know, but the prev. time it complained about something else:
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288

Since I have other such boxes -without the problem-, my bet is on mfdi/zfs

danny
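
The figures quoted in this thread can be cross-checked with a few lines of Python. This is only a sketch, not part of the original exchange: the helper names (best_timecounter, gap_seconds, as_hms) are made up for illustration, and it assumes the kernel prefers the entry with the highest quality value in kern.timecounter.choice, which agrees with the 8.3 output quoted above (ACPI-fast at 1000, and kern.timecounter.hardware: ACPI-fast). The zone name is dropped before parsing because both timestamps are in the same zone, so the difference is unaffected.

```python
import re
from datetime import datetime


def best_timecounter(choice: str) -> str:
    """Return the name with the highest quality in a kern.timecounter.choice value.

    Assumption: the kernel picks the highest-quality counter by default.
    """
    pairs = re.findall(r"([\w-]+)\((-?\d+)\)", choice)
    return max(pairs, key=lambda pair: int(pair[1]))[0]


def gap_seconds(wrong: str, right: str) -> int:
    """Seconds between two date(1)-style stamps that share the same time zone."""
    fmt = "%a %b %d %H:%M:%S %Y"

    def parse(stamp: str) -> datetime:
        parts = stamp.split()
        del parts[4]  # drop the zone name (e.g. "IST"); it is the same on both sides
        return datetime.strptime(" ".join(parts), fmt)

    return int((parse(right) - parse(wrong)).total_seconds())


def as_hms(seconds: int) -> tuple:
    """Split a second count into (hours, minutes, seconds)."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return hours, minutes, secs


if __name__ == "__main__":
    # Values copied from the messages quoted above.
    print(best_timecounter(
        "TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) dummy(-1000000)"))
    print(gap_seconds("Thu Jan 17 08:46:18 IST 2013",
                      "Thu Jan 17 10:57:13 IST 2013"))  # 7855
    print(as_hms(25216))  # (7, 0, 16)
```

The computed gap of 7855 seconds is within one second of the 7854 seconds ntpd reported, which fits the "almost" in the message, and 25216 seconds is 7 hours and 16 seconds, matching the "7 hours ahead" observation from the first incident.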