Date:      Mon, 12 Feb 2018 17:29:50 -0700
From:      Alan Somers <asomers@freebsd.org>
Cc:        Mike Pumford <michaelp@bsquare.com>, FreeBSD <freebsd-stable@freebsd.org>
Subject:   Re: Clock occasionally jumps backwards on 11.1-RELEASE
Message-ID:  <CAOtMX2jXk40O2jTSwmJQ-f=mrREzvc7jjZ4=mxBRDavwbxG5mg@mail.gmail.com>
In-Reply-To: <CAOtMX2hNxRYR0Q3WxhM64wj5bNq2-wcOoSTxn_wixMiTQPseFA@mail.gmail.com>
References:  <CAOtMX2iCkurg8HXn7KD9AbrPcDVSRN-jK4MR%2BgFMAd%2BOFEdpow@mail.gmail.com> <0b170dae-b816-ea49-3516-40bfd1deaa2a@bsquare.com> <CAOtMX2hNxRYR0Q3WxhM64wj5bNq2-wcOoSTxn_wixMiTQPseFA@mail.gmail.com>

On Tue, Jan 23, 2018 at 8:40 AM, Alan Somers <asomers@freebsd.org> wrote:

> On Tue, Jan 23, 2018 at 3:48 AM, Mike Pumford <michaelp@bsquare.com>
> wrote:
>
>> On 22/01/2018 17:07, Alan Somers wrote:
>>
>>> Since upgrading my jail server to 11.1-RELEASE, the clock occasionally
>>> jumps backwards by 5-35 minutes for no apparent reason.  Has anybody seen
>>> something like this?
>>>
>>> Details
>>> =====
>>>
>>> * Happens about once a day on my jail server, and has happened at least
>>> once on a separate bhyve server.
>>>
>>> * The jumps almost always happen between 1 and 3 AM, but I've also seen
>>> them happen at 06:30 and 20:15.
>>>
>> That's the window when the periodic scripts run, which, if you have a
>> default configuration and a lot of jails, will put the system under a lot of
>> stress.
>>
>
> That did not escape my notice.  However, none of the jails'
> periodic jobs involve the clock in any way.  And I wouldn't think that a
> high CPU load could cause clock drift, could it?  This isn't Windows XP,
> after all.
>
>
>>> * I'm using the default ntp.conf file.
>>>
>> Are you running ntpd inside the jail or on the jail host? On my jail
>> systems (which are 10.3 and 11.1) I run ntpd on the jail host (outside all
>> jails) and not inside the jails, and the jails then get accurate time because
>> the underlying host has accurate time.
>>
>
> Only on the host.
>
> New info: there is a possibility that my NFS server is hanging for
> a while.  That would explain my problem's timing.  However, ntpd shouldn't
> be accessing any NFS shares, and I wouldn't think that a hung NFS server
> should be able to pause the clock.  I'm doing a new experiment that should
> be more informative.  But I'll have to wait until the problem recurs to
> learn anything.
>

I have a little more data now.  The problem happens much more frequently
than I originally realized, but usually for just a few seconds at a time.
It looks like the system is hanging for a while and then recovering.  Or at
least, the clocks are hanging.  The only other possibility would be for
both the realtime _and_ monotonic clocks to jump backwards.  In any case,
the problem is not ntpd's fault.  I don't know what could cause a system to
hang for up to 30 minutes without crashing, and I'm not sure how to tell
unless it happens during working hours.  I'll send another update if I
learn more.
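
For anyone who wants to watch for the same symptom, here is a minimal
sketch of a sampler that compares the realtime and monotonic clocks once
per interval.  It is not the experiment described above; the names
classify and watch, the one-second interval, and the one-second tolerance
are all assumptions chosen for illustration.  Note its limit: if both
clocks freeze together during a hang, consecutive samples look normal from
inside the machine, so telling "clocks paused" apart from "both clocks
jumped backwards" needs an external time reference.

```python
import time


def classify(prev, cur, interval, tolerance=1.0):
    """Compare two (realtime, monotonic) samples taken `interval` seconds apart.

    Returns a short label describing what happened between the samples.
    The labels and the tolerance are illustrative assumptions.
    """
    d_real = cur[0] - prev[0]
    d_mono = cur[1] - prev[1]
    if d_mono < 0:
        # Should never happen on a healthy system.
        return "monotonic-backwards"
    if d_real < 0:
        # Wall clock was stepped back (or jumped back on its own).
        return "realtime-backwards"
    if d_mono > interval + tolerance:
        # The sampler was not scheduled for much longer than requested,
        # while the monotonic clock kept counting.
        return "stall"
    if abs(d_real - d_mono) > tolerance:
        # Wall clock moved forward by a different amount than the monotonic one.
        return "realtime-step"
    return "ok"


def watch(interval=1.0, count=None):
    """Sample both clocks every `interval` seconds and print anomalies.

    Runs forever when `count` is None; otherwise takes `count` samples.
    """
    prev = (time.time(), time.monotonic())
    taken = 0
    while count is None or taken < count:
        time.sleep(interval)
        cur = (time.time(), time.monotonic())
        verdict = classify(prev, cur, interval)
        if verdict != "ok":
            print(
                f"{time.ctime(cur[0])}: {verdict} "
                f"real={cur[0] - prev[0]:+.3f}s mono={cur[1] - prev[1]:+.3f}s",
                flush=True,
            )
        prev = cur
        taken += 1
```

Calling watch() with no arguments runs until interrupted; sending its
output to a local disk rather than an NFS share keeps the log from
depending on the server suspected of hanging.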

-Alan


