Date: Sun, 12 Jul 2015 12:48:49 -0600
From: Ian Lepore <ian@freebsd.org>
To: Peter Jeremy <peter@rulingia.com>
Cc: freebsd-stable@freebsd.org
Subject: Re: Will 10.2 also ship with a very stale NTP?
Message-ID: <1436726929.1334.202.camel@freebsd.org>
In-Reply-To: <20150712183140.GB22240@server.rulingia.com>
References: <20150710235810.GA76134@rwpc16.gfn.riverwillow.net.au>
 <20150712032256.GB19305@satori.lan>
 <20150712050443.GA22240@server.rulingia.com>
 <20150712154416.b9f3713893fe28bfab1dd4d7@dec.sakura.ne.jp>
 <CAGMYy3vKEUCD=Ssxt%2B2Vny4eQ7CNQHTxNKncyQnRk5dPQU6ZtA@mail.gmail.com>
 <20150712184910.2d8d5f085ae659d5b9a2aba0@dec.sakura.ne.jp>
 <1436715703.1334.193.camel@freebsd.org>
 <20150712183140.GB22240@server.rulingia.com>

On Mon, 2015-07-13 at 04:31 +1000, Peter Jeremy wrote:
> On 2015-Jul-12 09:41:43 -0600, Ian Lepore <ian@freebsd.org> wrote:
> >And let's all just hope that a week or two of testing is enough when
> >jumping a major piece of software forward several years in its
> >independent evolution.
>
> Whilst I support John's desire for NTP to be updated, I also do not
> think this is the appropriate time to do so.  That said, the final
> decision is up to re@.
>
> >The import of 4.2.8p2 several months ago resulted in complete failure of
> >timekeeping on all my arm systems.  Just last week I tracked it down to
> >a kernel bug (which I haven't committed the fix for yet).  While the bug
> >has been in the kernel for years, it took a small change in ntpd
> >behavior to trigger it.
> >
> >Granted it's an odd corner-case problem that won't affect most users
> >because they just use the stock ntp.conf file (and it only affects
> >systems that have a large time step due to no battery-backed clock).
> >But it took me weeks to find enough time to track down the cause of the
> >problem.
>
> I'm not using the stock ntp.conf on my RPis and didn't notice any NTP
> issues.  Are you able to provide more details of either the ntp.conf
> options that trigger the bug or the kernel bug itself?  A quick search
> failed to find anything.
>

I just committed the kernel fix as r285424; the commit message has some
info on why the new ntpd made the problem visible.

I should have said "stock rc.conf and ntp.conf"...  To get the problem
to happen you've got to set ntpd_sync_on_start=NO in rc.conf and still
allow ntpd to make a large step (-g without -q, or tinker panic 0).  I
don't remember why I had sync on start disabled on most of my arm
systems (probably a one-time experiment that I forgot to undo and it
got copied around), but I suspect most people who don't have battery
clocks will have it set to yes, and that's why nobody else saw this
problem.
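
For anyone trying to reproduce it, the combination described above would
look roughly like the rc.conf fragment below.  This is only a sketch: the
ntpd_enable line and the exact contents of ntpd_flags are assumptions for
illustration, not copied from an affected system, and setting ntpd_flags
this way replaces whatever flags your system used before.

```sh
# /etc/rc.conf fragment (illustrative sketch, not from a real system)
ntpd_enable="YES"          # assumption: ntpd is started from rc.conf
ntpd_sync_on_start="NO"    # the setting named in the message above

# Allow one large initial step: -g without -q.
# Assumption: passing it via ntpd_flags; this overrides any existing flags.
ntpd_flags="-g"

# Alternative to -g, set in ntp.conf rather than here:
#   tinker panic 0
```

Per the description above, systems with ntpd_sync_on_start set to YES
should not hit the problem.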

To me, the problem was mainly illustrative of how a tiny innocuous
change (ntpd making a series of ntp_adjtime() calls in a different, but
still correct, order than it used to) can expose a completely unexpected
longstanding bug in our code.  Gotta wonder if any more of those are
lurking. :/

-- Ian