Date: Mon, 22 Feb 2010 22:18:10 +1100
From: Peter Jeremy <peterjeremy@acm.org>
To: perryh@pluto.rain.com
Cc: freebsd-stable@freebsd.org
Subject: Re: ntpd struggling to keep up - how to fix?
Message-ID: <20100222111810.GD12891@server.vk2pj.dyndns.org>
In-Reply-To: <4b82483e.5OXNba8%2BJ2F18v3D%perryh@pluto.rain.com>
References: <20100212174452.2140cd72.torfinn.ingolfsen@broadpark.no>
 <20100217194927.e3ec60ae.torfinn.ingolfsen@broadpark.no>
 <20100217200322.da66c9f8.torfinn.ingolfsen@broadpark.no>
 <20100218205458.GA78560@server.vk2pj.dyndns.org>
 <20100218231223.ec6b9fa8.torfinn.ingolfsen@broadpark.no>
 <20100219003844.acdaa866.torfinn.ingolfsen@broadpark.no>
 <20100220015351.GB81639@server.vk2pj.dyndns.org>
 <20100220223201.178e67dd.torfinn.ingolfsen@broadpark.no>
 <20100221050823.GB22670@server.vk2pj.dyndns.org>
 <4b82483e.5OXNba8%2BJ2F18v3D%perryh@pluto.rain.com>

On 2010-Feb-22 01:02:54 -0800, perryh@pluto.rain.com wrote:
>Peter Jeremy <peterjeremy@acm.org> wrote:
>
>> ... Once ntpd decides to continuously step, something is broken.
>
>Is there some reason why, as long as it is not yet synced, ntpd
>should not do this sort of calculation and rate correction itself
>rather than insist on having a human perform the calculation and
>enter the adjustment?

ntpd _does_ do this sort of calculation but the NTP algorithms bound
the PLL adjustment to +/-500ppm.  RFC1305 suggests that a reasonable
tolerance for "board-mounted, uncompensated quartz-crystal
oscillators" is 100ppm and therefore the +/-500ppm bound is
reasonable (see the RFC for the gory maths).

In this case, the OP's clock was ~2500ppm slow - well outside the
NTP tolerance.  It was therefore necessary to change the nominal
timecounter frequency to bring it into lock range.  I do not believe
it is reasonable for ntpd to do this by itself:
- It should very rarely be needed since NTP should be able to
  compensate for normal tolerances.
- The actual local clock source and how to alter the kernel's idea
  of its nominal frequency is outside the purview of NTP.
- Giving ntpd free rein over the timecounter frequency runs the
  real risk of ntpd rendering the system unusable if ntpd becomes
  confused (or is misled) about the time.

Note that FreeBSD/i386 and /amd64 include 4 different possible
timecounters, only 3 of which can be tweaked.  Other FreeBSD
architectures will have different timecounters.  Other OSs may have
completely different mechanisms for handling the local clock source.
Trying to embed knowledge of all these different clock sources into
ntpd would be unrealistic.
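
To put numbers on the +/-500ppm bound and the ~2500ppm error above,
here is a rough sketch of the arithmetic.  It is only an illustration:
the function names are made up, and the 3579545 Hz figure is the
standard ACPI power-management timer rate, used here as an assumed
example since the thread does not say which timecounter or nominal
frequency the OP's machine had.

```python
SECONDS_PER_DAY = 86_400
NTP_MAX_PPM = 500.0  # PLL adjustment bound quoted in the message


def drift_seconds_per_day(error_ppm):
    """Seconds gained (+) or lost (-) per day for a given frequency error."""
    return error_ppm * SECONDS_PER_DAY / 1e6


def within_ntp_range(error_ppm):
    """True if the error is small enough for ntpd to slew out by itself."""
    return abs(error_ppm) <= NTP_MAX_PPM


def corrected_nominal_hz(nominal_hz, error_ppm):
    """Nominal frequency that would cancel the observed error.

    error_ppm is negative for a slow clock: the oscillator really ticks
    slower than the kernel assumes, so the nominal value must be lowered.
    """
    return round(nominal_hz * (1 + error_ppm / 1e6))


if __name__ == "__main__":
    assumed_nominal_hz = 3_579_545   # assumption: ACPI PM timer rate
    observed_error_ppm = -2500       # "~2500ppm slow" from the message

    print(drift_seconds_per_day(observed_error_ppm))   # -216.0
    print(within_ntp_range(observed_error_ppm))        # False
    print(corrected_nominal_hz(assumed_nominal_hz, observed_error_ppm))
```

A 2500ppm error is 216 seconds a day, five times what the PLL is
allowed to correct.  Once the nominal frequency has been adjusted by
hand, the residual error is small and ntpd can discipline the clock
normally.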

I look after over 100 assorted Unix hosts at home and work (HP
AlphaServers and ProLiants, various Sun servers, Dell and whitebox
PCs and various laptops) and the worst drift rates I have seen
previously are:
- Sun T-2000 servers have a design flaw in the clock spectrum
  spreading so it appears to be ~250ppm fast.  Sun fixed this with a
  kernel patch that increases the nominal clock frequency.
- A Sun V20z is just over 100ppm out - I have tweaked the relevant
  timecounter to compensate for this (to avoid triggering my NTP
  frequency error alarms).
- 4 assorted Sun hosts that run 55-60ppm out.

At least based on my sample, the only hosts that were anywhere near
ntpd's tolerance limits were acknowledged to have a design problem
and the vendor provided a fix.  IMO, this is a better approach than
trying to make ntpd omniscient.

--
Peter Jeremy