From owner-freebsd-stable@FreeBSD.ORG Mon Feb 22 11:41:07 2010
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 358D6106568F
    for ; Mon, 22 Feb 2010 11:41:07 +0000 (UTC)
    (envelope-from jdc@koitsu.dyndns.org)
Received: from qmta03.emeryville.ca.mail.comcast.net
    (qmta03.emeryville.ca.mail.comcast.net [76.96.30.32])
    by mx1.freebsd.org (Postfix) with ESMTP id 1792D8FC0A
    for ; Mon, 22 Feb 2010 11:41:06 +0000 (UTC)
Received: from omta15.emeryville.ca.mail.comcast.net ([76.96.30.71])
    by qmta03.emeryville.ca.mail.comcast.net with comcast
    id kzgp1d0061Y3wxoA3zh7wN; Mon, 22 Feb 2010 11:41:07 +0000
Received: from koitsu.dyndns.org ([98.248.46.159])
    by omta15.emeryville.ca.mail.comcast.net with comcast
    id kzh61d0073S48mS8bzh6cX; Mon, 22 Feb 2010 11:41:07 +0000
Received: by icarus.home.lan (Postfix, from userid 1000)
    id 8228E1E301A; Mon, 22 Feb 2010 03:41:05 -0800 (PST)
Date: Mon, 22 Feb 2010 03:41:05 -0800
From: Jeremy Chadwick
To: freebsd-stable@freebsd.org
Message-ID: <20100222114105.GA96234@icarus.home.lan>
References: <20100217194927.e3ec60ae.torfinn.ingolfsen@broadpark.no>
    <20100217200322.da66c9f8.torfinn.ingolfsen@broadpark.no>
    <20100218205458.GA78560@server.vk2pj.dyndns.org>
    <20100218231223.ec6b9fa8.torfinn.ingolfsen@broadpark.no>
    <20100219003844.acdaa866.torfinn.ingolfsen@broadpark.no>
    <20100220015351.GB81639@server.vk2pj.dyndns.org>
    <20100220223201.178e67dd.torfinn.ingolfsen@broadpark.no>
    <20100221050823.GB22670@server.vk2pj.dyndns.org>
    <4b82483e.5OXNba8+J2F18v3D%perryh@pluto.rain.com>
    <20100222111810.GD12891@server.vk2pj.dyndns.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20100222111810.GD12891@server.vk2pj.dyndns.org>
User-Agent: Mutt/1.5.20 (2009-06-14)
Subject: Re: ntpd struggling to keep up - how to fix?
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 22 Feb 2010 11:41:07 -0000

On Mon, Feb 22, 2010 at 10:18:10PM +1100, Peter Jeremy wrote:
> On 2010-Feb-22 01:02:54 -0800, perryh@pluto.rain.com wrote:
> >Peter Jeremy wrote:
> >
> >> ... Once ntpd decides to continuously step, something is broken.
> >
> >Is there some reason why, as long as it is not yet synced, ntpd
> >should not do this sort of calculation and rate correction itself
> >rather than insist on having a human perform the calculation and
> >enter the adjustment?
>
> ntpd _does_ do this sort of calculation but the NTP algorithms
> bound the PLL adjustment to +/-500ppm. RFC1305 suggests that
> a reasonable tolerance for "board-mounted, uncompensated
> quartz-crystal oscillators" is 100ppm and therefore the +/-500ppm
> bound is reasonable (see the RFC for the gory maths).
>
> In this case, the op's clock was ~2500ppm slow - well outside
> the NTP tolerance. It was therefore necessary to change the
> nominal timecounter frequency to bring it into lock range. I
> do not believe it is reasonable for ntpd to do this by itself:
> - It should very rarely be needed since NTP should be able to
>   compensate for normal tolerances.
> - The actual local clock source and how to alter the kernel's
>   idea of its nominal frequency is outside the purview of NTP.
> - Giving ntpd free rein over the timecounter frequency runs
>   the real risk of ntpd rendering the system unusable if ntpd
>   becomes confused (or is misled) about the time.
>
> Note that FreeBSD/i386 and /amd64 include 4 different possible
> timecounters, only 3 of which can be tweaked. Other FreeBSD
> architectures will have different timecounters. Other OSs may
> have completely different mechanisms for handling the local
> clock source. Trying to embed knowledge of all these different
> clock sources into ntpd would be unrealistic.
>
> I look after over 100 assorted Unix hosts at home and work (HP
> AlphaServers and Proliants, various Sun servers, Dell and whitebox PCs
> and various laptops) and the worst driftrates I have seen previously
> are:
> - Sun T-2000 servers have a design flaw in the clock spectrum
>   spreading so it appears to be ~250ppm fast. Sun fixed this
>   with a kernel patch that increases the nominal clock frequency.
> - A Sun V20z is just over 100ppm out - I have tweaked the
>   relevant timecounter to compensate for this (to avoid triggering
>   my NTP frequency error alarms).
> - 4 assorted Sun hosts that run 55-60ppm out.
>
> At least based on my sample, the only hosts that were anywhere near
> ntpd's tolerance limits were acknowledged to have a design problem
> and the vendor provided a fix. IMO, this is a better approach than
> trying to make ntpd omniscient.

A question with regards to the latter systems you mentioned (though
I'm speaking generally and not specifically with regards to those H/W
models), as I want to make sure I understand correctly:

Under normal operation (i.e. with drift inside the +/-500ppm bound),
ntpd does "figure out" the average amount of drift on its own, and
that is what ntpd.drift is for, correct?

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
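
For anyone following the ppm arithmetic in the quoted text, here is a
rough sketch of it in Python. It is only an illustration, not anything
ntpd itself runs. The function names are made up for this example, the
86400-second measurement window is hypothetical, and 1193182 Hz is used
only as a sample nominal frequency (the thread does not say which
timecounter the op adjusted). The 500ppm bound is the figure quoted
above.

```python
def ppm_error(measured_seconds, reference_seconds):
    """Local clock frequency error in ppm; a negative value means slow."""
    return (measured_seconds - reference_seconds) / reference_seconds * 1e6


def within_ntp_bound(ppm, bound_ppm=500.0):
    """True if the error is inside the +/-500ppm bound quoted above."""
    return abs(ppm) <= bound_ppm


def corrected_nominal_hz(nominal_hz, ppm):
    """Nominal frequency the kernel would need to assume so that the
    measured error goes away (a slow clock needs a lower nominal value)."""
    return round(nominal_hz * (1 + ppm / 1e6))


if __name__ == "__main__":
    # Hypothetical measurement: the local clock advanced 86184 s while
    # a reference clock advanced 86400 s (one day).
    err = ppm_error(86184, 86400)
    print("error: %.1f ppm" % err)
    print("within ntpd bound:", within_ntp_bound(err))
    # Sample nominal frequency, assumed for illustration only.
    print("corrected nominal Hz:", corrected_nominal_hz(1193182, err))
```

With a corrected nominal frequency in place, the residual error should
fall inside the range ntpd can discipline by itself, which is the "lock
range" Peter refers to.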