From owner-freebsd-hackers@FreeBSD.ORG Tue Jul 25 12:31:48 2006
From: "Michael Scheidell"
Date: Tue, 25 Jul 2006 08:31:47 -0400
Subject: RE: FBSD 5.5 and software timers

> -----Original Message-----
> From: Steve Watt [mailto:steve@Watt.COM]
> Sent: Monday, July 24, 2006 6:28 PM
> To: Michael Scheidell
> Cc: hackers@freebsd.com
> Subject: Re: FBSD 5.5 and software timers

> It sounds like ntpd isn't really synchronizing. If you keep
> an eye on associations (my finger-memory command is 'ntpq -c
> peer -c assoc -c rv') over time, you may notice that your
> machine never decides to synchronize. Especially interesting
> is the 'condition' column in the 'assoc' output.

Ntp hasn't changed from 5.4 to 5.5; it still acts the same. The ntp
versions look the same; the only thing that changed is the kernel.

ntpdc -c peers (same command) does show the * on one of them, so I
assume it is synchronizing.

Also note, this happens across 20 different computers (all running ntp),
but only the 5.5-built systems have a problem with nanosleep().

I guess it depends on the design reason: did FreeBSD decide to change
how nanosleep() works in 5.5, or is this a bug?

On 5.4, if you set a 200ms sleep:
nanosleep(200*1000)
it triggered reasonably close to 200ms, no matter whether the wallclock
went forward or back.

On 5.5, if at 12:32.20000000 you set a 200ms sleep and ntpd then moves
the wallclock back 9 seconds, nanosleep() expires in 9 seconds plus
200ms.

Here is a 5.5 system: 2.0GHz, no HTT, ntpd started with -x. If I started
it without the -x, offsets would be in the -2 or +1 area.

ntpq -c peer -c assoc -c rv

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 LOCAL(0)        LOCAL(0)        10 l    6   64  377    0.000    0.000   0.004
+a77.coleman.edu 204.152.184.72   2 u  668 1024  377  119.466  -35277. 422.506
*meow.febo.com   192.168.1.230    2 u  319 1024  377   87.445  -35443. 436.095
+216-228-12-34.d 216.218.192.202  2 u  649 1024  377  143.816  -35295. 441.571

ind assID status  conf reach auth condition  last_event cnt
===========================================================
  1 14084  9014   yes   yes  none    reject   reachable  1
  2 14085  b414   yes   yes  none  candidat   reachable  1
  3 14086  b614   yes   yes  none  sys.peer   reachable  1
  4 14087  b414   yes   yes  none  candidat   reachable  1

status=0684 leap_none, sync_ntp, 8 events, event_peer/strat_chg,
version="ntpd 4.2.0-a Wed Jul 5 17:49:24 EDT 2006 (1)",
processor="i386", system="FreeBSD/5.5-RELEASE-p2", leap=00, stratum=3,
precision=-18, rootdelay=88.208, rootdispersion=35861.579, peer=14086,
refid=24.123.66.139,
reftime=c87086c3.b381f74c Tue, Jul 25 2006 8:02:11.701, poll=10,
clock=c8708802.e65819f6 Tue, Jul 25 2006 8:07:30.899, state=4,
offset=-35343.172, frequency=422.004, jitter=462.419, stability=0.119

Does anyone know if this is a design change in how nanosleep() handles
the wallclock? Does 6.1 do this? If 6.1 works, does this mean that 6.2
will break it also?

What about POSIX 'real time' timers? Would I have better luck with them
across OS versions?

>
> As for whys, it rather depends on what you see from the ntpq commands.
>

Could still be, but there seems to be no way to fix it. This isn't a
hardware problem: it's 20 different computers, IBM 300, 305, 306's,
P4 2.8 and 2.0's, Dell 750's and 850's with P4 2.8's with HTT.
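
One way to check the nanosleep() behaviour described above from user space is to time the sleep against a clock that does not follow wall-clock steps. This is only a minimal sketch in C, assuming CLOCK_MONOTONIC is available on the system under test. The helper names ts_to_ms and measure_sleep_ms are made up for illustration and are not from any FreeBSD source. Note that nanosleep() takes a struct timespec, so 200ms is tv_sec = 0 and tv_nsec = 200 * 1000 * 1000.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Convert a timespec to milliseconds (hypothetical helper). */
static double ts_to_ms(const struct timespec *ts)
{
    return (double)ts->tv_sec * 1000.0 + (double)ts->tv_nsec / 1e6;
}

/*
 * Sleep for req_ms milliseconds with nanosleep() and return the elapsed
 * time in milliseconds as seen by CLOCK_MONOTONIC, which should not jump
 * when the wall clock is stepped. Returns a negative value on error.
 * (Hypothetical helper, for illustration only.)
 */
static double measure_sleep_ms(long req_ms)
{
    struct timespec req, rem, start, end;

    assert(req_ms >= 0);
    req.tv_sec = req_ms / 1000;
    req.tv_nsec = (req_ms % 1000) * 1000000L;   /* ms -> ns */

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;

    /* If a signal interrupts the sleep, resume with the remaining time. */
    while (nanosleep(&req, &rem) != 0) {
        if (errno != EINTR)
            return -1.0;
        req = rem;
    }

    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;

    return ts_to_ms(&end) - ts_to_ms(&start);
}
```

Calling measure_sleep_ms(200) in a loop from main() while the wall clock is stepped back would show whether the sleep really stretches: if the behaviour reported here is reproducible, an affected system should return something near 9200 instead of something near 200.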