From owner-freebsd-hackers@FreeBSD.ORG Fri Oct 3 16:17:01 2014 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 1A2F242A; Fri, 3 Oct 2014 16:17:01 +0000 (UTC) Received: from mho-02-ewr.mailhop.org (mho-02-ewr.mailhop.org [204.13.248.72]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id CDE451F6; Fri, 3 Oct 2014 16:17:00 +0000 (UTC) Received: from [73.34.117.227] (helo=ilsoft.org) by mho-02-ewr.mailhop.org with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.72) (envelope-from ) id 1Xa5XO-000175-Ua; Fri, 03 Oct 2014 16:16:59 +0000 Received: from [172.22.42.240] (revolution.hippie.lan [172.22.42.240]) by ilsoft.org (8.14.9/8.14.9) with ESMTP id s93GGv94021728; Fri, 3 Oct 2014 10:16:57 -0600 (MDT) (envelope-from ian@FreeBSD.org) X-Mail-Handler: Dyn Standard SMTP by Dyn X-Originating-IP: 73.34.117.227 X-Report-Abuse-To: abuse@dyndns.com (see http://www.dyndns.com/services/sendlabs/outbound_abuse.html for abuse reporting information) X-MHO-User: U2FsdGVkX194pVtioz1Omup/2aJae0vI X-Authentication-Warning: paranoia.hippie.lan: Host revolution.hippie.lan [172.22.42.240] claimed to be [172.22.42.240] Subject: Re: freebsd 10 kqueue timer regression From: Ian Lepore To: John Baldwin In-Reply-To: <2499075.KMdpQjyIZI@ralph.baldwin.cx> References: <8ABC0977-FB8F-45E7-ACCC-BFA92EE22E1C@glccom.com> <1412288106.12052.39.camel@revolution.hippie.lan> <1412297389.12052.46.camel@revolution.hippie.lan> <2499075.KMdpQjyIZI@ralph.baldwin.cx> Content-Type: text/plain; charset="us-ascii" Date: Fri, 03 Oct 2014 10:16:56 -0600 Message-ID: <1412353016.12052.82.camel@revolution.hippie.lan> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port Content-Transfer-Encoding: 7bit Cc: freebsd-hackers@freebsd.org, Adrian Chadd , Paul Albrecht X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Oct 2014 16:17:01 -0000 On Fri, 2014-10-03 at 08:50 -0400, John Baldwin wrote: > On Thursday, October 02, 2014 06:49:49 PM Ian Lepore wrote: > > On Thu, 2014-10-02 at 16:15 -0600, Ian Lepore wrote: > > > On Thu, 2014-10-02 at 16:00 -0400, John Baldwin wrote: > > > > On Thursday, October 02, 2014 3:53:28 pm Ian Lepore wrote: > > > > > On Thu, 2014-10-02 at 12:47 -0700, Adrian Chadd wrote: > > > > > > I'm confused; it's doing 50 loops of a 20msec timer, right? So > > > > > > that's > > > > > > > > 1000ms. > > > > > > > > > Yes, so the entire loop should take 1000ms maybe + 1ms. Instead it > > > > > takes 1070. When I run it on an armv6 system running -current it > > > > > takes > > > > > 1050. When I run it on my 8.4 desktop (pre-eventtimers) it takes > > > > > 1013. > > > > > > > > > > -- Ian > > > > > > > > What if you set kern.eventtimer.periodic=1? > > > > > > Some interesting results... > > > > > > HZ 100 500 1000 > > > > > > --------------------------------- > > > periodic=0 1050 1050 1080 > > > periodic=1 1110 1012 1049 > > > > > > > > > The 1080 number was +/- 3ms, all the other numbers were +/- 1ms (except > > > for one outlier of 24363 at 100Hz non-periodic which I'm going to > > > pretend didn't happen). > > > > > > The 1050 numbers are probably each 20ms sleep actually taking 21ms, but > > > the old tvtohz code with -1 adjustments from the old email thread isn't > > > in play anymore. I don't know how to account for the other numbers at > > > all. There's all kinds of stuff I don't understand in the new code > > > involving tick thresholds and such. > > > > > > -- Ian > > > > The attached patch seems to fix the problem in what I think is the most > > correct way: scheduling the callout with absolute times based on the > > time the current event was scheduled for plus the requested interval. > > The net effect should be metronomic events that do not drift (or phase > > shift if you prefer) over time, regardless of any latency involved in > > processing the events. > > > > This makes all the numbers in the tests I ran above come out 1000. > > > > It doesn't make me understand the strange results from the prior tests > > any better. > > > > -- Ian > > Are you running ntpd or ptpd? If so, perhaps try the original tests without > the patch. > Yep, ntpd is running, and I was repeatedly running the test right after rebooting to change kern.hz. I realized that's the explanation for the 24-second outlier (which didn't take 24 seconds in reality) -- the hardware I'm testing on has no RTC, and ntpd must have stepped the clock 24 seconds during the 1 second test loop. I think ntpd probably explains another anomaly I saw in more testing after that post last night. I tested with a 20uS period instead of 20mS and I kept seeing the loop take ~999810uS. Initially I thought 190ppm of clock error was a bit much to explain away easily, then I realized it's quite likely that ntpd is steering hard to get on-time. (The normal drift rate for the timecounter clock on the system is ~18ppm.) Taking ntpd out of the picture will remove artifacts from its clock steering, but really then what the test program loop is doing is measuring a pair of unsteered clocks against each other. Depending on the hardware, those clocks may derive from a common oscillator or not (in which case they are most assuredly keeping time at slightly different rates, for such is the nature of clocks). With the hardware I'm testing on, I think I can arrange to have eventtimers and timecounters both running from the same underlying oscillator, then we should be truly measuring only the software behavior. > That said, I think one of the reasons the old code worked was that the > previous callout had the equivalent of the C_HARDCLOCK flag set. Thus, when > the timer interrupt fires and we rescheuled for N ticks, it was actually N > ticks -