From: John Baldwin
To: Ian Lepore
Cc: freebsd-hackers@freebsd.org, Adrian Chadd, Paul Albrecht
Subject: Re: freebsd 10 kqueue timer regression
Date: Fri, 03 Oct 2014 08:50:12 -0400
Message-ID: <2499075.KMdpQjyIZI@ralph.baldwin.cx>

On Thursday, October 02, 2014 06:49:49 PM Ian Lepore wrote:
> On Thu, 2014-10-02 at 16:15 -0600, Ian Lepore wrote:
> > On Thu, 2014-10-02 at 16:00 -0400, John Baldwin wrote:
> > > On Thursday, October 02, 2014 3:53:28 pm Ian Lepore wrote:
> > > > On Thu, 2014-10-02 at 12:47 -0700, Adrian Chadd wrote:
> > > > > I'm confused; it's doing 50 loops of a 20msec timer, right? So
> > > > > that's 1000ms.
> > > >
> > > > Yes, so the entire loop should take 1000ms maybe + 1ms. Instead it
> > > > takes 1070. When I run it on an armv6 system running -current it
> > > > takes 1050. When I run it on my 8.4 desktop (pre-eventtimers) it
> > > > takes 1013.
> > > >
> > > > -- Ian
> > >
> > > What if you set kern.eventtimer.periodic=1?
> >
> > Some interesting results...
> >
> >   HZ            100     500    1000
> >   ---------------------------------
> >   periodic=0   1050    1050    1080
> >   periodic=1   1110    1012    1049
> >
> > The 1080 number was +/- 3ms, all the other numbers were +/- 1ms (except
> > for one outlier of 24363 at 100Hz non-periodic which I'm going to
> > pretend didn't happen).
> >
> > The 1050 numbers are probably each 20ms sleep actually taking 21ms, but
> > the old tvtohz code with -1 adjustments from the old email thread isn't
> > in play anymore. I don't know how to account for the other numbers at
> > all. There's all kinds of stuff I don't understand in the new code
> > involving tick thresholds and such.
> >
> > -- Ian
>
> The attached patch seems to fix the problem in what I think is the most
> correct way: scheduling the callout with absolute times based on the
> time the current event was scheduled for plus the requested interval.
> The net effect should be metronomic events that do not drift (or phase
> shift if you prefer) over time, regardless of any latency involved in
> processing the events.
>
> This makes all the numbers in the tests I ran above come out 1000.
>
> It doesn't make me understand the strange results from the prior tests
> any better.
>
> -- Ian

Are you running ntpd or ptpd? If so, perhaps try the original tests
without the patch.

That said, I think one of the reasons the old code worked was that the
previous callout had the equivalent of the C_HARDCLOCK flag set. Thus,
when the timer interrupt fired and we rescheduled for N ticks, it was
actually N ticks -