From owner-svn-src-head@FreeBSD.ORG  Tue Dec 15 17:39:14 2009
Return-Path: <owner-svn-src-head@FreeBSD.ORG>
Delivered-To: svn-src-head@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C35E81065695;
	Tue, 15 Dec 2009 17:39:14 +0000 (UTC)
	(envelope-from pdegoeje@service2media.com)
Received: from s2m-is-001.service2media.com (rev-130-102.virtu.nl
	[217.114.102.130])
	by mx1.freebsd.org (Postfix) with ESMTP id 463588FC16;
	Tue, 15 Dec 2009 17:39:13 +0000 (UTC)
Received: from pieter-dev-linux.localnet ([10.0.1.18] RDNS failed) by
	s2m-is-001.service2media.com with Microsoft SMTPSVC(6.0.3790.3959); 
	Tue, 15 Dec 2009 18:39:11 +0100
From: Pieter de Goeje <pieter@service2media.com>
Organization: Service2Media
To: Bruce Evans <brde@optusnet.com.au>
Date: Tue, 15 Dec 2009 18:39:10 +0100
User-Agent: KMail/1.12.2 (Linux/2.6.31-16-generic; KDE/4.3.2; i686; ; )
References: <200912141223.nBECNlDZ026381@svn.freebsd.org>
	<200912151416.56348.pieter@service2media.com>
	<20091216014431.U1425@besplex.bde.org>
In-Reply-To: <20091216014431.U1425@besplex.bde.org>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <200912151839.11024.pieter@service2media.com>
X-OriginalArrivalTime: 15 Dec 2009 17:39:11.0793 (UTC)
	FILETIME=[833A4A10:01CA7DAD]
Cc: Luigi Rizzo <luigi@freebsd.org>,
	"src-committers@freebsd.org" <src-committers@freebsd.org>,
	"svn-src-all@freebsd.org" <svn-src-all@freebsd.org>,
	Robert Watson <rwatson@freebsd.org>,
	"svn-src-head@freebsd.org" <svn-src-head@freebsd.org>,
	Luigi Rizzo <rizzo@iet.unipi.it>,
	Pieter de Goeje <pdegoeje@service2media.com>
Subject: Re: svn commit: r200510 - head/sys/kern

On Tuesday 15 December 2009 16:46:07 Bruce Evans wrote:
> On Tue, 15 Dec 2009, Pieter de Goeje wrote:
> > On Monday 14 December 2009 15:46:35 Luigi Rizzo wrote:
> >> On Mon, Dec 14, 2009 at 02:18:42PM +0000, Robert Watson wrote:
> >>> On Mon, 14 Dec 2009, Luigi Rizzo wrote:
> >>
> >> ...
> >>
> >>>> Together with a smaller patch committed in September, this fixes a
> >>>> bug that affects 8.0 with apps that rely on callouts to fire exactly
> >>>> in the number of ticks specified (qemu among them).
> >>>> Right now, callouts in 8.0 fire one tick late.
> >>>>
> >>>> This was discussed in September with JeffR and jhb.
> >>>
> >>> Once this has burned in, is it something you would consider appropriate
> >>> to be an errata note candidate?
> >>
> >> I have no objection, but at the time someone commented that
> >> callouts do not _guarantee_ when they will run, so strictly speaking
> >> this is not a bug (I do think that being always a tick late _is_ a bug).
> >
> > As a person running a couple of game servers which rely on nanosleep to
> > get a fixed number of frames per second, I'd say that it is a bug.
> 
> Being a tick late is certainly a bug.  Relying on nanosleep to get a
> fixed number of frames per second is another bug.  If you want a
> periodic timer, setitimer(2) with a nonzero it_interval (so that the timer
> repeats automatically) must be used.
> 
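For reference, a minimal sketch of the periodic-timer approach suggested
above. setitimer(2) reloads the timer from it_interval each time it expires,
so the period does not depend on when the loop gets around to sleeping again.
The names on_alarm(), run_periodic() and the frames counter are made up for
this example, and the per-frame work is left out.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Number of SIGALRM deliveries seen so far (hypothetical frame counter). */
static volatile sig_atomic_t frames;

static void on_alarm(int sig)
{
    (void)sig;
    frames++;
}

/*
 * Arm ITIMER_REAL with a period of period_us microseconds, wait for
 * `count` expirations, then disarm it.  Returns the number of
 * expirations handled, or -1 on error.
 */
static int run_periodic(long period_us, int count)
{
    struct sigaction sa;
    struct itimerval it;
    sigset_t block, old, waitmask;
    int seen;

    assert(period_us > 0 && period_us < 1000000 && count > 0);
    frames = 0;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    /* Keep SIGALRM blocked except inside sigsuspend(), so no wakeup is lost. */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
        return -1;
    waitmask = old;
    sigdelset(&waitmask, SIGALRM);

    memset(&it, 0, sizeof(it));
    it.it_value.tv_usec = period_us;     /* time until the first expiry */
    it.it_interval.tv_usec = period_us;  /* reload value: makes it periodic */
    if (setitimer(ITIMER_REAL, &it, NULL) != 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    while (frames < count)
        sigsuspend(&waitmask);           /* one frame of work would go here */
    seen = frames;

    memset(&it, 0, sizeof(it));
    setitimer(ITIMER_REAL, &it, NULL);   /* all-zero value disarms the timer */
    sigprocmask(SIG_SETMASK, &old, NULL);
    return seen;
}
```

Calling run_periodic(10000, 100) would ask for 100 wakeups at a nominal 10 ms
period; the period actually achieved is still limited by hz.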
> > This might also
> > affect video players which want to show their frames on time.  The
> > default HZ of 1000 mitigates the problem somewhat, but on a laptop
> > running at HZ=100, for example, the error is noticeable.
> > To illustrate my point, calling usleep(1) 100 times in a loop results in
> > a running time of 3 seconds with kern.hz=100 (measured on 8.x from Dec
> > 9th), which is 3 times as long as one might reasonably expect. This
> > suggests that the callout fires 2 ticks late ...
> 
> Only 1 tick late.  I get a running time of 2 seconds with hz = 100 under
> FreeBSD-~5.2, presumably because 5.2 didn't have the 1-tick-late bug.
> 
> The time is expected to be 2 seconds instead of 1 because nanosleep()
> adds an extra 1 tick though it would work right (but slower) with other
> small changes (also pessimizations) if it didn't.  To sleep for 1
> microsecond, it is always necessary to wait until the next tick for
> obvious reasons.  The next tick might occur in less than a microsecond
> (when the timeout happens to be set up just before the tick), so
> nanosleep() can't just return when the tick occurs.  It should check
> if the timeout has expired (in real time, not ticks) and wait for
> another tick if not.  In fact, it already does this in order to be
> reasonably accurate for long timeouts.  However, to be simple and
> efficient, it just waits for an extra tick initially, using generic
> code that adds 1 to the tick count.  Other uses of the generic code
> don't check that the timeout has expired so they need this extra 1
> for correctness, but nanosleep() only needs it for efficiency.  This
> optimization for efficiency is more historical than intentional.
> nanosleep() also uses a fuzzy check (getnanouptime() instead of
> nanouptime()) for the expiry, which can give an error of about 1 tick.
> With the fuzzy check, the extra 1 might still be needed for correctness.
> 
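The "check whether the timeout has expired in real time and wait again if
not" idea can be sketched in user space as follows. This is only an analogue
of the logic described above, not the kernel code; now_ns() and
sleep_at_least() are names invented for the example, and it assumes
CLOCK_MONOTONIC is available.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Current monotonic time in nanoseconds. */
static long long now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Sleep for at least req_ns nanoseconds.  After every wakeup the real
 * time is compared against the deadline, and the loop sleeps again for
 * whatever is left.  Returns the time actually spent, in nanoseconds.
 */
static long long sleep_at_least(long long req_ns)
{
    long long start, deadline, left;
    struct timespec ts;

    assert(req_ns >= 0);
    start = now_ns();
    deadline = start + req_ns;
    for (;;) {
        left = deadline - now_ns();
        if (left <= 0)
            break;                       /* deadline reached in real time */
        ts.tv_sec = (time_t)(left / 1000000000LL);
        ts.tv_nsec = (long)(left % 1000000000LL);
        nanosleep(&ts, NULL);            /* may wake early; loop re-checks */
    }
    return now_ns() - start;
}
```

Because the exit condition is the clock and not the number of wakeups, an
early wakeup only costs another pass through the loop.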
> Thus when hz = 100, 100 nanosleeps for 1 microsecond (or even 1
> nanosecond) take between about 1 and 2 seconds, with an average of 1.5
> seconds for random calls and an average of 2 seconds for synchronized
> calls.  Sequential calls with no other system activity give synchronized
> calls.  The time of 2/hz for synchronized calls can be depended on if
> there is no other system activity, but it is better to use setitimer()
> as above -- then cases with other system activity have a better chance
> of working, and you can also get a time of 1/hz.
> 
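The numbers above can be reproduced with a small model. It is a
simplification under stated assumptions: a request is rounded up to whole
ticks, `extra` ticks are added, and a callout for k ticks fires on the k-th
tick boundary strictly after the call. sleep_ticks() and back_to_back_us()
are names made up for this sketch, not kernel functions.

```c
#include <assert.h>

/* Ticks a sleep of req_us microseconds turns into: round up, add `extra`. */
static long sleep_ticks(long req_us, long hz, long extra)
{
    long tick_us;

    assert(req_us > 0 && hz > 0 && hz <= 1000000 && extra >= 0);
    tick_us = 1000000 / hz;
    return (req_us + tick_us - 1) / tick_us + extra;
}

/*
 * Total time in microseconds taken by n back-to-back sleeps of req_us
 * each, with the first call made at start_us on the tick clock.
 */
static long back_to_back_us(int n, long req_us, long hz, long extra,
    long start_us)
{
    long tick_us, now;
    int i;

    assert(n >= 0 && start_us >= 0 && hz > 0 && hz <= 1000000);
    tick_us = 1000000 / hz;
    now = start_us;
    for (i = 0; i < n; i++) {
        /* Wake on the k-th tick boundary strictly after `now`. */
        now = (now / tick_us + sleep_ticks(req_us, hz, extra)) * tick_us;
    }
    return now - start_us;
}
```

With hz = 100, back_to_back_us(100, 1, 100, 1, 0) gives 2000000 (the 2 seconds
for synchronized calls), and extra = 2, standing in for the additional late
tick, gives 3000000 (the 3 seconds measured on 8.x). A single call made at a
random point in a tick waits between 1 and 2 ticks in this model, hence the
1.5 second average for 100 unsynchronized calls.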
> nanosleep() is correct but very sloppy for long sleeps.  E.g., with
> hz = 1000 on i386(i8254), the average absolute error for a set of perfectly
> calibrated i8254 ticks is about 0.02%, so for sleeps of 1 year the error
> will be at best 1.82 hours on average.  If the hz clock runs faster than
> real time, then nanosleep() wakes up early and does shorter sleeps to
> reach the correct real time, but if the hz clock runs slow then nanosleep()
> normally wakes up hours per year late.  This is easy to fix at a cost of
> efficiency by intentionally underestimating the timeout in ticks, which
> goes well with not adding 1.
> 
> Bruce
> 
Thank you for your very thorough explanation.
To recap, nanosleep() will always be one tick late in the synchronized case
unless it both checks the remaining sleep time and stops blindly adding 1 to
the number of ticks to wait. To get the minimum latency of 1/hz with the
current implementation, nanosleep() should be called right before the next
tick fires.

- Pieter