From owner-freebsd-current@FreeBSD.ORG  Thu Jul 22 16:25:21 2004
Date: Fri, 23 Jul 2004 02:25:16 +1000 (EST)
From: Bruce Evans <bde@zeta.org.au>
X-X-Sender: bde@epsplex.bde.org
To: John Birrell <jb@cimlogic.com.au>
In-Reply-To: <20040722225952.S1704@epsplex.bde.org>
Message-ID: <20040723014517.B2451@epsplex.bde.org>
References: <20040721081310.GJ22160@freebsd3.cimlogic.com.au>
	<20040721215940.GK22160@freebsd3.cimlogic.com.au>
	<20040722225952.S1704@epsplex.bde.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: current@freebsd.org
Subject: Re: nanosleep returning early
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>

On Fri, 23 Jul 2004, Bruce Evans wrote:

> On Thu, 22 Jul 2004, John Birrell wrote:
>
> > On Wed, Jul 21, 2004 at 11:01:20PM +1000, Bruce Evans wrote:
> > > ...
> > > The most obvious bug is that nanosleep() uses the low-accuracy interface
> > > getnanouptime().  I can't see why the problem is more obvious with
> > > large HZ or why it affects short sleeps.  From kern_time.c 1.170:
> > > ...
> > > % 	getnanouptime(&ts);
> > >
> > > This may lag the actual (up)time by 1/HZ seconds.
>
> [                       (actually tc_tick/HZ seconds)

> > So, does increasing HZ expose the lower accuracy of getnanouptime() and is
> > that what I'm seeing?
>
> I still don't know the reason.  Unfortunately, I deleted your original
> mail so I can't run the test program in it easily.

Now I think I know the reason.  The interval between clock interrupts
is supposed to be 1/HZ seconds = `tick' microseconds, but it cannot
be set nearly that precisely, and the imprecision is proportional to
HZ: the count is inversely proportional to HZ, so a rounding error of
up to half a count matters more as HZ grows.  The i8254 counter has a
default nominal frequency
of 1193182 Hz.  Suppose that this is perfectly accurate.  Then to
implement clock interrupts at HZ Hz, we want to program the i8254's
maximum count to 1193182/HZ in infinite precision, but counts must be
integers so we must round.  The loss of precision is quite large for
HZ = 1000: 1193182 / 1000.0 = 1193.182; rounding this (to nearest)
gives 1193 and an error of 182 in 1193182, or about 152 ppm.  Also, the
extra tick added by tvtohz() is only 1000 us long, so it has only a
chance of about 152/1000 of compensating for the rounding error.  Finally, the
explicit check that the interval has elapsed cannot compensate for
errors larger than tc_tick/HZ because getnanouptime() is fuzzy.
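
The rounding arithmetic above can be checked with a small userland
sketch.  This is not the kernel's code: the function names
i8254_divisor() and tick_error_ppm() are made up for illustration, and
the only assumptions are the ones stated above (a perfectly accurate
1193182 Hz input and rounding of the count to nearest).

```c
#include <assert.h>

/* Nominal i8254 input frequency in Hz, as quoted above. */
#define	I8254_NOMINAL_FREQ	1193182L

/*
 * Hypothetical helper: the integer count that would be programmed
 * for a requested interrupt rate of hz, rounded to nearest.
 */
long
i8254_divisor(long freq, long hz)
{
	assert(freq > 0 && hz > 0);
	return ((freq + hz / 2) / hz);
}

/*
 * Hypothetical helper: relative error of the real tick length against
 * the ideal 1/hz seconds, in parts per million.  A negative result
 * means that real ticks are shorter than the ideal tick.
 */
double
tick_error_ppm(long freq, long hz)
{
	long div = i8254_divisor(freq, hz);

	/* Real period is div/freq seconds; ideal period is 1/hz seconds. */
	return (((double)div * (double)hz / (double)freq - 1.0) * 1e6);
}
```

Under these assumptions, tick_error_ppm(I8254_NOMINAL_FREQ, 1000) is
negative (the count rounds down to 1193, so ticks are short), while
HZ = 100 gives a count of 11932 and a much smaller positive error.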

Rounding 1193.182 to nearest happens to round down; thus clock ticks
are shorter than `tick' microseconds, tvtohz()'s value is too small,
and nanosleep() may return too early.  The loop limits the error to
about 1 tick in this case.  The i8254 frequency may be calibrated or
set using sysctl to a more (or less) accurate value.  Then the rounding
may go the other way so that tvtohz()'s value is too large and nanosleep()
may return too late.  The loop cannot limit the error in this case.
The absolute error may be large for long sleeps.  E.g., 152 ppm over
1 day is 13 seconds.
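
To put a number on the long-sleep case, here is a rough model of a
sleep that is timed purely by counting ticks.  It is a simplification,
not what kern_time.c does: it ignores the extra tick from tvtohz() and
the retry loop, and it assumes that the i8254 frequency is exact.  The
name sleep_drift_seconds() is hypothetical.

```c
#include <assert.h>

/* Nominal i8254 input frequency in Hz, as quoted above. */
#define	I8254_NOMINAL_FREQ	1193182L

/*
 * Hypothetical helper: real time that passes while `seconds' worth of
 * ticks are counted at rate hz, minus the requested time.  A negative
 * result means the count finishes early; positive means late.
 */
double
sleep_drift_seconds(long freq, long hz, double seconds)
{
	long div;
	double real_tick;

	assert(freq > 0 && hz > 0);
	div = (freq + hz / 2) / hz;		/* count, rounded to nearest */
	real_tick = (double)div / (double)freq;	/* real seconds per tick */
	return (seconds * (double)hz * real_tick - seconds);
}
```

With HZ = 1000 and a sleep of 86400 seconds this model comes out about
13 seconds early; with HZ = 100 the count rounds up instead and the
same sleep comes out a little over 1 second late.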

tvtohz()'s value may also be too large because the i8254 frequency
is not known accurately.  Its nominal value is wrong by 10-100 Hz
on my systems.  I minimize errors from this by calibrating all
timecounters using a common clock.
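
For scale, a frequency error of that size can be converted to ppm in
the same way as the rounding error.  The helper name freq_error_ppm()
is hypothetical, and the "actual" frequency is whatever a calibration
against a better clock reports.

```c
#include <assert.h>

/*
 * Hypothetical helper: error of the assumed (nominal) frequency
 * relative to the actual one, in parts per million.  A negative
 * result means the counter really runs faster than assumed.
 */
double
freq_error_ppm(long nominal_hz, long actual_hz)
{
	assert(nominal_hz > 0 && actual_hz > 0);
	return ((double)(nominal_hz - actual_hz) / (double)actual_hz * 1e6);
}
```

A 10 Hz error on 1193182 Hz is about 8 ppm and a 100 Hz error is about
84 ppm, so this error is of the same order as the 152 ppm rounding
error at HZ = 1000.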

Bruce