From owner-freebsd-bugs Sat Dec 7 5: 0:22 2002 Delivered-To: freebsd-bugs@hub.freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1E5E937B401 for ; Sat, 7 Dec 2002 05:00:19 -0800 (PST) Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21]) by mx1.FreeBSD.org (Postfix) with ESMTP id B3E6E43EA9 for ; Sat, 7 Dec 2002 05:00:18 -0800 (PST) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.12.6/8.12.6) with ESMTP id gB7D0Ix3089254 for ; Sat, 7 Dec 2002 05:00:18 -0800 (PST) (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.12.6/8.12.6/Submit) id gB7D0I3w089250; Sat, 7 Dec 2002 05:00:18 -0800 (PST) Date: Sat, 7 Dec 2002 05:00:18 -0800 (PST) Message-Id: <200212071300.gB7D0I3w089250@freefall.freebsd.org> To: freebsd-bugs@FreeBSD.org Cc: From: Bruce Evans Subject: Re: kern/46036: inaccurate timeouts in select(),nanosleep() + fix Reply-To: Bruce Evans Sender: owner-freebsd-bugs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org The following reply was made to PR kern/46036; it has been noted by GNATS. From: Bruce Evans To: Enache Adrian Cc: FreeBSD-gnats-submit@FreeBSD.ORG Subject: Re: kern/46036: inaccurate timeouts in select(),nanosleep() + fix Date: Sat, 7 Dec 2002 23:57:40 +1100 (EST) On Fri, 6 Dec 2002, Enache Adrian wrote: > >Description: > select(),nanosleep(),poll() sleep one tick longer, _except_ > when previous similar system call was interrupted by a signal > or by i/o conditions on the polled file descriptors. > > I find this rather buggy and inconsequent. > It badly breaks any code which tries to hook both timers > and input polling on a select() loop and defeats benchmark > programs which test for the accuracy of timeouts. > > This is due to tvtohz (sys/kern/kern_clock.c) which always > adds one to its result to allow for the current tick to > expire. Instead of adding one it should be better testing > for a non-zero result. tvtohz() actually adds 1 to ensure that the sleep time is at least the specified time. This gives a minimum sleep time of 0.0 ticks longer than the specified sleep time (as required by standards) and a maximum sleep time of about 1.0 tick longer than the specified sleep time, with an average sleep time of about 0.5 ticks longer. Not adding 1 would give a {min,max,avg} sleep time of about {-1.0,0.0,-0.5} ticks longer than specified; i.e., it would be shorter than specified (broken) in most cases. Individual syscalls could handle this by sleeping for the shorter time and checking whether the timeout has actually expired when they wake up, but most don't, and this would be a pessimization in most cases. Of the above syscalls, only nanosleep() checks that the timeout has actually expired. It does this mainly for the technical reason that very large timeouts (ones greater than INT_MAX ticks) are required to work and may even be useful (especially if HZ is large). Checking is a pessimization in at least the following cases, because we will just have to wait again after wasting time waking up to check: - most short timeouts that are a multiple of `tick'. - half or the cases for random timeouts of 0.5 ticks (since the current tick will expire before the timeout half the time), and proportionally for all random timeouts of 1 tick or shorter. > >How-To-Repeat: > This little program prints the timeout accuracy of the > select() system call as "[+-] sec usec".It catches SIGQUIT > and polls for console input. > It use to give: > > unpatched (regular) kernel: > + 0 s 9938 us > > --input > + 0 s 9959 us > + 0 s 9972 us > + 0 s 9930 us > ^\ --interrupt > + 0 s 9954 us > + 0 s 9954 us This is the correct behaviour. We are mostly in sync with the clock interrupt, and the default timeout is a multiple of clock.tick, so we must sleep for an extra tick (less epsilon) for the timeout to expire. select() would have to go back to sleep for an extra tick if tvtohz() didn't add 1. > patched kernel: > - 0 s 44 us > ^\ --interrupt > - 0 s 3 us > - 0 s 100 us > > --input > - 0 s 44 us > - 0 s 47 us This is incorrect. The times are the values ((actual timeout) - (specified timeout)). These should be >= 0. The errors are small here because the timeout is a multiple of clock.tick and we are mostly in sync with clock interrupts. In general the error from not adding 1 is up to 1 tick (see above). > -------------------------------8x----------------------------------- > >Fix: > patch to kern_clock.c and other places which assume that > tvtohz adds one follows. This mainly undoes some FreeBSD fixes. > notes: > tvtohz has been renamed to hzto in NetBSD but is the same. Actually, FreeBSD renamed hzto to tvtohz. > redo: > tv = savtv; gettimeofday(&otv,NULL); > /* a quick fix is to substract one tick from tv here > but this will break things when compiled on other > OS ( linux for instance ) */ > for(;;) { > FD_SET(0,&rdfs); > k = select(1,&rdfs,NULL,NULL,&tv); > gettimeofday(&tv,NULL); I thought that Linux gets timeouts right -- at least sleep() used to work except in very early versions of Linux-0.x. Perhaps it uses clock interrupts to get to get close to the timeout and then udelay() to get closer if the residual timeout is small. In general, the residual could be up to `tick' usec, but in cases that are in sync with the clock like your test program, the residual would be small (about 44 usec) and udelay() could reasonably handle it. Bruce To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-bugs" in the body of the message