Date: Fri, 6 Dec 2002 22:18:48 +0200 From: Enache Adrian <enache@rdslink.ro> To: FreeBSD-gnats-submit@FreeBSD.org Subject: kern/46036: inaccurate timeouts in select(),nanosleep() + fix Message-ID: <200212062018.gB6KIm8W001891@ratsnest.org>
next in thread | raw e-mail | index | archive | help
>Number: 46036
>Category: kern
>Synopsis: inaccurate timeouts in select(),nanosleep() + fix
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Fri Dec 06 12:20:01 PST 2002
>Closed-Date:
>Last-Modified:
>Originator: Enache Adrian
>Release: FreeBSD 5.0-CURRENT i386
>Organization:
>Environment:
System: FreeBSD ratsnest.hole 5.0-CURRENT FreeBSD 5.0-CURRENT #1: Fri Dec 6 01:46:22 EET 2002 root@ratsnest.hole:/opt/hack/sys/i386/compile/CUBATAO i386
>Description:
select(),nanosleep(),poll() sleep one tick longer, _except_
when previous similar system call was interrupted by a signal
or by i/o conditions on the polled file descriptors.
I find this rather buggy and inconsequent.
It badly breaks any code which tries to hook both timers
and input polling on a select() loop and defeats benchmark
programs which test for the accuracy of timeouts.
This is due to tvtohz (sys/kern/kern_clock.c) which always
adds one to its result to allow for the current tick to
expire. Instead of adding one it should be better testing
for a non-zero result.
I modified like this my 4.4 kernel long time ago and have
seen so far no trouble at all.Same thing more recently with 5.0.
>How-To-Repeat:
This little program prints the timeout accuracy of the
select() system call as "[+-] sec usec".It catches SIGQUIT
and polls for console input.
It use to give:
unpatched (regular) kernel:
+ 0 s 9938 us
--input
+ 0 s 9959 us
+ 0 s 9972 us
+ 0 s 9930 us
^\ --interrupt
+ 0 s 9954 us
+ 0 s 9954 us
patched kernel:
- 0 s 44 us
^\ --interrupt
- 0 s 3 us
- 0 s 100 us
--input
- 0 s 44 us
- 0 s 47 us
-------------------------------8x-----------------------------------
#include <sys/types.h>
#include <sys/time.h>
#include <sys/sysctl.h>
#include <unistd.h>
#include <errno.h>
#include <signal.h>
#include <time.h>
#include <err.h>
#include <stdio.h>
fd_set rdfs;
int ms;
void sig(int unused) {}
int main(int argc,char **argv)
{
int k,mib[2];
struct timeval tv,savtv,otv;
struct clockinfo clock;
char buf[256];
signal(SIGQUIT,sig);
if (argc > 1) ms = atoi(argv[1]);
if (!ms) ms = 500;
savtv.tv_usec = ms % 1000 * 1000; savtv.tv_sec = ms / 1000;
mib[0] = CTL_KERN; mib[1] = KERN_CLOCKRATE;
k = sizeof clock;
if (sysctl(mib,2,&clock,&k,NULL,0)) err(1,"sysctl");
savtv.tv_usec -= savtv.tv_usec % clock.tick;
redo:
tv = savtv; gettimeofday(&otv,NULL);
/* a quick fix is to substract one tick from tv here
but this will break things when compiled on other
OS ( linux for instance ) */
for(;;) {
FD_SET(0,&rdfs);
k = select(1,&rdfs,NULL,NULL,&tv);
gettimeofday(&tv,NULL);
timersub(&tv,&otv,&tv);
switch(k) {
case 0:
if (timercmp(&tv,&savtv,<))
{ timersub(&savtv,&tv,&tv);k = '-'; }
else
{ timersub(&tv,&savtv,&tv);k = '+'; }
printf("%c%4d s %10d us\n",k,tv.tv_sec,tv.tv_usec);
goto redo;
case -1:
if (errno != EINTR) err(1,"select");
printf("\t--interrupt\n");
break;
default:
read(0,buf,256);
printf("\t--input\n");
break;
}
timersub(&savtv,&tv,&tv);
if (tv.tv_sec < 0) timerclear(&tv);
}
}
-------------------------------8x-----------------------------------
>Fix:
patch to kern_clock.c and other places which assume that
tvtohz adds one follows.
notes:
tvtohz has been renamed to hzto in NetBSD but is the same.
of course one can write
ticks = (sec * hz) + (((unsigned long)usec + (tick - 1)) / tick) ?: 1;
but this is a GCC extension.
-------------------------------8x-----------------------------------
diff -Naur osrc/sys/kern/kern_clock.c src/sys/kern/kern_clock.c
--- osrc/sys/kern/kern_clock.c Tue Nov 19 23:58:52 2002
+++ src/sys/kern/kern_clock.c Mon Dec 2 23:47:36 2002
@@ -257,8 +257,7 @@
* If the number of usecs in the whole seconds part of the time
* difference fits in a long, then the total number of usecs will
* fit in an unsigned long. Compute the total and convert it to
- * ticks, rounding up and adding 1 to allow for the current tick
- * to expire. Rounding also depends on unsigned long arithmetic
+ * ticks. Rounding also depends on unsigned long arithmetic
* to avoid overflow.
*
* Otherwise, if the number of ticks in the whole seconds part of
@@ -291,14 +290,15 @@
ticks = 1;
} else if (sec <= LONG_MAX / 1000000)
ticks = (sec * 1000000 + (unsigned long)usec + (tick - 1))
- / tick + 1;
+ / tick;
else if (sec <= LONG_MAX / hz)
- ticks = sec * hz
- + ((unsigned long)usec + (tick - 1)) / tick + 1;
+ ticks = sec * hz + ((unsigned long)usec + (tick - 1)) / tick;
else
ticks = LONG_MAX;
if (ticks > INT_MAX)
ticks = INT_MAX;
+ else if (ticks == 0)
+ ticks = 1;
return ((int)ticks);
}
diff -Naur osrc/sys/kern/kern_time.c src/sys/kern/kern_time.c
--- osrc/sys/kern/kern_time.c Tue Dec 3 00:14:05 2002
+++ src/sys/kern/kern_time.c Mon Dec 2 23:33:04 2002
@@ -527,10 +527,6 @@
* Else compute next time timer should go off which is > current time.
* This is where delay in processing this timeout causes multiple
* SIGALRM calls to be compressed into one.
- * tvtohz() always adds 1 to allow for the time until the next clock
- * interrupt being strictly less than 1 clock tick, but we don't want
- * that here since we want to appear to be in sync with the clock
- * interrupt even when we're delayed.
*/
void
realitexpire(void *arg)
@@ -555,7 +551,7 @@
if (timevalcmp(&p->p_realtimer.it_value, &ctv, >)) {
ntv = p->p_realtimer.it_value;
timevalsub(&ntv, &ctv);
- callout_reset(&p->p_itcallout, tvtohz(&ntv) - 1,
+ callout_reset(&p->p_itcallout, tvtohz(&ntv),
realitexpire, p);
splx(s);
PROC_UNLOCK(p);
diff -Naur osrc/sys/kern/kern_timeout.c src/sys/kern/kern_timeout.c
--- osrc/sys/kern/kern_timeout.c Thu Sep 5 14:42:03 2002
+++ src/sys/kern/kern_timeout.c Mon Dec 2 23:56:45 2002
@@ -399,15 +399,17 @@
return;
else if (time_change->tv_sec <= LONG_MAX / 1000000)
delta_ticks = (time_change->tv_sec * 1000000 +
- time_change->tv_usec + (tick - 1)) / tick + 1;
+ time_change->tv_usec + (tick - 1)) / tick;
else if (time_change->tv_sec <= LONG_MAX / hz)
delta_ticks = time_change->tv_sec * hz +
- (time_change->tv_usec + (tick - 1)) / tick + 1;
+ (time_change->tv_usec + (tick - 1)) / tick;
else
delta_ticks = LONG_MAX;
if (delta_ticks > INT_MAX)
delta_ticks = INT_MAX;
+ else if (delta_ticks == 0)
+ delta_ticks = 1;
/*
* Now rip through the timer calltodo list looking for timers
diff -Naur osrc/sys/net/bpf.c src/sys/net/bpf.c
--- osrc/sys/net/bpf.c Tue Nov 19 23:58:58 2002
+++ src/sys/net/bpf.c Mon Dec 2 23:58:24 2002
@@ -774,12 +774,8 @@
{
struct timeval *tv = (struct timeval *)addr;
- /*
- * Subtract 1 tick from tvtohz() since this isn't
- * a one-shot timer.
- */
if ((error = itimerfix(tv)) == 0)
- d->bd_rtout = tvtohz(tv) - 1;
+ d->bd_rtout = tvtohz(tv);
break;
}
-------------------------------8x-----------------------------------
>Release-Note:
>Audit-Trail:
>Unformatted:
To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200212062018.gB6KIm8W001891>
