From owner-freebsd-bugs Fri Dec 6 12:20: 9 2002 Delivered-To: freebsd-bugs@hub.freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DF90537B401 for ; Fri, 6 Dec 2002 12:20:02 -0800 (PST) Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1207543EC5 for ; Fri, 6 Dec 2002 12:20:02 -0800 (PST) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.12.6/8.12.6) with ESMTP id gB6KK1x3009687 for ; Fri, 6 Dec 2002 12:20:01 -0800 (PST) (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.12.6/8.12.6/Submit) id gB6KK1Nh009686; Fri, 6 Dec 2002 12:20:01 -0800 (PST) Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2DD4537B401 for ; Fri, 6 Dec 2002 12:18:40 -0800 (PST) Received: from ratsnest.org (213.157.186.184 [213.157.186.184]) by mx1.FreeBSD.org (Postfix) with ESMTP id 01BEA43EBE for ; Fri, 6 Dec 2002 12:18:32 -0800 (PST) (envelope-from enache@rdslink.ro) Received: (from adi@localhost) by ratsnest.org (8.12.1/8.12.1) id gB6KIm8W001891; Fri, 6 Dec 2002 22:18:48 +0200 Message-Id: <200212062018.gB6KIm8W001891@ratsnest.org> Date: Fri, 6 Dec 2002 22:18:48 +0200 From: Enache Adrian Reply-To: Enache Adrian To: FreeBSD-gnats-submit@FreeBSD.org X-Send-Pr-Version: 3.113 Subject: kern/46036: inaccurate timeouts in select(),nanosleep() + fix Sender: owner-freebsd-bugs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org >Number: 46036 >Category: kern >Synopsis: inaccurate timeouts in select(),nanosleep() + fix >Confidential: no >Severity: serious >Priority: medium >Responsible: freebsd-bugs >State: open >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Fri Dec 06 12:20:01 PST 2002 >Closed-Date: >Last-Modified: >Originator: Enache Adrian >Release: FreeBSD 5.0-CURRENT i386 >Organization: >Environment: System: FreeBSD ratsnest.hole 5.0-CURRENT FreeBSD 5.0-CURRENT #1: Fri Dec 6 01:46:22 EET 2002 root@ratsnest.hole:/opt/hack/sys/i386/compile/CUBATAO i386 >Description: select(),nanosleep(),poll() sleep one tick longer, _except_ when previous similar system call was interrupted by a signal or by i/o conditions on the polled file descriptors. I find this rather buggy and inconsequent. It badly breaks any code which tries to hook both timers and input polling on a select() loop and defeats benchmark programs which test for the accuracy of timeouts. This is due to tvtohz (sys/kern/kern_clock.c) which always adds one to its result to allow for the current tick to expire. Instead of adding one it should be better testing for a non-zero result. I modified like this my 4.4 kernel long time ago and have seen so far no trouble at all.Same thing more recently with 5.0. >How-To-Repeat: This little program prints the timeout accuracy of the select() system call as "[+-] sec usec".It catches SIGQUIT and polls for console input. It use to give: unpatched (regular) kernel: + 0 s 9938 us --input + 0 s 9959 us + 0 s 9972 us + 0 s 9930 us ^\ --interrupt + 0 s 9954 us + 0 s 9954 us patched kernel: - 0 s 44 us ^\ --interrupt - 0 s 3 us - 0 s 100 us --input - 0 s 44 us - 0 s 47 us -------------------------------8x----------------------------------- #include #include #include #include #include #include #include #include #include fd_set rdfs; int ms; void sig(int unused) {} int main(int argc,char **argv) { int k,mib[2]; struct timeval tv,savtv,otv; struct clockinfo clock; char buf[256]; signal(SIGQUIT,sig); if (argc > 1) ms = atoi(argv[1]); if (!ms) ms = 500; savtv.tv_usec = ms % 1000 * 1000; savtv.tv_sec = ms / 1000; mib[0] = CTL_KERN; mib[1] = KERN_CLOCKRATE; k = sizeof clock; if (sysctl(mib,2,&clock,&k,NULL,0)) err(1,"sysctl"); savtv.tv_usec -= savtv.tv_usec % clock.tick; redo: tv = savtv; gettimeofday(&otv,NULL); /* a quick fix is to substract one tick from tv here but this will break things when compiled on other OS ( linux for instance ) */ for(;;) { FD_SET(0,&rdfs); k = select(1,&rdfs,NULL,NULL,&tv); gettimeofday(&tv,NULL); timersub(&tv,&otv,&tv); switch(k) { case 0: if (timercmp(&tv,&savtv,<)) { timersub(&savtv,&tv,&tv);k = '-'; } else { timersub(&tv,&savtv,&tv);k = '+'; } printf("%c%4d s %10d us\n",k,tv.tv_sec,tv.tv_usec); goto redo; case -1: if (errno != EINTR) err(1,"select"); printf("\t--interrupt\n"); break; default: read(0,buf,256); printf("\t--input\n"); break; } timersub(&savtv,&tv,&tv); if (tv.tv_sec < 0) timerclear(&tv); } } -------------------------------8x----------------------------------- >Fix: patch to kern_clock.c and other places which assume that tvtohz adds one follows. notes: tvtohz has been renamed to hzto in NetBSD but is the same. of course one can write ticks = (sec * hz) + (((unsigned long)usec + (tick - 1)) / tick) ?: 1; but this is a GCC extension. -------------------------------8x----------------------------------- diff -Naur osrc/sys/kern/kern_clock.c src/sys/kern/kern_clock.c --- osrc/sys/kern/kern_clock.c Tue Nov 19 23:58:52 2002 +++ src/sys/kern/kern_clock.c Mon Dec 2 23:47:36 2002 @@ -257,8 +257,7 @@ * If the number of usecs in the whole seconds part of the time * difference fits in a long, then the total number of usecs will * fit in an unsigned long. Compute the total and convert it to - * ticks, rounding up and adding 1 to allow for the current tick - * to expire. Rounding also depends on unsigned long arithmetic + * ticks. Rounding also depends on unsigned long arithmetic * to avoid overflow. * * Otherwise, if the number of ticks in the whole seconds part of @@ -291,14 +290,15 @@ ticks = 1; } else if (sec <= LONG_MAX / 1000000) ticks = (sec * 1000000 + (unsigned long)usec + (tick - 1)) - / tick + 1; + / tick; else if (sec <= LONG_MAX / hz) - ticks = sec * hz - + ((unsigned long)usec + (tick - 1)) / tick + 1; + ticks = sec * hz + ((unsigned long)usec + (tick - 1)) / tick; else ticks = LONG_MAX; if (ticks > INT_MAX) ticks = INT_MAX; + else if (ticks == 0) + ticks = 1; return ((int)ticks); } diff -Naur osrc/sys/kern/kern_time.c src/sys/kern/kern_time.c --- osrc/sys/kern/kern_time.c Tue Dec 3 00:14:05 2002 +++ src/sys/kern/kern_time.c Mon Dec 2 23:33:04 2002 @@ -527,10 +527,6 @@ * Else compute next time timer should go off which is > current time. * This is where delay in processing this timeout causes multiple * SIGALRM calls to be compressed into one. - * tvtohz() always adds 1 to allow for the time until the next clock - * interrupt being strictly less than 1 clock tick, but we don't want - * that here since we want to appear to be in sync with the clock - * interrupt even when we're delayed. */ void realitexpire(void *arg) @@ -555,7 +551,7 @@ if (timevalcmp(&p->p_realtimer.it_value, &ctv, >)) { ntv = p->p_realtimer.it_value; timevalsub(&ntv, &ctv); - callout_reset(&p->p_itcallout, tvtohz(&ntv) - 1, + callout_reset(&p->p_itcallout, tvtohz(&ntv), realitexpire, p); splx(s); PROC_UNLOCK(p); diff -Naur osrc/sys/kern/kern_timeout.c src/sys/kern/kern_timeout.c --- osrc/sys/kern/kern_timeout.c Thu Sep 5 14:42:03 2002 +++ src/sys/kern/kern_timeout.c Mon Dec 2 23:56:45 2002 @@ -399,15 +399,17 @@ return; else if (time_change->tv_sec <= LONG_MAX / 1000000) delta_ticks = (time_change->tv_sec * 1000000 + - time_change->tv_usec + (tick - 1)) / tick + 1; + time_change->tv_usec + (tick - 1)) / tick; else if (time_change->tv_sec <= LONG_MAX / hz) delta_ticks = time_change->tv_sec * hz + - (time_change->tv_usec + (tick - 1)) / tick + 1; + (time_change->tv_usec + (tick - 1)) / tick; else delta_ticks = LONG_MAX; if (delta_ticks > INT_MAX) delta_ticks = INT_MAX; + else if (delta_ticks == 0) + delta_ticks = 1; /* * Now rip through the timer calltodo list looking for timers diff -Naur osrc/sys/net/bpf.c src/sys/net/bpf.c --- osrc/sys/net/bpf.c Tue Nov 19 23:58:58 2002 +++ src/sys/net/bpf.c Mon Dec 2 23:58:24 2002 @@ -774,12 +774,8 @@ { struct timeval *tv = (struct timeval *)addr; - /* - * Subtract 1 tick from tvtohz() since this isn't - * a one-shot timer. - */ if ((error = itimerfix(tv)) == 0) - d->bd_rtout = tvtohz(tv) - 1; + d->bd_rtout = tvtohz(tv); break; } -------------------------------8x----------------------------------- >Release-Note: >Audit-Trail: >Unformatted: To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-bugs" in the body of the message