Date: Fri, 7 May 2004 17:46:19 +1000 (EST)
From: Bruce Evans <bde@zeta.org.au>
To: David Schultz <das@freebsd.org>
Cc: "P.D. Seniura" <pdseniura@techie.com>
Subject: Re: low HZ value causes "Time Warp Bug" (re: this Puny Pentium2 suddenly became 45% slower!)
Message-ID: <20040507165839.Q24428@gamplex.bde.org>
In-Reply-To: <20040507040852.GA78023@VARK.homeunix.com>
References: <20040507005518.75B6A79004C@ws1-14.us4.outblaze.com> <20040507040852.GA78023@VARK.homeunix.com>

On Thu, 6 May 2004, David Schultz wrote:

> On Thu, May 06, 2004, P.D. Seniura wrote:
> > > >
> > > > It seems this bug happens when the HZ value goes below 16
> > > > (either by compiling 'options HZ=' in kernel or setting
> > > > sysctl 'kern.hz=' in /boot/loader.conf). The computed
> > > > 'ticks' value becomes too large for 2-byte int producing
> > > > crazy overflowed numbers elsewhere.
> > >
> > > 16 is pretty low..
> > > Then again it would be nice if it warned you or something similar when you
> > > tried it :)

Nah, INT_MIN would be low. Values between INT_MIN and -1 might cause
even more interesting behaviour. The value of 0 would cause the not so
interesting behaviour of a panic for division by 0 in init_param1() if
not earlier.

Nonexistent bounds checking for hz is just one of thousands of cases of
nonexistent bounds checking for tunables and sysctls. The kernel trusts
the (privileged) user not to set values that don't work.

Where is the 2-byte int that overflows? The kernel mostly uses
"int ticks = 1000000 / hz". It assumes at least 32-bit ints or that
hz > 2. This will work until hz becomes larger than 1000000 or not
nearly a divisor of 1000000.

> >
> > Heh, I got HZ set to 20 while it does
> > buildworld (~9 hours) and portupgrade overnight.
> > The idea is "less slicing and more doing". ;)
>
> Umm...yeah, don't do that. For one, 1/(100 Hz) = 10000 us = 4.5
> million cycles on your processor, which is an eternity in computer
> time. For two, HZ doesn't affect the maximum timeslice processes
> get. The scheduling quantum is fixed at 100 ms in the 4BSD
> scheduler, and it varies between 10 ms and 143 ms in ULE.

Actually, it is supposed to be non-fixed at the value set by the
kern.quantum sysctl in the 4BSD scheduler, but this was broken a few
years ago:

%%%
----------------------------
revision 1.156
date: 2001/02/12 00:20:05; author: jake; state: Exp; lines: +4 -3
...
- Remove the curpriority global variable and use that of curproc.
  This was used to detect when a process' priority had lowered and it
  should yield. We now effectively yield on every interrupt.
...
%%%

This commit didn't break the quantum; that was done a year or so earlier
by yielding on every interrupt (actually only on non-fast interrupts, so
the quantum might still work if there were only clock interrupts and
curpriority and related things had not rotted). The yielding is just
by switching to interrupt threads. On switching back, at least the 4BSD
scheduler doesn't really know what the interrupted thread or its quantum
was. It just picks the highest priority runnable thread, and that is
never the interrupted thread if there are multiple threads with the same
priority, since yielding puts the interrupted thread on the tail of the
queue. Scheduling is thus reduced to essentially round-robin among
threads with the same priority, with a variable quantum of <time until
next non-fast interrupt>. Prioritization still works right in most
cases and gives good scheduling for long-lived processes.

The importance of scheduling is shown by the number of users who notice
when it is broken: it is very small.

Bruce
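
The arithmetic behind the "1000000 / hz" expression quoted in the message
can be checked with a small userland sketch. Everything below is
illustrative only: the helper names are hypothetical and are not kernel
functions, the zero/negative guard is an addition that the message says
the kernel does not have, and the code assumes a host where int is at
least 32 bits wide.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): microseconds per clock
 * tick, computed as 1000000 / hz as described in the message.
 * Returns -1 for hz <= 0 instead of dividing by zero or producing a
 * negative result; this guard is an assumption added for the sketch.
 */
static int
usec_per_tick(int hz)
{
	if (hz <= 0)
		return (-1);
	return (1000000 / hz);
}

/* Reports whether a value would fit in a signed 16-bit ("2-byte") int. */
static int
fits_int16(int value)
{
	return (value >= INT16_MIN && value <= INT16_MAX);
}

/*
 * Microseconds lost per second to integer truncation when hz is not an
 * exact divisor of 1000000. Returns -1 for hz <= 0.
 */
static int
usec_lost_per_second(int hz)
{
	if (hz <= 0)
		return (-1);
	return (1000000 - usec_per_tick(hz) * hz);
}
```

With these helpers, usec_per_tick(100) is 10000 and fits in 16 bits,
while usec_per_tick(20) is 50000, which only overflows if the value is
stored in a 16-bit int. A hz above 1000000 truncates the result to 0,
and a hz such as 1024 that does not divide 1000000 loses 576 us per
second to truncation.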