Date: Sat, 13 Mar 1999 10:00:49 -0700
From: "Justin T. Gibbs" <gibbs@plutotech.com>
To: Greg Black <gjb@comkey.com.au>
Cc: Thomas Schuerger <schuerge@wurzelausix.CS.Uni-SB.DE>, freebsd-questions@FreeBSD.ORG, freebsd-bugs@FreeBSD.ORG
Subject: Re: Scheduling bug?
Message-ID: <199903131709.KAA05189@pluto.plutotech.com>
In-Reply-To: Your message of "Sat, 13 Mar 1999 13:51:57 +1000." <19990313035157.4099.qmail@alpha.comkey.com.au>
You may want to take a look at the changes recently made in NetBSD to
address these kinds of issues. Here's the text explaining the bulk of
the change:

Committed By: ross
Date: Tue Feb 23 02:56:04 UTC 1999

Modified Files:
	src/sys/kern: kern_clock.c kern_synch.c
	src/sys/arch/alpha/alpha: interrupt.c
Added Files:
	src/sys/sys: sched.h

Log Message:
Scheduler bug fixes and reorganization
* fix the ancient nice(1) bug, where nice +20 processes incorrectly
  steal 10 - 20% of the CPU (or even more, depending on load average)
* provide a new schedclk() mechanism at a new clock at schedhz, so high
  platform hz values don't cause nice +0 processes to look like they
  are niced
* change the algorithm slightly, and reorganize the code a lot
* fix percent-CPU calculation bugs, and eliminate some no-op code

=== nice bug ===
Correctly divide the scheduler queues between niced and compute-bound
processes. The current nice weight of two (sort of, see `algorithm
change' below) neatly divides the USRPRI queues in half; this should
have been used to clip p_estcpu, instead of UCHAR_MAX. Besides being
the wrong amount, clipping an unsigned char to UCHAR_MAX is a no-op,
and it was done after decay_cpu(), which can only _reduce_ the value.
It has to be kept <= NICE_WEIGHT * PRIO_MAX - PPQ or processes can
scheduler-penalize themselves onto the same queue as nice +20
processes. (Or even a higher one.)

=== New schedclk() mechanism ===
Some platforms should be cutting down stathz before hitting the
scheduler, since the scheduler algorithm only works right in the
vicinity of 64 Hz. Rather than prescale hz, then scale back and forth
by 4 every time p_estcpu is touched (each occurrence an abstraction
violation), use p_estcpu without scaling and require schedhz to be
generated directly at the right frequency. Use a default of stathz
(well, actually, profhz) / 4, so nothing changes unless a platform
defines schedhz and a new clock. Define these for alpha, where
hz==1024, and nice was totally broken.
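[Editorial illustration, not part of the commit.] The clipping rule from the "nice bug" section above can be sketched in a few lines of C. This is only a minimal sketch under stated assumptions: the log gives the nice weight as two and the limit as NICE_WEIGHT * PRIO_MAX - PPQ, but the values PRIO_MAX = 20 and PPQ = 4, and the helper names clip_estcpu() and queue_offset(), are assumptions made for the example and are not taken from the actual NetBSD sources. Following the schedclk() section, p_estcpu is used here without the divide-by-4 scaling.

```c
/* Assumed constants; only NICE_WEIGHT == 2 is stated in the log message. */
#define NICE_WEIGHT 2    /* priority points per nice step (from the log) */
#define PRIO_MAX    20   /* assumed: largest nice value */
#define PPQ         4    /* assumed: priorities per run queue */

/* Upper bound for p_estcpu as given in the log message. */
#define ESTCPU_LIMIT (NICE_WEIGHT * PRIO_MAX - PPQ)

/* Hypothetical helper: clip the CPU-usage estimate to the limit. */
static unsigned int
clip_estcpu(unsigned int estcpu)
{
	return estcpu > ESTCPU_LIMIT ? ESTCPU_LIMIT : estcpu;
}

/*
 * Hypothetical helper: run-queue offset above the user base priority,
 * using unscaled p_estcpu plus the nice weight.
 */
static unsigned int
queue_offset(unsigned int estcpu, int nice)
{
	return (estcpu + NICE_WEIGHT * nice) / PPQ;
}
```

Under these assumptions the limit is 36, so a compute-bound nice +0 process tops out at queue offset 9, one queue better than an idle nice +20 process at offset 10. Clipping at UCHAR_MAX (255) instead would let the nice +0 process sink to offset 63, far below the niced one.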
=== Algorithm change ===
The nice value used to be added to the exponentially-decayed scheduler
history value p_estcpu, in _addition_ to being incorporated directly
(with greater weight) into the priority calculation. At first glance,
it appears to be a pointless increase of 1/8 the nice effect
(pri = p_estcpu/4 + nice*2), but it's actually at least 3x that,
because it will ramp up linearly but be decayed only exponentially,
thus converging to an additional .75 nice for a load average of one.
I killed this; it makes the behavior hard to control, almost
impossible to analyze, and the effect (approximately nothing for the
first second, then somewhat increased niceness after three seconds or
more, depending on load average) pointless.

=== Other bugs ===
hz -> profhz in the p_pctcpu = f(p_cpticks) calculation.

Collect scheduler functionality. Try to put each abstraction in just
one place.

--
Justin

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message
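[Editorial illustration, not part of the message.] The ".75 nice" figure in the "Algorithm change" section above can be reproduced with a small C sketch. It rests on assumptions the message does not spell out: that the old code added nice to p_estcpu once per second, and that the decay factor is the classic BSD filter 2*load / (2*load + 1). The function name nice_contribution() is hypothetical, and floating point is used for clarity where a kernel would use fixed-point arithmetic.

```c
/*
 * Hypothetical model of the removed behavior: each second, p_estcpu is
 * decayed and then has the nice value added to it. Returns the resulting
 * extra priority, using the pri = p_estcpu/4 + nice*2 formula quoted in
 * the message (only the p_estcpu/4 term is returned here).
 */
static double
nice_contribution(double loadavg, int nice, int seconds)
{
	/* Assumed decay filter: 2*load / (2*load + 1). */
	double decay = (2.0 * loadavg) / (2.0 * loadavg + 1.0);
	double estcpu = 0.0;
	int i;

	for (i = 0; i < seconds; i++)
		estcpu = decay * estcpu + nice;	/* linear add, exponential decay */

	return estcpu / 4.0;
}
```

With a load average of one the decay factor is 2/3, so p_estcpu settles at 3 * nice and the extra priority at 3 * nice / 4, that is, .75 nice. For example, nice_contribution(1.0, 20, 60) comes out at very nearly 15, compared with the 40 priority points that nice*2 contributes directly.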