Date: Mon, 5 Feb 2001 01:45:08 +1100 (EST) From: Bruce Evans <bde@zeta.org.au> To: Cejka Rudolf <cejkar@dcse.fee.vutbr.cz> Cc: Sheldon Hearn <sheldonh@uunet.co.za>, freebsd-current@FreeBSD.ORG Subject: Re: Does task scheduler work correctly? (... nice bug fix) Message-ID: <Pine.BSF.4.21.0102042312190.9576-100000@besplex.bde.org> In-Reply-To: <20010201201413.A55503@dcse.fee.vutbr.cz>
next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, 1 Feb 2001, Cejka Rudolf wrote: > Sheldon Hearn wrote (2001/02/01): > > What I remember of the discussions that surrounded this one, your > > summary is correct. The only thing is that nice isn't so much _broken_ > > as it just isn't doing what you'd expect it to. :-) > > Ok, scheduler in -current is not broken. But I'm afraid that in -stable Nah, it is broken. > it is - can (niced) process cause a lock of machine?. Currently, we > have dual processor box with 4.2-STABLE and it silently locks too often. > With current scheduler in mind, it is hard to say if I should search in > HW or SW for a potential fix... I think only processes with niceness less than -5 can cause lockups. This is because they get assigned priorities below some kernel priorities, so priority inversion is possible. I think priority inversions between processes running in user mode are impossible (because low-priority (higher priority numbered) processes can't block in user mode). > > I don't think any of the FreeBSD manual pages suggest that nice 20 > > processes aren't supposed to get _any_ CPU time. > > Maybe. But there are some conventions and two-process sensitivity > 2.5:1 is not very big (is low). Solaris and Linux have much bigger > ratio (sufficient). So why FreeBSD has to be so much different? Just a bug. The relative weights of the priorities are mostly determined accidentally by the following formula in resetpriority(): newpriority = PUSER + p->p_estcpu / INVERSE_ESTCPU_WEIGHT + NICE_WEIGHT * (p->p_nice - PRIO_MIN); For process mixes consisting of only long-running cpu hog processes, the priorities tend to a common limit (this is fairly obvious -- lower priority processes get run more; this increases their priority until it is not lower, so the priority of all the processes is similar; also, correct design of the algorithm that decays p_estcpu should ensure that the common priority doesn't hunt). For the simple case of 2 long-running cpu hog processes with nicenesses n1 and n2, the priority for both tends towards a limit of approximately 104 (see ps output). This corresponds to the (p_estcpu / weight) term for the process with the lowest nice value tending towards its maximum value of (PRIO_MAX - PRIO_MIN - PPQ == 36); the maximum is nearly reached because of correct design of of the decay algorithm, and actually reached because of bugs). The p_estcpu's for the two processes can be calculated from this: 104 = 48 + p->p_estcpu / 8 + 1 * (p->p_nice + 20) p->p_estcpu = (36 - p->p_nice) * 8 In the limit, the runtimes are to a first approximation proportional to p_estcpu (higher terms depend on the decay algorithm), so for processes with niceness n0 and n1, the relative runtimes are approximately: (36 - n0) : (36 - n1) This formula is neither simple nor good. For n0 = 0 and n1 = 20, it gives the ratio of 36:16, which is close to the 2.5:1 ratio observed by Cejka. I observe a ratio closer to 3:1 for the runtimes, and the formula gives almost the exact ratio for the p_estcpu's. The formula breaks down near (n0 = -16, n1 = 20), since the p_estcpu term can never raise the priority by more than 36, so the ratio is 1:0 for n0 < - 16 and n1 = 20. This is a bug. > Insensitivity of nice were problem in the past and it is going back. For RELENG_4 before I imported the fixes from NetBSD, the corresponding calculations are: 127 = 50 + p->p_estcpu / 4 + 2 * p->p_nice p->p_estcpu = (38.5 - p->p_nice) * 8 ratio = (38.5 - n0) : (38.5 - n1) ratio for nicenesses (0, 20) = 38.5:18.5 so -current is almost perfectly bug for bug compatible in this area with RELENG_4! :-( In RELENG_4, the corresponding calculations are: 86 = 50 + p->p_estcpu / 8 + 2 * p->p_nice p->p_estcpu = (18 - p->p_nice) * 16 ratio = (18 - n0) : (18 - n1) ratio for nicenesses (0, 20) = 18:-2 (formula invalid, actually 1:0) ratio for nicenesses (0, 17) = 18:1 Quick fix: --- Index: kern/kern_synch.c =================================================================== RCS file: /home/ncvs/src/sys/kern/kern_synch.c,v retrieving revision 1.124 diff -c -2 -r1.124 kern_synch.c *** kern/kern_synch.c 2001/01/31 04:29:51 1.124 --- kern/kern_synch.c 2001/02/03 15:16:44 *************** *** 1039,1044 **** mtx_enter(&sched_lock, MTX_SPIN); if (p->p_rtprio.type == RTP_PRIO_NORMAL) { ! newpriority = PUSER + p->p_estcpu / INVERSE_ESTCPU_WEIGHT + ! NICE_WEIGHT * (p->p_nice - PRIO_MIN); newpriority = min(newpriority, MAXPRI); p->p_usrpri = newpriority; --- 1039,1046 ---- mtx_enter(&sched_lock, MTX_SPIN); if (p->p_rtprio.type == RTP_PRIO_NORMAL) { ! newpriority = PUSER + ! (PRIO_MAX - PRIO_MIN + 1) * ! (p->p_estcpu / INVERSE_ESTCPU_WEIGHT) / ! (PRIO_MAX - p->p_nice + 1); newpriority = min(newpriority, MAXPRI); p->p_usrpri = newpriority; Index: sys/proc.h =================================================================== RCS file: /home/ncvs/src/sys/sys/proc.h,v retrieving revision 1.145 diff -c -2 -r1.145 proc.h *** sys/proc.h 2001/01/31 04:29:52 1.145 --- sys/proc.h 2001/02/03 15:56:49 *************** *** 493,505 **** * XXX macros for scheduler. Shouldn't be here, but currently needed for * bounding the dubious p_estcpu inheritance in wait1(). * INVERSE_ESTCPU_WEIGHT is only suitable for statclock() frequencies in * the range 100-256 Hz (approximately). */ #define ESTCPULIM(e) \ ! min((e), INVERSE_ESTCPU_WEIGHT * (NICE_WEIGHT * (PRIO_MAX - PRIO_MIN) - \ ! PPQ) + INVERSE_ESTCPU_WEIGHT - 1) ! #define INVERSE_ESTCPU_WEIGHT 8 /* 1 / (priorities per estcpu level). */ ! #define NICE_WEIGHT 1 /* Priorities per nice level. */ ! #define PPQ (128 / NQS) /* Priorities per queue. */ struct mtx; --- 493,505 ---- * XXX macros for scheduler. Shouldn't be here, but currently needed for * bounding the dubious p_estcpu inheritance in wait1(). + * XXX PPQ no longer needed for that. * INVERSE_ESTCPU_WEIGHT is only suitable for statclock() frequencies in * the range 100-256 Hz (approximately). */ #define ESTCPULIM(e) \ ! min((e), (INVERSE_ESTCPU_WEIGHT * (MAXPRI + 1 - PUSER) - 1) * \ ! (PRIO_MAX + 1) / (PRIO_MAX - PRIO_MIN + 1)) ! #define INVERSE_ESTCPU_WEIGHT 16 /* Resol. of estcpu for priorities. */ ! #define PPQ ((MAXPRI + 1) / NQS) /* Priorities per queue. */ struct mtx; --- This works by making the steady-state value of p_estcpu proportional to (PRIO_MAX - p->p_nice + 1) = (21 - p->p_nice). This gives interestingly different ramping up of the priorities: the initial priority is now independent of p_nice, so niced processes aren't completely locked out by short-lived unniced processes. I have a better fix that uses a table of multipliers instead of the divisor (PRIO_MAX - p->p_nice + 1) in the above. It is unfinished in different ways. Correctly limiting p_estcpu and/or scaling is a problem in all versions. Bruce To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-current" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?Pine.BSF.4.21.0102042312190.9576-100000>