FreeBSD Mail Archives

Date:      Sat, 22 Dec 2001 23:14:04 +1100 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Jake Burkholder <jake@locore.ca>
Cc:        Luigi Rizzo <rizzo@aciri.org>, John Baldwin <jhb@FreeBSD.ORG>, <current@FreeBSD.ORG>, Peter Wemm <peter@wemm.org>
Subject:   Re: vm_zeropage priority problems.
Message-ID:  <20011222213623.J7890-100000@gamplex.bde.org>
In-Reply-To: <20011222031349.B62219@locore.ca>


On Sat, 22 Dec 2001, Jake Burkholder wrote:

> Apparently, On Sat, Dec 22, 2001 at 06:48:26PM +1100,
> 	Bruce Evans said words to the effect of;
> > Index: kern_synch.c
> > ===================================================================
> > RCS file: /home/ncvs/src/sys/kern/kern_synch.c,v
> > retrieving revision 1.167
> > diff -u -2 -r1.167 kern_synch.c
> > --- kern_synch.c	18 Dec 2001 00:27:17 -0000	1.167
> > +++ kern_synch.c	19 Dec 2001 16:01:26 -0000
> > @@ -936,18 +1058,18 @@
> >  	struct thread *td;
> >  {
> > -	struct kse *ke = td->td_kse;
> > -	struct ksegrp *kg = td->td_ksegrp;
> > +	struct ksegrp *kg;
> >
> > -	if (td) {
> > -		ke->ke_cpticks++;
> > -		kg->kg_estcpu = ESTCPULIM(kg->kg_estcpu + 1);
> > -		if ((kg->kg_estcpu % INVERSE_ESTCPU_WEIGHT) == 0) {
> > -			resetpriority(td->td_ksegrp);
> > -			if (kg->kg_pri.pri_level >= PUSER)
> > -				kg->kg_pri.pri_level = kg->kg_pri.pri_user;
> > -		}
> > -	} else {
> > +	if (td == NULL)
> >  		panic("schedclock");
> > -	}
> > +	td->td_kse->ke_cpticks++;
> > +	kg = td->td_ksegrp;
> > +#ifdef NEW_SCHED
> > +	kg->kg_estcpu += niceweights[kg->kg_nice - PRIO_MIN];
> > +#else
> > +	kg->kg_estcpu++;
> > +#endif
> > +	resetpriority(kg);
> > +	if (kg->kg_pri.pri_level >= PUSER)
> > +		kg->kg_pri.pri_level = kg->kg_pri.pri_user;
> >  }
>
> I'm curious why you removed the ESTCPULIM and INVERSE_ESTCPU_WEIGHT
> calculations even in the OLD_SCHED case.  Do these turn out to have
> no effect in general?

ESTCPULIM basically breaks scheduling if it is hit (clipping to it
prevents accumulation of hog points that would cause cpu hogs to be
run less).  This is a problem in practice.  I use dynamic limits even
in the !NEW_SCHED case.  I forgot that I did this or I would have included
more context to show it (see below).  kg->kg_estcpu is allowed to grow
without explicit limit and scaled to fit in the priority range.  This
requires fixing sorcerer's-apprentice growth of kg_estcpu in fork()
and exit().  kg_estcpu has natural limits but they are quite large
(a constant multiple of the load average).
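
The effect of clipping can be seen in a small user-space sketch. This is
not the kernel code: LIMIT, tick_clipped(), tick_unclipped() and charge()
are hypothetical names, and the value 100 is arbitrary. It only shows that
once a static ceiling is reached, further cpu use earns no more hog points.

```c
#include <stdint.h>

/* Hypothetical static ceiling, standing in for the value ESTCPULIM clips to. */
#define	LIMIT	100u

/* Old style: charge one tick, then clip to the static ceiling. */
uint32_t
tick_clipped(uint32_t estcpu)
{
	return (estcpu + 1 > LIMIT ? LIMIT : estcpu + 1);
}

/* Charge one tick with no explicit limit; scaling is left for later. */
uint32_t
tick_unclipped(uint32_t estcpu)
{
	return (estcpu + 1);
}

/* Charge n ticks using the given accumulation rule. */
uint32_t
charge(uint32_t (*tick)(uint32_t), uint32_t estcpu, unsigned n)
{
	while (n-- > 0)
		estcpu = tick(estcpu);
	return (estcpu);
}
```

With these definitions, charge(tick_clipped, 0, 150) and
charge(tick_clipped, 0, 300) both return 100, so the two hogs become
indistinguishable, while the unclipped rule keeps them apart at 150 and 300.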

INVERSE_ESTCPU_WEIGHT is not used because it goes with static scaling,
and the "% INVERSE_ESTCPU_WEIGHT" optimization (which depends on the
internals of resetpriority()) is not so easy to do with dynamic scaling.

Here are the corresponding changes for resetpriority():

%%%
Index: kern_synch.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_synch.c,v
retrieving revision 1.167
diff -u -2 -r1.167 kern_synch.c
--- kern_synch.c	18 Dec 2001 00:27:17 -0000	1.167
+++ kern_synch.c	22 Dec 2001 07:34:15 -0000
@@ -844,15 +949,32 @@
 	register struct ksegrp *kg;
 {
+	u_int estcpu;
 	register unsigned int newpriority;

 	mtx_lock_spin(&sched_lock);
 	if (kg->kg_pri.pri_class == PRI_TIMESHARE) {
-		newpriority = PUSER + kg->kg_estcpu / INVERSE_ESTCPU_WEIGHT +
+		estcpu = kg->kg_estcpu;
+		if (estcpu > estcpumax)
+			estcpu = estcpumax;
+#ifdef NEW_SCHED
+		newpriority = PUSER +
+		    (((u_int64_t)estcpu * estcpumul) >> ESTCPU_SHIFT);
+#else
+		newpriority = PUSER +
+		    (((u_int64_t)estcpu * estcpumul) >> ESTCPU_SHIFT) +
 		    NICE_WEIGHT * (kg->kg_nice - PRIO_MIN);
-		newpriority = min(max(newpriority, PRI_MIN_TIMESHARE),
-		    PRI_MAX_TIMESHARE);
+#endif
+		if (newpriority < PUSER)
+			newpriority = PUSER;
+		if (newpriority > PRI_MAX_TIMESHARE) {
+			Debugger("newpriority botch");
+			newpriority = PRI_MAX_TIMESHARE;
+		}
 		kg->kg_pri.pri_user = newpriority;
-	}
-	maybe_resched(kg);
+		maybe_resched(kg, newpriority);
+	} else
+		/* XXX doing anything here is dubious. */
+		/* XXX was: need_resched(). */
+		maybe_resched(kg, kg->kg_pri.pri_user);
 	mtx_unlock_spin(&sched_lock);
 }
%%%
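
The diff uses estcpumul, estcpumax and ESTCPU_SHIFT without showing how they
are defined. The following is a minimal sketch of one way such fixed-point
scaling could work, not the actual implementation. Assumptions: ESTCPU_SHIFT
is 16, the user range is the 180-223 mentioned later in this mail, and
estcpumul is derived from estcpumax so that estcpumax maps to (just under)
the top of the range. compute_estcpumul() and scale_priority() are
hypothetical helpers.

```c
#include <stdint.h>

#define	PUSER			180	/* low end of the user range (from the mail) */
#define	PRI_MAX_TIMESHARE	223	/* high end of the user range (from the mail) */
#define	ESTCPU_SHIFT		16	/* assumed number of fraction bits */

/*
 * Assumed derivation: a fixed-point multiplier that maps [0, estcpumax]
 * onto the width of the timeshare range.  estcpumax must be nonzero.
 */
uint32_t
compute_estcpumul(uint32_t estcpumax)
{
	uint32_t width = PRI_MAX_TIMESHARE - PUSER;

	return ((width << ESTCPU_SHIFT) / estcpumax);
}

/* Clip to the dynamic limit, then scale into the priority range. */
uint32_t
scale_priority(uint32_t estcpu, uint32_t estcpumax)
{
	uint32_t estcpumul = compute_estcpumul(estcpumax);

	if (estcpu > estcpumax)
		estcpu = estcpumax;
	return (PUSER +
	    (uint32_t)(((uint64_t)estcpu * estcpumul) >> ESTCPU_SHIFT));
}
```

Because both the division and the shift round down, the result never exceeds
PRI_MAX_TIMESHARE under these assumptions. The 64-bit intermediate mirrors
the (u_int64_t) cast in the diff and avoids overflow in the multiplication.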

> > Most of the changes here are to fix style bugs.  In the NEW_SCHED case,
> > the relative weights for each priority are determined by the niceweights[]
> > table.  kg->kg_estcpu is limited only by INT_MAX and priorities are
> > assigned according to relative values of kg->kg_estcpu (code for this is
> > not shown).  The NEW_SCHED case has not been tried since before SMPng
> > broke scheduling some more by compressing the priority ranges.
>
> It is relatively easy to uncompress the priority ranges if that is
> desirable.  What range is best?

The original algorithm works best with something close to the old range
of 50-127 (PUSER = 50, MAXPRI = 127) for positively niced processes alone.
This gives unniced processes a priority range of 50-127 and permits
nice -20'ed processes to have a much larger (numerically) base priority
than unniced ones while still allowing room for their priority to grow
(range 90-127).  Negatively niced processes were handled dubiously at
best (they ran into the kernel priorities).  Brian Feldman reduced the
priority range for unniced processes to 68-127 and you reduced it some
more to 180-223.

The main problem with the reduced ranges is that the algorithm gives
approximately an exponential dependency of the cpu cycles allocated
to a process on the process's niceness.  The base for the exponential
is invisible and hard to change, so decreasing the range by a factor
of 78/44 significantly reduces the effects of niceness.  I think my
niceweights[] algorithm can handle this.  It supports almost any dependency
of cycles on niceness.  However, I don't know how it can be made to
work right for the entire priority range.  An exponential dependency
would grow too fast for the range 0-255 if it grows fast enough for
the user range 180-223.
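
As a rough illustration of the 78/44 effect, under a simplified model that
is an assumption and not taken from the scheduler source: suppose the ratio
of cpu cycles between an unniced and a fully niced process is exponential in
the width W of the priority range, with a fixed base b > 1 and a constant k.
The starting ratio of 32 below is purely illustrative.

```latex
R(W) = b^{kW}
\quad\Longrightarrow\quad
R(44) = b^{44k} = \left(b^{78k}\right)^{44/78} = R(78)^{44/78}

\text{Example: } R(78) = 32
\;\Longrightarrow\;
R(44) = 32^{44/78} = 2^{5 \cdot 44/78} \approx 2^{2.82} \approx 7.1
```

In this model, a niceness effect of 32:1 over the old 78-level range shrinks
to roughly 7:1 over 44 levels, because the base cannot be adjusted to
compensate.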

I used the following program to generate (old) niceweights[] tables.
Defining EXP gives an exponential table with niceness 0 having 32
times as much priority as niceness 20.  The default approximates the
old -current behaviour (which isn't actually exponential).

%%%
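#include <math.h>
#include <stdio.h>

/*
 * Print a 41-entry niceweights[] initializer, 8 entries per line.
 * Index i corresponds to niceness i + PRIO_MIN (-20..20).
 */
int
main(void)
{
	int i;

	for (i = 0; i <= 40; i++) {
		if (i % 8 == 0)
			printf("\t");
#ifdef EXP
		/* The weight doubles every 4 nice levels. */
		printf("%d,", (int)floor(2 * 3 * pow(2.0, i / 4.0) + 0.5));
#else
		if (i == 40)
			printf("65536\n");
		else
			printf("%d,", 2 * 2 * 2 * 3 * 3 * 5 * 7 / (40 - i));
#endif
		if (i % 8 == 7)
			printf("\n");
		else
			printf(" ");
	}
	return (0);
}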
%%%
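
For orientation, here is a sketch of how the default (non-EXP) weights
relate to the per-tick charge in the NEW_SCHED branch of schedclock() quoted
above, "kg->kg_estcpu += niceweights[kg->kg_nice - PRIO_MIN]". The helpers
default_weight() and charge_tick() are hypothetical; they recompute the
table entry on the fly instead of indexing a real niceweights[] array.

```c
#ifndef PRIO_MIN
#define	PRIO_MIN	(-20)	/* lowest niceness; table index 0 */
#endif

/* Default table entry for index i in 0..40: 2520 / (40 - i), 65536 at 40. */
unsigned
default_weight(int i)
{
	return (i == 40 ? 65536u : 2520u / (unsigned)(40 - i));
}

/* One clock tick of cpu use charged to a process with the given niceness. */
unsigned
charge_tick(unsigned estcpu, int nice)
{
	return (estcpu + default_weight(nice - PRIO_MIN));
}
```

Under this default table a tick costs 63 points at niceness -20, 126 at
niceness 0 and 2520 at niceness 19, so niced processes accumulate estcpu
faster and end up with numerically larger (worse) priorities.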

Bruce

