Date:      Sat, 20 Mar 2004 16:31:39 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        freebsd-stable@freebsd.org
Subject:   Re: Urk, I take it back (was Re: Bug in p_estcpu handling on process exit in FBsd-4.x)
Message-ID:  <200403210031.i2L0Vdoc096697@apollo.backplane.com>
References:  <200403201941.i2KJf6Ml095658@apollo.backplane.com> <200403202244.i2KMiRth096273@apollo.backplane.com>

     All right, I figured out a solution.  Basically the solution for the
     4.x scheduler (and the 4BSD scheduler in 5.x for people still using
     it) is to bump the child's estcpu in fork and recover any delta changes
     back to the parent in exit.  The DFly patch set is rather DFly specific,
     so I will just explain it in case someone in FreeBSD land wants to fix
     the problem in FreeBSD-4.

     In sys/proc.h, in the proc structure:

	 u_int   p_estcpu;        /* Time averaged value of p_cpticks. */
ADDME	 u_int   p_estcpu_fork;

     In kern/kern_fork.c, in fork1(), search for 'p_estcpu'.  You will
     find the line:

REMOVEME	 p2->p_estcpu = p1->p_estcpu;

     Replace it with:

ADDME		 p2->p_estcpu_fork = p2->p_estcpu =
ADDME			 ESTCPULIM(p1->p_estcpu + ESTCPURAMP);

     This will initialize a new fork()'d child with an estcpu that gives it
     a slightly more 'batch' priority than its parent.  If the fork()'d
     child is an interactive process, the normal scheduling mechanisms will
     float estcpu back down.  This prevents new batch children from jerking
     around interactive processes in the first few ticks of their operation,
     and should have no significant effect on interactive children because
     other interactive processes will not be eating all the cpu (so there is
     cpu available), and any pre-existing batch processes will already
     likely have far higher p_estcpu values.

     On the exit side, instead of trying to average the child's estcpu into
     the parent or trying to slap it in (to deal with batch scripts such as
     make, which do a lot of recursive fork/execs), just aggregate
     the difference relative to the saved p_estcpu_fork into the parent,
     though only if the child was found to be batch above and beyond the
     p_estcpu[_fork] that was originally assigned to it.  Otherwise the
     parent's estcpu is allowed to stand on its own.  The old FreeBSD code
     would do terrible things to forking servers like sendmail.  The
     new code should work much better.

     In kern/kern_exit.c, in wait1(), replace this:

REMOVEME         /* charge childs scheduling cpu usage to parent */
REMOVEME         if (curproc->p_pid != 1) {
REMOVEME         	curproc->p_estcpu =
REMOVEME         	    ESTCPULIM(curproc->p_estcpu + p->p_estcpu);
REMOVEME         }

    With this (note that 'q' is the same as 'curproc', so there is no
    reason to reference 'curproc' when we can just use 'q'):

ADDME		/*
ADDME		 * Charge the parent for the child's change in
ADDME		 * estimated cpu as of when the child exits to
ADDME		 * account for batch scripts, large make's, etc.
ADDME		 */
ADDME		if (q->p_pid != 1) {
ADDME		    if (p->p_estcpu > p->p_estcpu_fork) {
ADDME			q->p_estcpu = ESTCPULIM(q->p_estcpu +
ADDME					p->p_estcpu - p->p_estcpu_fork);
ADDME		    }
ADDME		}

    That should do it.  It seems to do a very good job in DragonFly.  If
    anyone wants to do the work in FreeBSD, I of course recommend that you
    test it.  YMMV.
	
					-Matt
					Matthew Dillon 
					<dillon@backplane.com>


