From: Matthew Dillon
Date: Sat, 20 Mar 2004 16:31:39 -0800 (PST)
Message-Id: <200403210031.i2L0Vdoc096697@apollo.backplane.com>
To: freebsd-stable@freebsd.org
References: <200403201941.i2KJf6Ml095658@apollo.backplane.com> <200403202244.i2KMiRth096273@apollo.backplane.com>
Subject: Re: Urk, I take it back (was Re: Bug in p_estcpu handling on process exit in FBsd-4.x)

All right, I figured out a solution. Basically the solution for the
4.x scheduler (and the 4BSD scheduler in 5.x for people still using it)
is to bump the child's estcpu in fork and recover any delta changes
back to the parent in exit.

The DFly patch set is rather DFly specific, so I will just explain it
in case someone in FreeBSD land wants to fix the problem in FreeBSD-4.

In sys/proc.h, in the proc structure:

        u_int   p_estcpu;       /* Time averaged value of p_cpticks. */
ADDME   u_int   p_estcpu_fork;

In kern/kern_fork.c, in fork1(), search for 'p_estcpu'. You will find
the line:

REMOVEME        p2->p_estcpu = p1->p_estcpu;

Replace it with:

ADDME   p2->p_estcpu_fork = p2->p_estcpu =
ADDME       ESTCPULIM(p1->p_estcpu + ESTCPURAMP);

This will initialize a new fork()'d child with an estcpu that gives it
a slightly more 'batch' priority than its parent. If the fork()'d child
is an interactive process, the normal scheduling mechanisms will float
estcpu back down. This prevents new batch children from jerking around
interactive processes in the first few ticks of their operation. It
should have no significant effect on interactive children, because
other interactive processes will not be eating all the cpu (so there is
cpu available), and any pre-existing batch processes will already
likely have far higher p_estcpu values.

On the exit side, instead of trying to average the child's estcpu into
the parent or trying to slap it in (to deal with batch scripts, e.g.
like make, which do a lot of recursive fork/exec's), just aggregate the
difference relative to the saved p_estcpu_fork into the parent, though
only if the child was found to be batch above and beyond the
p_estcpu[_fork] that was originally assigned to it. Otherwise the
parent's estcpu is allowed to stand on its own.

The old FreeBSD code would do terrible things to forking servers like
sendmail(). The new code should work much better.

In kern/kern_exit.c, in wait1(), replace this:

REMOVEME        /* charge childs scheduling cpu usage to parent */
REMOVEME        if (curproc->p_pid != 1) {
REMOVEME                curproc->p_estcpu =
REMOVEME                    ESTCPULIM(curproc->p_estcpu + p->p_estcpu);
REMOVEME        }

With this (note that 'q' is the same as 'curproc', so there is no
reason to reference 'curproc' when we can just use 'q'):

ADDME   /*
ADDME    * Charge the parent for the child's change in
ADDME    * estimated cpu as of when the child exits to
ADDME    * account for batch scripts, large make's, etc.
ADDME    */
ADDME   if (q->p_pid != 1) {
ADDME           if (p->p_estcpu > p->p_estcpu_fork) {
ADDME                   q->p_estcpu = ESTCPULIM(q->p_estcpu +
ADDME                       p->p_estcpu - p->p_estcpu_fork);
ADDME           }
ADDME   }

That should do it. It seems to do a very good job in DragonFly. If
anyone wants to do the work in FreeBSD, I of course recommend that you
test it. YMMV.

                                        -Matt
                                        Matthew Dillon
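
As a rough illustration of the fork/exit accounting described in this
message, here is a minimal userland sketch in C. It is not the DragonFly
patch and not kernel code: struct sim_proc, sim_fork(), sim_exit(), and
the values chosen for ESTCPUMAX and ESTCPURAMP are assumptions made up
for the example. The real ESTCPULIM limit is derived from the kernel's
priority constants.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, for illustration only. */
#define ESTCPUMAX    255u   /* hypothetical upper bound on estcpu */
#define ESTCPURAMP   4u     /* hypothetical bump applied to a new child */
#define ESTCPULIM(e) ((e) < ESTCPUMAX ? (e) : ESTCPUMAX)

/* Hypothetical stand-in for the three proc fields the message touches. */
struct sim_proc {
    unsigned pid;
    unsigned estcpu;        /* plays the role of p_estcpu */
    unsigned estcpu_fork;   /* plays the role of p_estcpu_fork */
};

/*
 * Fork side: the child starts slightly more 'batch' than its parent,
 * and the starting value is saved so exit can compute a delta later.
 */
static void
sim_fork(const struct sim_proc *parent, struct sim_proc *child)
{
    assert(parent != NULL && child != NULL);
    child->pid = parent->pid + 1;   /* arbitrary pid for the sketch */
    child->estcpu_fork = child->estcpu =
        ESTCPULIM(parent->estcpu + ESTCPURAMP);
}

/*
 * Exit side: charge the parent only for what the child accumulated
 * above its saved starting value. Pid 1 is never charged.
 */
static void
sim_exit(struct sim_proc *parent, const struct sim_proc *child)
{
    assert(parent != NULL && child != NULL);
    if (parent->pid != 1 && child->estcpu > child->estcpu_fork) {
        parent->estcpu = ESTCPULIM(parent->estcpu +
            child->estcpu - child->estcpu_fork);
    }
}
```

With these assumed constants, a parent at estcpu 10 forks a child that
starts at 14. If the child exits at 3 (it behaved interactively), the
parent stays at 10. If the child exits at 40 (it behaved like a batch
job), the parent is charged the difference of 26 and ends at 36.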