From owner-cvs-all@FreeBSD.ORG Wed Jun 23 08:12:37 2004
Date: Wed, 23 Jun 2004 18:12:17 +1000 (EST)
From: Bruce Evans
To: Julian Elischer
Message-ID: <20040623172902.C57766@gamplex.bde.org>
cc: cvs-src@freebsd.org
cc: src-committers@freebsd.org
cc: David Xu
cc: cvs-all@freebsd.org
cc: Bruce Evans
Subject: Re: cvs commit: src/sys/kern kern_exit.c

On Tue, 22 Jun 2004, Julian Elischer wrote:

> On Wed, 23 Jun 2004, David Xu wrote:
>
> > Bruce Evans wrote:
> > >bde         2004-06-21 14:49:50 UTC
> > >
> > >  FreeBSD src repository
> > >
> > >  Modified files:
> > >    sys/kern             kern_exit.c
> > >  Log:
> > >  (1) Removed the bogus condition "p->p_pid != 1" on calling sched_exit()
> > >  from exit1().  sched_exit() must be called unconditionally from exit1().
> > >  It was called almost unconditionally because the only exits on system
> > >  shutdown if at all.
> > >
> > >  (2) Removed the comment that presumed to know what sched_exit() does.
> > >  sched_exit() does different things for the ULE case.  The call became
> > >  essential when it started doing load average stuff, but its caller
> > >  should not know that.
> >
> > But this change loses a semantic, in most time, init is waiting there
> > to recycle runaway processes, those process were not created by init,
> > if you call sched_exit for init unconditionally, the runaway process's
> > cpu usage are all merged into init, this is unfair for init, is there

Er, the unquoted clause (3) in the commit log says this and more.  Priority
merging is unfair to all parent processes.  It's more of a problem for
shells.

> > any benefit to lower init's priority under load to slow down recycling
> > speed ? I don't think so. I think scheduler's sched_exit should be
> > fixed at same time to keep this semantic.

I may fix this.  I used to just remove the cpu merging in exit and cpu
inheritance in fork.  I think it was originally to limit creation of new
processes for a special application (wcarchive forking ftpd's).  It worked
too well to limit creation of new processes in general.  When it was
committed, there was no ESTCPULIM to limit growth of cpu.  fork/exec grows
the cpu in a fake way, so it wants to be exponential in the number of
children and could grow to 2^30 after just 30 fork+execs and then overflow
to 2^31 on the next one.  Once it got to a few hundred, it gave maximal
(numeric) priority so processes tended not to run; however, if they did
manage to fork-exec a few more times, their cpu could reach 2^30 and then it
took a _long_ time for it to decay back below a few hundred so that the
process could run again in competition with processes with normal
cpu/priority.  This caused mysterious multi-second hangs in shells.
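
The doubling described above can be sketched with a toy model.  This is a
minimal sketch under stated assumptions, not kernel code: it assumes the
child inherits an exact copy of the parent's estcpu at fork, accrues no cpu
of its own, and is added back in full at exit with no ESTCPULIM clamp.  The
names merge_sum() and fork_exit_cycles() are hypothetical, and a 64-bit
counter is used so that the point where a 32-bit signed value would overflow
stays visible.

```c
#include <stdint.h>

/*
 * Hypothetical model of the old exit-time merge: the whole of the
 * child's estcpu is added to the parent's, with no clamp.
 */
static uint64_t
merge_sum(uint64_t parent, uint64_t child)
{

	return (parent + child);
}

/*
 * Run n fork+exit cycles starting from estcpu = start.  Assumption:
 * the child gets a copy of the parent's estcpu at fork and uses no
 * real cpu before exiting, so each cycle doubles the parent's value.
 */
static uint64_t
fork_exit_cycles(uint64_t start, int n)
{
	uint64_t parent, child;
	int i;

	parent = start;
	for (i = 0; i < n; i++) {
		child = parent;				/* fork: inherit a copy */
		parent = merge_sum(parent, child);	/* exit: add it all back */
	}
	return (parent);
}
```

In this model, fork_exit_cycles(1, 30) returns 2^30 and one more cycle gives
2^31, which no longer fits in a signed 32-bit integer.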

I've used more limited cpu merging since KSE made it clear that some sort of
cpu inheritance and merging is right:

%%%
Index: sched_4bsd.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/sched_4bsd.c,v
retrieving revision 1.41
diff -u -2 -r1.41 sched_4bsd.c
--- sched_4bsd.c	21 Jun 2004 23:47:47 -0000	1.41
+++ sched_4bsd.c	23 Jun 2004 06:28:01 -0000
@@ -550,9 +662,20 @@
 
 void
-sched_exit_ksegrp(struct ksegrp *kg, struct ksegrp *child)
+sched_exit_ksegrp(struct ksegrp *parent, struct ksegrp *child)
 {
 
 	mtx_assert(&sched_lock, MA_OWNED);
-	kg->kg_estcpu = ESTCPULIM(kg->kg_estcpu + child->kg_estcpu);
+	/*
+	 * XXX adding all of the child's cpu to the parent's like we used to
+	 * do would be wrong, since we duplicate the parent's cpu at fork
+	 * time so adding it all back would give exponential growth.  In
+	 * practice, the growth would have been limited by ESTCPULIM, but that
+	 * would be wrong too since it is very nonlinear.  Splitting the cpu
+	 * at fork time would be better, but adding it all back here would
+	 * still give nonlinearities since multiple processes tend to
+	 * accumulate more cpu than single ones.
+	 */
+	if (parent->kg_estcpu < child->kg_estcpu)
+		parent->kg_estcpu = child->kg_estcpu;
 }
 
%%%

My 4BSD scheduler needs to limit growth of fake cpu somewhere because it
lets non-fake cpu grow without bound (except for natural bounds given by
actual cpu use and cpu decay).  This is to fix the breakdown of the decay
algorithm caused by clamping growth with ESTCPULIM().

> exactly..
>
> Actually this doesn't CHANGE anything because "p->p_pid != 1"
> was ALWAYS TRUE.

The problem is apparently unimportant, because it was only noticed by code
inspection.  ESTCPULIM() limits it in the same way as for shells, and init
doesn't run much so its priority soon decays.  It obviously isn't important
for init to have a higher priority than most processes, else it would be
negatively niced.

Bruce