Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 10 May 2004 17:17:23 +1000 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Julian Elischer <julian@elischer.org>
Cc:        FreeBSD current users <current@freebsd.org>
Subject:   Re: exit1()/scheduler question.. possible typo?
Message-ID:  <20040510162831.X1105@gamplex.bde.org>
In-Reply-To: <Pine.BSF.4.21.0405091556030.24403-100000@InterJet.elischer.org>
References:  <Pine.BSF.4.21.0405091556030.24403-100000@InterJet.elischer.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, 9 May 2004, Julian Elischer wrote:

> in exit1 there is an assertion to test that the exiting process
> is not init. (proc 1 by tradition..)
>
> yes later, nearly at the end we see:
>
>         /*
>          * Allow the scheduler to adjust the priority of the
>          * parent when a kseg is exiting.
>          */
>         if (p->p_pid != 1)
>                 sched_exit(p->p_pptr, td);
>
>
> firstly, the comment is wrong but, the question comes..
> "if init can not get here then why have the test?"

This code rotted from differently broken code in wait1().  In RELENG_4,
the corresponding code is:

% 		if (p->p_stat == SZOMB) {
% 			/* charge childs scheduling cpu usage to parent */
% 			if (curproc->p_pid != 1) {
% 				curproc->p_estcpu =
% 				    ESTCPULIM(curproc->p_estcpu + p->p_estcpu);
% 			}

Here there are 2 processes involved (the parent (curproc) doing the
wait, and the child (p) being reaped), and the check for the parent
is correct.  It is needed to avoid clobbering init's estcpu by adding
the child's estcpu to it.

[Bugs in this code begin with English usage errors in the comment and
end with algorithmic brokenness.  Adding the child's estcpu gives an
estcpu and thus a priority that wants to be exponential in the number
of children, but the clamps on estcpu and the priority prevent complete
brokenness.  See revs.1.4 and 1.6 of kern_exit.c and associated changes
in kern_fork.c for the initial reimplementation of this bug (it was
first implemented in FreeBSD-1, and caused mysterious shell hangs
esprecially before rev.1.6).  This is now handled better by SCHED_ULE
in -current and by SCHED_4BSD in my version of -current.]

Anyway, the code has moved from wait1() to exit1() and sched_exit().
The SCHED_4BSD version of sched_exit() + sched_exit_ksegrp() is still
similar.  It even preseves most of the English usage errors in the
comment.  The move to exit1() is dubious.  A special case for the init
parent still seems like a good idea, but the parent is not known known
until wait() and often changes to init after exit().  However, this
part of the bug is harmless provided p->p_pptr is locked down (which
it doesn't seem to be).  We can just accumulate estcpu (or something
different for SCHED_ULE) in the current parent, and then throw it
away (or otherwise special-case it) when the parent (or another
ancestor) is reaped by init.

> I get the impression that possibly this should be p->p_pptr->p_ppid
                                                               p_pid
> but I don't know enough about ULE to know if that makes sense.
>
> (maybe it should be (p->p_pptr != initproc)

My version already used initproc here and elsewhere, but was missing the
fix for the LHS.

The call to sched_exit() was moved from wait1() to exit1() without special
mention in:

% RCS file: /home/ncvs/src/sys/kern/kern_exit.c,v
% Working file: kern_exit.c
% head: 1.229
% ...
% ----------------------------
% revision 1.209
% date: 2003/04/11 03:39:07;  author: jeff;  state: Exp;  lines: +7 -11
%  - Adjust sched hooks for fork and exec to take processes as arguments instead
                                     exit (?)
%    of ksegs since they primarily operation on processes.
%  - KSEs take ticks so pass the kse through sched_clock().
%  - Add a sched_class() routine that adjusts a ksegrp pri class.
%  - Define a sched_fork_{kse,thread,ksegrp} and sched_exit_{kse,thread,ksegrp}
%    that will be used to tell the scheduler about new instances of these
%    structures within the same process.  These will be used by THR and KSE.
%  - Change sched_4bsd to reflect this API update.
% ----------------------------

The move seems to be just a cleanup/optimization/bug.  It makes the
sched* call match the caller's name, avoids some locking, and adds the
bug.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20040510162831.X1105>