Date:      Sun, 31 Oct 2004 23:15:43 +1100 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        "J. Porter Clark" <jpc@drum.msfc.nasa.gov>
Cc:        freebsd-i386@FreeBSD.org
Subject:   Re: i386/73328: top shows NICE as -111 on processes started by idprio
Message-ID:  <20041031223051.R15841@delplex.bde.org>
In-Reply-To: <200410302307.i9UN7Cg0045288@www.freebsd.org>
References:  <200410302307.i9UN7Cg0045288@www.freebsd.org>

On Sat, 30 Oct 2004, J. Porter Clark wrote:

> >Description:
>       The "top" program shows a NICE value of -111 for programs started by
> idprio 31.  On 4.X, the "correct" value of 52 is shown.
> >How-To-Repeat:
>       As root, run "idprio 31 sleep 500 &".
> Then run "top -Uroot".  The sleep process is listed as having a NICE value
> of -111.  Other values besides 31 produce the same result.
> >Fix:
>       My guess is that it's probably in /usr/src/usr.bin/top/machine.c about line 747 or so, but I haven't had time to dig into it.

I use this fix.  It may be out of date, and the comments about the "base"
priority are too verbose and not quite right.

%%%
Index: machine.c
===================================================================
RCS file: /home/ncvs/src/usr.bin/top/machine.c,v
retrieving revision 1.51
diff -u -2 -r1.51 machine.c
--- machine.c	6 Jun 2004 19:59:06 -0000	1.51
+++ machine.c	7 Jun 2004 04:37:20 -0000
@@ -573,18 +573,67 @@
 	    smpmode ? smp_Proc_format : up_Proc_format,
 	    pp->ki_pid,
-	    namelength, namelength,
-	    (*get_userid)(pp->ki_ruid),
-	    pp->ki_pri.pri_level - PZERO,
-
-	    /*
-	     * normal time      -> nice value -20 - +20
-	     * real time 0 - 31 -> nice value -52 - -21
-	     * idle time 0 - 31 -> nice value +21 - +52
+	    namelength, namelength, (*get_userid)(pp->ki_ruid),
+  	    pp->ki_pri.pri_level - PZERO,
+	    /*-
+	     * Mapping from various disorganized priority schemes to ordered
+	     * pseudo-nice values:
+	     *
+	     * interrupt thread        base pri  0 - 63   -> nice -180 - -117
+	     * top half kernel thread  base pri  64 - 127 -> nice -116 -  -53
+	     * realtime user threads   rtprio    0 - 31   -> nice  -52 -  -21
+	     * normal user threads     nice     -20 - +20 -> nice  -20 -  +20
+	     * idle user threads       idprio    0 - 31   -> nice  +21 -  +52
+	     *
+	     * The number of interest is really the "base" priority of the
+	     * process, not the niceness of the process directly.  The base
+	     * priority should be what is td->td_base_pri in the kernel,
+	     * which is ki_pri.pri_native here.  In practice, that can't
+	     * be used directly and the workarounds are complicated because
+	     * of the following bugs:
+	     *     o td->td_base_pri is changed by priority propagation and
+	     *       not even restored.  Thus it cannot be used to determine
+	     *       the priority class.  The other priorities in ki_pri can
+	     *       be used for this, but they are set inconsistently too so
+	     *       there is no one place that determines the correct base
+	     *       priority.
+	     *     o td->td_base_pri is not set to a useful value for normal
+	     *       user threads.  It is initialized to 0 and only changed
+	     *       by priority propagation.  Workaround: use the actual
+	     *       nice value for the "base priority" of normal user
+	     *       threads.
+	     *     o kg->kg_user_pri (pri_user here) is not set to a useful
+	     *       value for kernel threads.  It is initialized to PUSER
+	     *       and never changed.  Something like it should be used
+	     *       for all classes of threads to hold the previous priority
+	     *       during priority propagation.  Then there might not need
+	     *       to be a special variable for the user -> kernel
+	     *       transitions (which are a type of priority propagation).
+	     *       I think a stack of such variables is needed in general
+	     *       though -- kg->kg_user_pri is special because it is at
+	     *       the top.
+	     *
+	     * We scale the base priority so that it agrees with the
+	     * historical nice value for normal user threads, although this
+	     * gives negative numbers for higher priority threads.
+	     *
+	     * PRI_BASE() strips the fifo scheduling bit from the priority
+	     * class.  This is not relevant for the conversion to niceness,
+	     * but it should be shown somewhere other than as a raw number in
+	     * an abnormal ps format.  We don't use PRI_IS_REALTIME()
+	     * because there is no corresponding classification macro for
+	     * non-realtime priority classes and the details are too
+	     * messy to be hidden in macros.
+	     *
+	     * KNF indent -ci4 is intentionally violated here.
 	     */
-	    (pp->ki_pri.pri_class ==  PRI_TIMESHARE ?
-	    	pp->ki_nice - NZERO :
-	    	(PRI_IS_REALTIME(pp->ki_pri.pri_class) ?
-		    (PRIO_MIN - 1 - (PRI_MAX_REALTIME - pp->ki_pri.pri_level)) :
-		    (PRIO_MAX + 1 + pp->ki_pri.pri_level - PRI_MIN_IDLE))),
+	    PRI_BASE(pp->ki_pri.pri_class) == PRI_ITHD ?
+		PRIO_MIN + (pp->ki_pri.pri_native - PRI_MIN_TIMESHARE) :
+	    PRI_BASE(pp->ki_pri.pri_class) == PRI_REALTIME ?
+		PRIO_MIN + (pp->ki_pri.pri_user - PRI_MIN_TIMESHARE) :
+	    PRI_BASE(pp->ki_pri.pri_class) == PRI_TIMESHARE ?
+		pp->ki_nice - NZERO :
+	    PRI_BASE(pp->ki_pri.pri_class) == PRI_IDLE ?
+		PRIO_MAX + 1 + (pp->ki_pri.pri_user - PRI_MIN_IDLE) :
+	    666,
 	    format_k2(PROCSIZE(pp)),
 	    format_k2(pagetok(pp->ki_rssize)),
%%%
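
As a reading aid, the conditional expression in the patch can be sketched
as a standalone C function.  This is not part of the patch: the SK_
constants are assumed values modelled on FreeBSD 5.x <sys/priority.h> and
<sys/resource.h>, pseudo_nice() is a hypothetical name, and the nice
argument is taken to be already unbiased (ki_nice - NZERO).

```c
/*
 * Assumed constants (modelled on FreeBSD 5.x headers).  The SK_ prefix
 * marks them as local to this sketch, not as the real kernel macros.
 */
enum {
	SK_PRI_ITHD = 1,		/* interrupt thread class */
	SK_PRI_REALTIME = 2,		/* rtprio class */
	SK_PRI_TIMESHARE = 3,		/* normal user class */
	SK_PRI_IDLE = 4,		/* idprio class */
	SK_PRI_FIFO_BIT = 8,		/* fifo scheduling flag in the class */
	SK_PRI_MIN_TIMESHARE = 160,
	SK_PRI_MIN_IDLE = 224,
	SK_PRIO_MIN = -20,
	SK_PRIO_MAX = 20
};

/* Strip the fifo bit so that only the base class is compared. */
#define	SK_PRI_BASE(c)	((c) & ~SK_PRI_FIFO_BIT)

/*
 * Map (class, native priority, user priority, nice) to the ordered
 * pseudo-nice value described in the patch comment.  Unknown classes
 * yield the same 666 sentinel as the patch.
 */
int
pseudo_nice(int pri_class, int pri_native, int pri_user, int nice)
{
	switch (SK_PRI_BASE(pri_class)) {
	case SK_PRI_ITHD:
		return (SK_PRIO_MIN + (pri_native - SK_PRI_MIN_TIMESHARE));
	case SK_PRI_REALTIME:
		return (SK_PRIO_MIN + (pri_user - SK_PRI_MIN_TIMESHARE));
	case SK_PRI_TIMESHARE:
		return (nice);
	case SK_PRI_IDLE:
		return (SK_PRIO_MAX + 1 + (pri_user - SK_PRI_MIN_IDLE));
	default:
		return (666);
	}
}
```

With these assumed constants, pseudo_nice(SK_PRI_IDLE, 0, 224 + 31, 0)
yields 52, the value that 4.X shows for idprio 31.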

This area is broken in ps too.  The most obvious bugs are:

- My ntpd process (which has realtime priority 0 and is correctly
  displayed by top as having "nice" -52) is displayed by `ps -o rtprio'
  as having priority "real:12".  The bogus 12 is just ntpd's current
  priority less PZERO.  This is the same bug as one of those fixed
  above: pri_level gives the current priority, so it is the wrong
  value to subtract from when the process is running at an elevated
  priority in kernel mode.  top and ps seem to get this wrong in
  RELENG_4 too.

- ps.1 says that `-o rtprio' causes a display of "101" for non-rtprio
  processes, but the actual display is "normal" for normal ones and
  a "%u.%u" format for unknown ones.  The man page became inconsistent
  with the code about "101" back in 1998 in rev.1.26 of ps/print.c.
  I think unknown cases occur for at least POSIX scheduling classes.
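
A minimal sketch of why subtracting from pri_level goes wrong for idprio
processes.  The constants are assumed 5.x values, the function names are
hypothetical, and the current priority of 92 is an assumed kernel sleep
priority, chosen because it reproduces the reported -111.

```c
/* Assumed constants; the SK_ prefix marks them as local to this sketch. */
enum {
	SK_PRIO_MAX = 20,
	SK_PRI_MIN_IDLE = 224
};

/*
 * Old idle-class formula: uses the current priority (pri_level), which
 * is elevated while the process sleeps in the kernel.
 */
int
old_idle_nice(int pri_level)
{
	return (SK_PRIO_MAX + 1 + pri_level - SK_PRI_MIN_IDLE);
}

/*
 * Patched idle-class formula: uses the user priority (pri_user), which
 * stays inside the idle range regardless of what the kernel is doing.
 */
int
new_idle_nice(int pri_user)
{
	return (SK_PRIO_MAX + 1 + (pri_user - SK_PRI_MIN_IDLE));
}
```

Under these assumptions, old_idle_nice(92) is 21 + 92 - 224 = -111 for any
idprio value, while new_idle_nice(224 + 31) is 52.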

Bruce


