Date:      Wed, 18 Jun 2025 02:13:33 GMT
From:      Olivier Certner <olce@FreeBSD.org>
To:        src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
Subject:   git: dee257c28d93 - main - sched: Internal priority ranges: Reduce kernel, increase timeshare
Message-ID:  <202506180213.55I2DXA2024716@gitrepo.freebsd.org>

The branch main has been updated by olce:

URL: https://cgit.FreeBSD.org/src/commit/?id=dee257c28d936bb7b459d3eda207531b3cf1bd4e

commit dee257c28d936bb7b459d3eda207531b3cf1bd4e
Author:     Olivier Certner <olce@FreeBSD.org>
AuthorDate: 2024-05-22 20:32:18 +0000
Commit:     Olivier Certner <olce@FreeBSD.org>
CommitDate: 2025-06-18 02:09:37 +0000

    sched: Internal priority ranges: Reduce kernel, increase timeshare
    
    Now that a difference of 1 in priority level is significant, we can
    shrink the priority range reserved for kernel threads.
    
    Only four distinct levels are necessary for the bottom half (3 base
    levels and arguably an additional one for demoted interrupt threads that
    run for full time slices so that they finally don't compete with other
    ones).  To leave room for other possible uses, we settle on 8 levels.
    
    Given the symbolic constants for the top half, 10 levels are currently
    necessary.  We settle on 16 levels.
    
    This allows enlarging the timesharing range, which covers both ULE's
    interactive and batch ranges, to 168 distinct levels, up from fewer
    than 64 for ULE (before the changes that made it use a single runqueue
    with 256 distinct levels per runqueue) and 34 for 4BSD.
    
    While here, note that the realtime range is required to have at least 32
    priority levels since:
    - POSIX mandates at least 32 distinct levels for the SCHED_RR/SCHED_FIFO
      scheduling policies.
    - We directly map contiguous priority levels ('sched_priority') of these
      scheduling policies to distinct, contiguous internal priority levels.
    Conversely, having at least 32 priority levels is enough to guarantee
    compliance with the POSIX requirement mentioned above because different
    internal priority levels are treated differently since commit "runq:
    Switch to 256 levels".
    
    While here, list explicit change restrictions for the realtime and idle
    range.
    
    MFC after:      1 month
    Event:          Kitchener-Waterloo Hackathon 202506
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D45391
---
 sys/sys/priority.h | 52 +++++++++++++++++++++++++++++-----------------------
 1 file changed, 29 insertions(+), 23 deletions(-)
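
The range sizes quoted in the commit message (8, 32, 16 and 168 levels) can
be cross-checked with a small standalone sketch. It is not part of the
commit. The boundary values are copied from the updated comment in
priority.h; PRI_RANGE_LEN and the LEN_* names are hypothetical helpers
introduced only for this illustration, and PRI_MAX is assumed to be 255
from "Priorities range from 0 to 255".

```c
/*
 * Standalone sketch, not taken from sys/sys/priority.h.
 * Boundaries follow the comment block updated by this commit.
 */
#define PRI_MIN_ITHD		0	/* Interrupt threads: 0 - 7 */
#define PRI_MIN_REALTIME	8	/* Realtime user threads: 8 - 39 */
#define PRI_MIN_KERN		40	/* Top half kernel threads: 40 - 55 */
#define PRI_MIN_TIMESHARE	56	/* Time sharing user threads: 56 - 223 */
#define PRI_MIN_IDLE		224	/* Idle user threads: 224 - 255 */
#define PRI_MAX			255	/* Assumed from the 0 - 255 comment. */

/* Hypothetical helper: a range ends right before the next range starts. */
#define PRI_RANGE_LEN(min, next_min)	((next_min) - (min))

#define LEN_ITHD	PRI_RANGE_LEN(PRI_MIN_ITHD, PRI_MIN_REALTIME)
#define LEN_REALTIME	PRI_RANGE_LEN(PRI_MIN_REALTIME, PRI_MIN_KERN)
#define LEN_KERN	PRI_RANGE_LEN(PRI_MIN_KERN, PRI_MIN_TIMESHARE)
#define LEN_TIMESHARE	PRI_RANGE_LEN(PRI_MIN_TIMESHARE, PRI_MIN_IDLE)
#define LEN_IDLE	PRI_RANGE_LEN(PRI_MIN_IDLE, PRI_MAX + 1)

/* Compile-time checks (C11) mirroring the numbers in the commit message. */
_Static_assert(LEN_ITHD == 8, "bottom half settles on 8 levels");
_Static_assert(LEN_REALTIME >= 32, "POSIX needs at least 32 realtime levels");
_Static_assert(LEN_KERN == 16, "top half settles on 16 levels");
_Static_assert(LEN_TIMESHARE == 168, "timesharing range has 168 levels");
_Static_assert(LEN_ITHD + LEN_REALTIME + LEN_KERN + LEN_TIMESHARE +
    LEN_IDLE == 256, "ranges must tile 0 - 255 exactly");

/* The highest offsets used in the diff still fit in their ranges. */
_Static_assert(2 < LEN_ITHD, "PI_SOFT (PRI_MIN_ITHD + 2) fits");
_Static_assert(9 < LEN_KERN, "PPAUSE (PRI_MIN_KERN + 9) fits");
```

Compiling this file with a C11 compiler is the whole test: any boundary
change that breaks one of the stated sizes fails at build time.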

diff --git a/sys/sys/priority.h b/sys/sys/priority.h
index 4428a85c0987..93dd5aa90d95 100644
--- a/sys/sys/priority.h
+++ b/sys/sys/priority.h
@@ -64,17 +64,23 @@
  */
 
 /*
- * Priorities range from 0 to 255, but differences of less then 4 (RQ_PPQ)
- * are insignificant.  Ranges are as follows:
+ * Priorities range from 0 to 255.  Ranges are as follows:
  *
- * Interrupt threads:		0 - 15
- * Realtime user threads:	16 - 47
- * Top half kernel threads:	48 - 87
- * Time sharing user threads:	88 - 223
+ * Interrupt threads:		0 - 7
+ * Realtime user threads:	8 - 39
+ * Top half kernel threads:	40 - 55
+ * Time sharing user threads:	56 - 223
  * Idle user threads:		224 - 255
  *
- * XXX If/When the specific interrupt thread and top half thread ranges
- * disappear, a larger range can be used for user processes.
+ * Priority levels of rtprio(2)'s RTP_PRIO_FIFO and RTP_PRIO_REALTIME and
+ * POSIX's SCHED_FIFO and SCHED_RR are directly mapped to the internal realtime
+ * range mentioned above by a simple translation.  This range's length
+ * consequently cannot be changed without impacts on the scheduling priority
+ * code, and in any case must never be smaller than 32 for POSIX compliance and
+ * rtprio(2) backwards compatibility.  Similarly, priority levels of rtprio(2)'s
+ * RTP_PRIO_IDLE are directly mapped to the internal idle range above (and,
+ * soon, those of the to-be-introduced SCHED_IDLE policy as well), so changing
+ * that range is subject to the same caveats and restrictions.
  */
 
 #define	PRI_MIN			(0)		/* Highest priority. */
@@ -88,34 +94,34 @@
  * decay to lower priorities if they run for full time slices.
  */
 #define	PI_REALTIME		(PRI_MIN_ITHD + 0)
-#define	PI_INTR			(PRI_MIN_ITHD + 4)
+#define	PI_INTR			(PRI_MIN_ITHD + 1)
 #define	PI_AV			PI_INTR
 #define	PI_NET			PI_INTR
 #define	PI_DISK			PI_INTR
 #define	PI_TTY			PI_INTR
 #define	PI_DULL			PI_INTR
-#define	PI_SOFT			(PRI_MIN_ITHD + 8)
+#define	PI_SOFT			(PRI_MIN_ITHD + 2)
 #define	PI_SOFTCLOCK		PI_SOFT
 #define	PI_SWI(x)		PI_SOFT
 
-#define	PRI_MIN_REALTIME	(16)
+#define	PRI_MIN_REALTIME	(8)
 #define	PRI_MAX_REALTIME	(PRI_MIN_KERN - 1)
 
-#define	PRI_MIN_KERN		(48)
+#define	PRI_MIN_KERN		(40)
 #define	PRI_MAX_KERN		(PRI_MIN_TIMESHARE - 1)
 
 #define	PSWP			(PRI_MIN_KERN + 0)
-#define	PVM			(PRI_MIN_KERN + 4)
-#define	PINOD			(PRI_MIN_KERN + 8)
-#define	PRIBIO			(PRI_MIN_KERN + 12)
-#define	PVFS			(PRI_MIN_KERN + 16)
-#define	PZERO			(PRI_MIN_KERN + 20)
-#define	PSOCK			(PRI_MIN_KERN + 24)
-#define	PWAIT			(PRI_MIN_KERN + 28)
-#define	PLOCK			(PRI_MIN_KERN + 32)
-#define	PPAUSE			(PRI_MIN_KERN + 36)
-
-#define	PRI_MIN_TIMESHARE	(88)
+#define	PVM			(PRI_MIN_KERN + 1)
+#define	PINOD			(PRI_MIN_KERN + 2)
+#define	PRIBIO			(PRI_MIN_KERN + 3)
+#define	PVFS			(PRI_MIN_KERN + 4)
+#define	PZERO			(PRI_MIN_KERN + 5)
+#define	PSOCK			(PRI_MIN_KERN + 6)
+#define	PWAIT			(PRI_MIN_KERN + 7)
+#define	PLOCK			(PRI_MIN_KERN + 8)
+#define	PPAUSE			(PRI_MIN_KERN + 9)
+
+#define	PRI_MIN_TIMESHARE	(56)
 #define	PRI_MAX_TIMESHARE	(PRI_MIN_IDLE - 1)
 
 #define	PUSER			(PRI_MIN_TIMESHARE)
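
The "simple translation" between rtprio(2)/POSIX realtime priorities and
the internal realtime range, described in the new comment above, might look
roughly like the following standalone sketch. It is not code from this
commit. RTPRIO_TO_INTERNAL and POSIX_TO_INTERNAL are hypothetical names, the
0 - 31 span of rtprio(2) priorities is assumed from <sys/rtprio.h>, and the
reversal applied to POSIX 'sched_priority' (where a larger value means more
urgent) is an assumption about the direction of the mapping.

```c
/* Standalone sketch; only the PRI_* values come from the diff above. */
#define PRI_MIN_REALTIME	8
#define PRI_MIN_KERN		40
#define PRI_MAX_REALTIME	(PRI_MIN_KERN - 1)

/* Assumption: rtprio(2) realtime priorities span 0 - 31, 0 most urgent. */
#define RTP_PRIO_MIN		0
#define RTP_PRIO_MAX		31

/* Hypothetical: rtprio(2) level to internal level, a plain offset. */
#define RTPRIO_TO_INTERNAL(p)	(PRI_MIN_REALTIME + (p))

/*
 * Hypothetical: POSIX sched_priority to internal level.  Assumes POSIX
 * levels are numerically reversed with respect to rtprio(2) levels.
 */
#define POSIX_TO_INTERNAL(sp)	RTPRIO_TO_INTERNAL(RTP_PRIO_MAX - (sp))

/* Every rtprio(2) level must land inside the internal realtime range. */
_Static_assert(RTPRIO_TO_INTERNAL(RTP_PRIO_MIN) == PRI_MIN_REALTIME,
    "most urgent realtime level maps to the start of the range");
_Static_assert(RTPRIO_TO_INTERNAL(RTP_PRIO_MAX) == PRI_MAX_REALTIME,
    "least urgent realtime level maps to the end of the range");
```

Under these assumptions the 32 external levels exactly fill the internal
range 8 - 39, which is why shrinking that range below 32 would break the
one-to-one mapping the comment relies on.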


