Date:      Wed, 14 Jan 2026 16:10:15 +0000
From:      Minsoo Choo <minsoochoo0122@proton.me>
To:        freebsd-hackers <freebsd-hackers@freebsd.org>
Subject:   HMP scheduling on FreeBSD
Message-ID:  <0Ng09S3rEB0BvT9vzHqVKU7rWxoad96kjEc7U2LCwDFJKmmswXujip7qbRlo_BIhNKcI7d-2CUHdp9Dxr3-7hhafpD6uagJSFUCjtC9qRr4=@proton.me>

Greetings,

For the last few days, I've been working on scheduler optimization for heterogeneous cores ("HMP scheduling" from now on) on FreeBSD. After days of reading specs, planning the structure, and writing code, I arrived at a model that can be used to implement HMP scheduling. I opened a first revision on Phabricator (D54674) a few days ago, but decided to also introduce the model on the mailing list and hear others' opinions.

First of all, HMP-related code (scheduling, providers, the cpucap framework, etc.) is only built and enabled when "options HMP" is specified in the kernel configuration. It is experimental, so it won't be enabled by default until the majority of developers believe that HMP scheduling has stabilized.

The first component of HMP scheduling is "cpucap". One issue with HMP scheduling is that identifying the capacity and scores of a processor (the job of providers) is machine-dependent, while the scheduler code should be machine-independent, so cpucap acts as an interface between the scheduler and providers. CPU capacity and scores are stored in the pcpu structure, while the machine's cpucap status (e.g. initialized, has dynamic scores, etc.) is stored in a global cpucap structure of type "cpucap_t". cpucap also includes functions for the scheduler and providers, such as accessors, setters, finding the "best" CPU, etc. The review (D54674) adds these facilities under the HMP option.
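
To make the split concrete, here is a minimal userland sketch of that layout. It is not the code from D54674: only pc_cap_capacity, pc_cap_scores and cpucap_t are names taken from this mail. The array standing in for pcpu, the status field names, the function names and the CPU count are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPUCAP_MAX	1024u	/* top of the normalized 0-1024 range */
#define NCPU_SKETCH	4	/* hypothetical CPU count for this sketch */

/* Stand-in for the two fields the mail places in struct pcpu. */
struct pcpu_sketch {
	uint32_t pc_cap_capacity;	/* static, set at boot */
	uint32_t pc_cap_scores;		/* dynamic, fed by a score provider */
};

/* Stand-in for the global status; the field names are hypothetical. */
typedef struct {
	bool initialized;
	bool has_dynamic_scores;
} cpucap_t;

static struct pcpu_sketch pcpu_tab[NCPU_SKETCH];
static cpucap_t cpucap_state;

/* Machine-independent init: every CPU starts at the default capacity. */
static void
cpucap_init(void)
{
	for (int i = 0; i < NCPU_SKETCH; i++) {
		pcpu_tab[i].pc_cap_capacity = CPUCAP_MAX;
		pcpu_tab[i].pc_cap_scores = 0;
	}
	cpucap_state.has_dynamic_scores = false;
	cpucap_state.initialized = true;
}

/* Provider-side setter: clamps the value to the normalized range. */
static void
cpucap_set_capacity(int cpu, uint32_t cap)
{
	assert(cpu >= 0 && cpu < NCPU_SKETCH);
	pcpu_tab[cpu].pc_cap_capacity = cap > CPUCAP_MAX ? CPUCAP_MAX : cap;
}

/* Scheduler-side accessor: the scheduler never talks to a provider. */
static uint32_t
cpucap_get_capacity(int cpu)
{
	assert(cpu >= 0 && cpu < NCPU_SKETCH);
	return (pcpu_tab[cpu].pc_cap_capacity);
}
```

The point of the sketch is only the direction of the calls: machine-dependent providers use the setters, the machine-independent scheduler uses the accessors, and neither side needs to know about the other.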

What are capacity and scores? Capacity describes a CPU's heterogeneity (more information in sys/contrib/device-tree/Bindings/cpu/cpu-capacity.txt). It is initialized at boot time (e.g. from the device tree) and stored in pc_cap_capacity in pcpu after being normalized to 0-1024. If it is not provided, the default value of 1024 is used. Capacity is static and specific to the core type: if a performance core has a capacity of 1024 and an efficiency core has a capacity of 600, then all performance cores have 1024 and all efficiency cores have 600, and this stays the same whether or not they are throttled. Because of this, capacity is a hint for load balancing. Dividing a core's capacity by the total capacity gives the fraction of tasks to assign to that core's run queue, so each core gets a share proportional to its capacity. This cpucap information (enablement status, whether dynamic scores are available, and individual capacities and scores) can be queried using sysctl.
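
The arithmetic can be sketched as follows. This is an illustration under assumptions, not code from the review: the function names are hypothetical, and scaling a raw value against the largest raw value in the system is just one plausible way to reach the 0-1024 range.

```c
#include <assert.h>
#include <stdint.h>

#define CPUCAP_MAX	1024u

/*
 * Scale a raw per-CPU value so that the largest raw value in the
 * system maps to 1024.  With no usable maximum, fall back to the
 * default of 1024.
 */
static uint32_t
cap_normalize(uint32_t raw, uint32_t raw_max)
{
	if (raw_max == 0)
		return (CPUCAP_MAX);
	if (raw > raw_max)
		raw = raw_max;
	return ((uint32_t)(((uint64_t)raw * CPUCAP_MAX) / raw_max));
}

/*
 * Share of ntasks for a CPU with capacity cap when the capacities of
 * all CPUs sum to total_cap.  Integer division truncates, so a caller
 * has to place the few leftover tasks separately.
 */
static uint32_t
cap_task_share(uint32_t ntasks, uint32_t cap, uint32_t total_cap)
{
	assert(total_cap > 0);
	return ((uint32_t)(((uint64_t)ntasks * cap) / total_cap));
}
```

With two performance cores at 1024 and two efficiency cores at 600, the total capacity is 3248. For 100 runnable tasks, cap_task_share() gives 31 to each performance core and 18 to each efficiency core, leaving 2 tasks to be placed by whatever tie-breaking rule the scheduler uses.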

Scores, on the other hand, reflect the real-time status of a processor and are also normalized to 0-1024. Providers like Intel Thread Director report the current status of each core and feed it to pc_cap_scores in pcpu. Scores are used for thread selection. For example, if a performance core is being throttled, its score could go down to 1000. In that case, the scheduler will prefer the core that has the highest score. Scores are completely optional because many processors do not provide them, and when score information is absent, cpucap falls back to capacity. Since scores are dynamic, they are read and written using atomic operations.
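
A rough userland sketch of the selection logic, again under assumptions: C11 <stdatomic.h> stands in for the kernel's atomic(9) primitives, every name below is hypothetical, and the "publish" step is one possible way to keep readers from seeing half-populated scores rather than something the patch is known to do.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NCPU_SKETCH	4	/* hypothetical CPU count */

struct cpu_sketch {
	uint32_t capacity;		/* static, 0-1024 */
	_Atomic uint32_t score;		/* dynamic, 0-1024 */
};

static struct cpu_sketch cpus[NCPU_SKETCH];
static atomic_bool has_dynamic_scores;

/* Score provider side: update one CPU's score. */
static void
score_set(int cpu, uint32_t score)
{
	assert(cpu >= 0 && cpu < NCPU_SKETCH);
	atomic_store_explicit(&cpus[cpu].score, score, memory_order_relaxed);
}

/* Called once the provider has written a score for every CPU. */
static void
scores_publish(void)
{
	atomic_store_explicit(&has_dynamic_scores, true, memory_order_release);
}

/* Value used for ranking: the score if available, else the capacity. */
static uint32_t
cpu_rank(int cpu)
{
	assert(cpu >= 0 && cpu < NCPU_SKETCH);
	if (atomic_load_explicit(&has_dynamic_scores, memory_order_acquire))
		return (atomic_load_explicit(&cpus[cpu].score,
		    memory_order_relaxed));
	return (cpus[cpu].capacity);
}

/* Highest-ranked CPU; ties go to the lowest CPU index. */
static int
cpu_best(void)
{
	int best = 0;

	for (int i = 1; i < NCPU_SKETCH; i++)
		if (cpu_rank(i) > cpu_rank(best))
			best = i;
	return (best);
}
```

Using the example from the paragraph above: with capacities 1024, 1024, 600, 600 and no scores, cpu_best() returns CPU 0. Once a provider reports scores of 1000, 1024, 600, 600 (CPU 0 throttled) and publishes them, cpu_best() returns CPU 1.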

Providers feed capacity or score information to cpucap. Capacity providers such as the device tree and ACPI CPPC run at boot time and feed capacity information. If there is no capacity provider, (again) the default value is used. Score providers such as Intel ITD are implemented as device drivers. We had an ITD implementation patch series (D44453-D44459) called coredirector back in May 2024, but it was neglected. With some modification, it can be the first cpucap provider. Providers cannot be loaded or unloaded at runtime. An exception to this model is Arm SCMI, which doesn't feed information to cpucap; instead, the scheduler should control and request the CPU's performance.
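
The boot-time side of this can be sketched as below. This is an assumption-heavy illustration of the lifecycle only (register before boot finishes, run once, default to 1024 otherwise); the callback type, the function names and the fake device-tree provider are all hypothetical and do not reflect the interface in D54674.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CPUCAP_DEFAULT	1024u
#define NCPU_SKETCH	4	/* hypothetical CPU count */

/* Hypothetical capacity provider: returns a normalized capacity. */
typedef uint32_t (*cap_provider_fn)(int cpu);

static cap_provider_fn cap_provider;	/* NULL when none is registered */
static bool boot_done;
static uint32_t capacity[NCPU_SKETCH];

/* Fails once boot has finished: no runtime load or unload. */
static bool
cap_provider_register(cap_provider_fn fn)
{
	if (boot_done || cap_provider != NULL)
		return (false);
	cap_provider = fn;
	return (true);
}

/* Runs once at boot: ask the provider, or use the default. */
static void
cap_boot_init(void)
{
	for (int i = 0; i < NCPU_SKETCH; i++)
		capacity[i] = cap_provider != NULL ?
		    cap_provider(i) : CPUCAP_DEFAULT;
	boot_done = true;
}

/* Stand-in for device-tree data: CPUs 0-1 are big, 2-3 are little. */
static uint32_t
fake_dt_provider(int cpu)
{
	return (cpu < 2 ? 1024u : 600u);
}
```

A score provider would differ in that it keeps running after boot and updates scores continuously, which is why the mail describes those as device drivers.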

Before integrating the scheduler and cpucap, I need to go through sched_ule.c from top to bottom. After that, I'll add new functions to or drop existing ones from the cpucap framework, then work on the integration.

If anyone is interested in this implementation, please reply to this thread and/or review my cpucap patch at D54674.

Minsoo

Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?0Ng09S3rEB0BvT9vzHqVKU7rWxoad96kjEc7U2LCwDFJKmmswXujip7qbRlo_BIhNKcI7d-2CUHdp9Dxr3-7hhafpD6uagJSFUCjtC9qRr4=>