From: vermaden
Date: Wed, 14 Jan 2026 23:22:30 +0100
Subject: Re: HMP scheduling on FreeBSD
To: Minsoo Choo, freebsd-hackers
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers

Hi,

one may also think about adding rc.conf(5) settings like:

    sched_fast="postgres nginx"
    sched_slow="cron at zfs-scrub"

to force the use of faster or slower cores for various tasks.

Regards,
vermaden




Subject: HMP scheduling on FreeBSD
Date: 2026-01-14 17:10
From: "Minsoo Choo" <minsoochoo0122@proton.me>
To: "freebsd-hackers" <freebsd-hackers@freebsd.org>



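Greetings,

For the last few days, I've been working on scheduler optimization for heterogeneous cores ("HMP scheduling" from now on) on FreeBSD. After days of reading specs, planning the structure, and writing code, I arrived at a model that can be used to implement HMP scheduling. I opened a first revision on Phabricator (D54674) a few days ago, but decided to introduce the model on the mailing list and hear others' opinions.
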
First of all, HMP-related code (scheduling, providers, the cpucap framework, etc.) is only built and enabled when "options HMP" is specified in the kernel configuration. It is not enabled by default because it is still experimental, and it will stay opt-in until the majority of developers believe that HMP scheduling has stabilized.

The first component of HMP scheduling is "cpucap". One issue with HMP scheduling is that identifying the capacity and scores of a processor (the providers' job) is machine-dependent, while the scheduler code should be machine-independent, so cpucap acts as an interface between the scheduler and the providers. CPU capacity and scores are stored in the pcpu structure, while the machine's cpucap status (e.g. initialized, has dynamic scores, etc.) is stored in a global cpucap structure of type "cpucap_t". cpucap also includes functions for the scheduler and the providers, such as accessors, setters, finding the "best" CPU, etc. The review (D54674) adds these facilities under the HMP option.
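
To make the split between per-CPU data and global status concrete, here is a minimal userland sketch. It is not the code in D54674: every sketch_* name, the member names of the status structure, and the fixed CPU count are assumptions made for illustration. Only pc_cap_capacity, the 0-1024 range and the default of 1024 come from the description in this mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_CAP_MAX 1024u /* top of the 0-1024 scale */
#define SKETCH_NCPU    4     /* hypothetical CPU count for the sketch */

/* Stand-in for the per-CPU (pcpu) data; only the capacity field is modelled. */
struct sketch_pcpu {
	uint32_t pc_cap_capacity;
};

/* Stand-in for the global status structure; member names are assumptions. */
typedef struct sketch_cpucap {
	bool initialized;
	bool has_dynamic_scores;
} sketch_cpucap_t;

static struct sketch_pcpu sketch_pcpus[SKETCH_NCPU];
static sketch_cpucap_t sketch_cpucap;

/* Give every CPU the default capacity and mark the framework as ready. */
void
sketch_cpucap_init(void)
{
	for (int cpu = 0; cpu < SKETCH_NCPU; cpu++)
		sketch_pcpus[cpu].pc_cap_capacity = SKETCH_CAP_MAX;
	sketch_cpucap.has_dynamic_scores = false;
	sketch_cpucap.initialized = true;
}

/* Setter for providers: values above the scale are clamped to 1024. */
void
sketch_cpucap_set_capacity(int cpu, uint32_t capacity)
{
	assert(cpu >= 0 && cpu < SKETCH_NCPU);
	sketch_pcpus[cpu].pc_cap_capacity =
	    capacity > SKETCH_CAP_MAX ? SKETCH_CAP_MAX : capacity;
}

/* Accessor for the scheduler. */
uint32_t
sketch_cpucap_get_capacity(int cpu)
{
	assert(sketch_cpucap.initialized);
	assert(cpu >= 0 && cpu < SKETCH_NCPU);
	return (sketch_pcpus[cpu].pc_cap_capacity);
}
```

The point of the sketch is only that the scheduler calls the accessor and never sees where the value came from; the real framework keeps the per-CPU fields in the kernel's pcpu structure rather than in a plain array.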

What are capacity and scores? Capacity describes a CPU's heterogeneity (more information in sys/contrib/device-tree/Bindings/cpu/cpu-capacity.txt). It is initialized at boot time (e.g. from the device tree) and stored in pc_cap_capacity in pcpu after being normalized to 0-1024. If it is not provided, the default value of 1024 is used. Capacity is static and specific to the core type: if a performance core has a capacity of 1024 and an efficiency core has a capacity of 600, then all performance cores have 1024 and all efficiency cores have 600, and this stays the same whether or not they are throttled. Because of this, capacity serves as a hint for load balancing. By dividing a core's capacity by the total capacity, we can assign a proportional fraction of tasks to that core's run queue. This cpucap information, including whether cpucap is enabled, whether dynamic scores are available, and each CPU's capacity and scores, can be queried using sysctl.
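
As a worked example of the arithmetic, the sketch below normalizes raw capacity values to the 0-1024 scale and computes one CPU's share of a number of tasks as tasks * capacity / total capacity. The function names are hypothetical, and scaling so that the largest raw value maps to 1024 is an assumption of this sketch, not necessarily what the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CAP_SCALE 1024u

/*
 * Scale raw capacities so that the largest one becomes 1024.
 * If every raw value is 0, fall back to the default of 1024.
 */
void
sketch_normalize(const uint32_t *raw, uint32_t *out, size_t ncpu)
{
	uint32_t max = 0;

	assert(raw != NULL && out != NULL && ncpu > 0);
	for (size_t i = 0; i < ncpu; i++)
		if (raw[i] > max)
			max = raw[i];
	for (size_t i = 0; i < ncpu; i++)
		out[i] = max == 0 ? SKETCH_CAP_SCALE :
		    (uint32_t)(((uint64_t)raw[i] * SKETCH_CAP_SCALE) / max);
}

/*
 * Number of tasks (rounded down) that one CPU would receive if ntasks
 * were split in proportion to capacity.
 */
uint32_t
sketch_task_share(const uint32_t *cap, size_t ncpu, size_t cpu,
    uint32_t ntasks)
{
	uint64_t total = 0;

	assert(cap != NULL && cpu < ncpu);
	for (size_t i = 0; i < ncpu; i++)
		total += cap[i];
	if (total == 0)
		return (0);
	return ((uint32_t)(((uint64_t)ntasks * cap[cpu]) / total));
}
```

With raw values {2000, 1000, 1000} the normalized capacities are {1024, 512, 512}, so out of 20 tasks the first CPU would get 10 and each of the others 5.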

Scores, on the other hand, reflect the real-time status of a processor and are also normalized to 0-1024. Providers like Intel Thread Director report the current status of each core and feed it to pc_cap_scores in pcpu. Scores are used for thread selection. For example, if a performance core is being throttled, its score could go down to 1000. In that case, the scheduler will prefer the core that has the highest score. Scores are completely optional because many processors do not provide them, and when score information is absent, cpucap falls back to capacity. Since scores are dynamic, they are retrieved and set using atomic operations.
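
The selection rule can be sketched in userland C11 as follows. This is only an illustration under assumptions: the kernel would use atomic(9) rather than <stdatomic.h>, all sketch_* names are hypothetical, a single score per CPU is assumed, and seeding the scores from capacity when a provider attaches is my own choice for the sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NCPU 4

/* Static capacities: two performance cores, two efficiency cores. */
static const uint32_t sketch_capacity[SKETCH_NCPU] = {1024, 1024, 600, 600};
/* Dynamic scores, written by a provider and read by the scheduler. */
static _Atomic uint32_t sketch_score[SKETCH_NCPU];
static atomic_bool sketch_has_scores;

/* Called once when a score provider attaches: seed scores from capacity. */
void
sketch_scores_enable(void)
{
	for (int cpu = 0; cpu < SKETCH_NCPU; cpu++)
		atomic_store_explicit(&sketch_score[cpu], sketch_capacity[cpu],
		    memory_order_relaxed);
	atomic_store_explicit(&sketch_has_scores, true, memory_order_release);
}

/* Provider side: publish a new score, clamped to the 0-1024 scale. */
void
sketch_score_set(int cpu, uint32_t score)
{
	assert(cpu >= 0 && cpu < SKETCH_NCPU);
	atomic_store_explicit(&sketch_score[cpu], score > 1024 ? 1024 : score,
	    memory_order_relaxed);
}

/* Score if a provider is present, otherwise the static capacity. */
static uint32_t
sketch_rating(int cpu)
{
	if (atomic_load_explicit(&sketch_has_scores, memory_order_acquire))
		return (atomic_load_explicit(&sketch_score[cpu],
		    memory_order_relaxed));
	return (sketch_capacity[cpu]);
}

/* Scheduler side: CPU with the highest rating; ties go to the lowest id. */
int
sketch_best_cpu(void)
{
	int best = 0;

	for (int cpu = 1; cpu < SKETCH_NCPU; cpu++)
		if (sketch_rating(cpu) > sketch_rating(best))
			best = cpu;
	return (best);
}
```

Without a provider the sketch picks CPU 0 on capacity alone; once scores are enabled and CPU 0 drops to 1000, CPU 1 (still at 1024) becomes the preferred core.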

Providers feed capacity or score information to cpucap. Capacity providers such as device trees and ACPI CPPC run at boot time and feed capacity information. If there is no capacity provider, the default value is used (as noted above). Score providers such as Intel ITD are implemented as device drivers. We had an ITD implementation patch series called coredirector (D44453-D44459) back in May 2024, but it was neglected. With some modification, it can be the first cpucap provider. Providers cannot be loaded or unloaded at runtime. An exception is Arm SCMI, which doesn't feed information to cpucap; instead, the scheduler should control and request the CPU's performance.
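
The boot-time flow for capacity providers could look roughly like the sketch below. The callback type, the function names and the example provider are all hypothetical; the only behaviour taken from the description above is that a provider supplies capacities at boot and that the default of 1024 is used when no provider (or no value for a CPU) is available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_DEFAULT_CAPACITY 1024u

/*
 * Hypothetical provider callback: returns 0 and fills *cap when it knows
 * the capacity of the given CPU, nonzero otherwise.
 */
typedef int (*sketch_cap_provider_t)(int cpu, uint32_t *cap);

/* Fill out[] at "boot": ask the provider, else use the default value. */
void
sketch_boot_capacity(sketch_cap_provider_t provider, uint32_t *out, int ncpu)
{
	assert(out != NULL && ncpu > 0);
	for (int cpu = 0; cpu < ncpu; cpu++) {
		uint32_t cap;

		if (provider != NULL && provider(cpu, &cap) == 0)
			out[cpu] = cap > 1024 ? 1024 : cap;
		else
			out[cpu] = SKETCH_DEFAULT_CAPACITY;
	}
}

/*
 * Example provider standing in for a device tree: it only describes
 * CPUs 0-3, two performance cores and two efficiency cores.
 */
int
sketch_dt_provider(int cpu, uint32_t *cap)
{
	if (cpu < 0 || cpu > 3)
		return (-1);
	*cap = cpu < 2 ? 1024 : 600;
	return (0);
}
```

Passing NULL as the provider models a machine without any capacity provider: every CPU simply ends up with 1024.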

Before integrating the scheduler and cpucap, I need to go through sched_ule.c from top to bottom. After that, I'll add new functions to the cpucap framework or drop existing ones, and then work on the integration.

If anyone is interested in this implementation, please reply to this thread and/or review my cpucap patch at D54674.

Minsoo