Date:      Thu, 15 Jan 2026 08:34:50 +0900
From:      Tomoaki AOKI <junchoon@dec.sakura.ne.jp>
To:        Olivier Certner <olce@freebsd.org>
Cc:        Minsoo Choo <minsoochoo0122@proton.me>, freebsd-hackers <freebsd-hackers@freebsd.org>
Subject:   Re: HMP scheduling on FreeBSD
Message-ID:  <20260115083450.f20c13f24d2ebbc68db9cd01@dec.sakura.ne.jp>
In-Reply-To: <1886427.OVFmXjEfDW@ravel>
References:  <0Ng09S3rEB0BvT9vzHqVKU7rWxoad96kjEc7U2LCwDFJKmmswXujip7qbRlo_BIhNKcI7d-2CUHdp9Dxr3-7hhafpD6uagJSFUCjtC9qRr4=@proton.me> <1886427.OVFmXjEfDW@ravel>

On Wed, 14 Jan 2026 23:14:52 +0100
Olivier Certner <olce@freebsd.org> wrote:

> Hi Minsoo,
> 
> > For the last few days, I've been working on scheduler optimization for heterogeneous cores ("HMP scheduling" from now on) on FreeBSD.
> 
> That's great!  I've also been working on it, albeit in a slow fashion and mostly in the background, rather focusing on scheduler design and integration on our cpusets.
> 
> Giving quickly some first comments.
> 
> > The first component of HMP scheduling is "cpucap". One issue with HMP scheduling is that identifying the capacity and scores of a processor (i.e. providers) is machine-dependent while the scheduler code should be machine-independent, so cpucap acts as an interface between the scheduler and providers. CPU capacity and scores are stored in pcpu structure while the machine's cpucap status (e.g. initialized, has dynamic scores, etc) is stored in global cpucap structure of type "cpucap_t". It also includes functions for scheduler and providers, such as accessors, setters, finding "best" cpu, etc. The review (D54674) adds these facilities under HMP option.
> 
> I'll review D54674, but not in the immediate.  Hopefully next week.
>  
> > By dividing a core's capacity by total capacity, we can assign an equal fraction of tasks to the core's run queue.
> >
> > On the other hand, scores reflect the real time status of a processor (snip). For example, if a performance core is experiencing throttling, its score could go down to 1000. In that case, the scheduler will prefer core that has the highest score.
> 
> These are good first observations but they can only really apply in specific circumstances.  Converting core's capacity in run queue length can only drive a loaded system, not a mostly idle one.  This mechanism will also cause an increase in latency for threads running on performant cores.
> 
> There are several theoretical considerations that should be met *together*, such as fairness, latency, bias to performance or to energy (policy), affinity, cpusets (directives), etc., and...
> 
> > Before integrating scheduler and cpucap, I need to go through sched_ule.c from top to bottom. After that, I'll add new functions or drop existing ones from the cpucap framework then work on the integration.
> 
> ...there are some practical considerations too.  ULE maintains per-CPU run queues and does inter-CPU thread exchange relatively infrequently (through the so-called "long-term load balancer") for fairness.  It will not exchange two threads on two different cores if they are the only ones running, which again is unfair if the two cores have different performances.
> 
> A general scheduler must cater to a variety of workloads, and it can be quite difficult to improve some characteristics without degrading others.  We certainly don't want to rush things.
> 
> I invite you to read the https://wiki.freebsd.org/Scheduler/Hybrid for a glimpse on some of the trade-offs involved and a wider perspective, which however is by no means complete and for which input from you and any other interested parties is welcome.
> 
> Thanks and regards.
> 
> -- 
> Olivier Certner

Hi.

I have not yet read the diffs or the existing code, so apologies if
this has already been done or is known not to work effectively.

Just an idea: if the existing schedulers are already NUMA-aware,
adding another layer that describes the attributes of cores, as leaves
of each NUMA domain, could help. This is because (AFAIK) a single NUMA
domain can contain different types of cores.
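
To make the layering idea above a bit more concrete, here is a rough
sketch in plain userland C. It is not taken from the FreeBSD tree or
from D54674: every name in it (core_class, core_group, numa_domain,
domain_best_group, domain_total_capacity) is hypothetical, and the
capacity values are made up. It only shows a NUMA domain owning one
leaf group per core type, which a scheduler could consult.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical core classes; these are not FreeBSD identifiers. */
enum core_class { CORE_EFFICIENCY, CORE_PERFORMANCE };

/* Leaf layer: all cores of one class inside a single NUMA domain. */
struct core_group {
	enum core_class	 cg_class;
	unsigned	 cg_capacity;	/* assumed relative capacity per core */
	const int	*cg_cpus;	/* CPU IDs belonging to this group */
	size_t		 cg_ncpus;
};

/* A NUMA domain that owns one leaf group per core type. */
struct numa_domain {
	int			 nd_id;
	const struct core_group	*nd_groups;
	size_t			 nd_ngroups;
};

/* Return the non-empty group with the highest per-core capacity, or NULL. */
static const struct core_group *
domain_best_group(const struct numa_domain *dom)
{
	const struct core_group *best = NULL;
	size_t i;

	assert(dom != NULL);
	for (i = 0; i < dom->nd_ngroups; i++) {
		const struct core_group *cg = &dom->nd_groups[i];

		if (cg->cg_ncpus == 0)
			continue;
		if (best == NULL || cg->cg_capacity > best->cg_capacity)
			best = cg;
	}
	return (best);
}

/* Sum of per-core capacity over every core in the domain. */
static unsigned
domain_total_capacity(const struct numa_domain *dom)
{
	unsigned total = 0;
	size_t i;

	assert(dom != NULL);
	for (i = 0; i < dom->nd_ngroups; i++)
		total += dom->nd_groups[i].cg_capacity *
		    (unsigned)dom->nd_groups[i].cg_ncpus;
	return (total);
}

/* Made-up example: one domain with 2 fast cores and 4 slow cores. */
static const int example_pcpus[] = { 0, 1 };
static const int example_ecpus[] = { 2, 3, 4, 5 };
static const struct core_group example_groups[] = {
	{ CORE_PERFORMANCE, 1024, example_pcpus, 2 },
	{ CORE_EFFICIENCY, 512, example_ecpus, 4 },
};
static const struct numa_domain example_domain = { 0, example_groups, 2 };
```

With something like this, a per-domain lookup would stay NUMA-local
first and only then choose between core types. Whether that fits the
existing topology code is something I have not checked.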

Regards.

-- 
Tomoaki AOKI    <junchoon@dec.sakura.ne.jp>


