Date:      Thu, 23 Mar 2023 11:01:44 +0000
From:      Bob Bishop <rb@gid.co.uk>
To:        David Chisnall <theraven@freebsd.org>
Cc:        "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject:   Re: Periodic rant about SCHED_ULE
Message-ID:  <5A2888BF-AB1F-40E0-AF42-ECE8D32AD908@gid.co.uk>
In-Reply-To: <a61f759e-9aea-d77f-6d5e-cecafdfe60b3@FreeBSD.org>
References:  <a401e51a-250a-64a0-15cb-ff79bcefbf94@m5p.com> <8173cc7e-e934-dd5c-312a-1dfa886941aa@FreeBSD.org> <8cfdb951-9b1f-ecd3-2291-7a528e1b042c@m5p.com> <c3f5f667-ba0b-c40c-b8a6-19d1c9c63c5f@FreeBSD.org> <ZBtRJhNHluj5Nzyk@troutmask.apl.washington.edu> <CAGudoHEj%2BkoaYhkjzDE5KX9OsCno=X5M_E3z9uwg6Pg7dtqTsA@mail.gmail.com> <7f26102c-7542-78f8-0c7b-ef3cdaa1a4a6@FreeBSD.org> <a61f759e-9aea-d77f-6d5e-cecafdfe60b3@FreeBSD.org>

Hi,

On 23 Mar 2023, at 10:07, David Chisnall <theraven@freebsd.org> wrote:
>
> On 22/03/2023 20:15, Stefan Esser wrote:
>> Better balancing of the load would probably make ULE take less real
>> time. The example of 9 identical tasks on 8 cores with 7 tasks getting
>> 100% of a core and the other 2 sharing a core and getting 50% each
>> could be resolved by moving a CPU bound process from the CPU with the
>> highest load to a random CPU (probably not the one with the lowest load
>> or limited to the same cluster or NUMA domain, since then it would stay
>> in a subset of the available cores).
>
> Two things have changed in CPUs since ULE was written that make the
> affinity less of a win and may make some low-frequency random
> rebalancing better:
>
> Snooping from another core's L1 is a lot cheaper (less true on
> multi-socket systems, but fortunately ULE is NUMA-aware and so can
> factor this in), which makes the cost of migrating a thread to another
> core much cheaper (there are still kernel synchronisation costs, but the
> cost of running on a core that doesn't have a warm cache is lower: the
> caches warm very quickly).
>
> CPUs now have a lot more power domains.  If one core is doing a lot
> more work than others then there's a good chance that it will be
> thermally throttled but others may not if they're in a separate power /
> thermal domain.  This means that keeping a compute-bound process on the
> same core is the worst thing that you can do if other cores are idle:
> that core may be throttled back to <2 GHz whereas a core on the other
> side of the chip may be able to run at >3 GHz.  Evenly heating the
> entire CPU can give much better performance if the number of active
> threads is less than the number of running cores and better fairness in
> other cases.
>
> Both ULE and 4BSD are unaware of the heterogeneity of modern CPUs,
> which often have 2-3 different kinds of core that run at different
> speeds and neither understands a concept of a power budget, so there's a
> lot of potential improvement here.  Writing a bad (but working)
> scheduler is a fairly difficult task, writing a good one is much harder,
> so I'm not volunteering to do it, but if someone is interested then it
> would probably be a good candidate for Foundation funding.  I've heard
> good things about the XNU scheduler recently, that might be a good
> source of inspiration.
>
> David
>

This is spot on as a summary of the landscape. The macOS scheduler
(based on XNU) [1] seems to do a pretty good job with heterogeneous
cores vs power management, and macOS has APIs allowing applications to
take account of the thermal state of the total system [2]. But I
haven't seen any references to fine-grained thermal management
as outlined above.

[1] https://developer.apple.com/library/archive/documentation/Darwin/Conceptual/KernelProgramming/scheduler/scheduler.html
[2] https://developer.apple.com/library/archive/documentation/Performance/Conceptual/power_efficiency_guidelines_osx/RespondToThermalStateChanges.html
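
For what it's worth, the 9-tasks-on-8-cores case quoted above is easy to play with in a toy model. What follows is a minimal sketch in Python, not ULE code and not anyone's proposed patch: the function name `simulate`, the tick-based time model, the equal time split among tasks sharing a core, and the `rebalance_every` parameter are all assumptions made for illustration. It compares tasks that stay where they were first placed against occasionally moving one task from the busiest core to a random core, and it ignores migration cost, cache effects and thermal throttling entirely.

```python
import random


def simulate(n_tasks=9, n_cores=8, ticks=20_000, rebalance_every=None, seed=0):
    """Return each task's average CPU share (1.0 = a whole core).

    Toy model, not ULE: on every tick a core splits its time equally
    among the CPU-bound tasks currently placed on it.
    """
    rng = random.Random(seed)
    # Round-robin initial placement: with 9 tasks on 8 cores, core 0 gets two.
    core_of = [t % n_cores for t in range(n_tasks)]
    cpu_time = [0.0] * n_tasks

    for tick in range(ticks):
        if rebalance_every and tick % rebalance_every == 0:
            load = [core_of.count(c) for c in range(n_cores)]
            busiest = load.index(max(load))
            # Move one task from the busiest core to a random core
            # (deliberately not the least loaded one, as in the quoted idea).
            victim = rng.choice(
                [t for t in range(n_tasks) if core_of[t] == busiest]
            )
            core_of[victim] = rng.randrange(n_cores)

        # Each task gets an equal slice of the core it currently sits on.
        load = [core_of.count(c) for c in range(n_cores)]
        for t in range(n_tasks):
            cpu_time[t] += 1.0 / load[core_of[t]]

    return [used / ticks for used in cpu_time]


if __name__ == "__main__":
    print("pinned:    ", [round(s, 2) for s in simulate()])
    print("rebalanced:", [round(s, 2) for s in simulate(rebalance_every=20)])
```

Without rebalancing, two tasks end up with half a core each and the other seven with a full core, which is the situation described in the quoted example. With periodic random migration the doubled-up pair keeps changing, so over a long enough run the shares should even out toward 8/9 of a core per task. Whether that is a net win on real hardware depends on the migration and throttling costs this model leaves out.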

--
Bob Bishop
rb@gid.co.uk