Date:      Fri, 19 May 2023 01:29:29 +0100
From:      Johannes Totz <jo@bruelltuete.com>
To:        FreeBSD Hackers <freebsd-hackers@FreeBSD.org>
Subject:   Re: cpufreq & hwpstate_amd & Zen 2
Message-ID:  <d58be9b0-c4a5-1a57-1286-a095d745c996@bruelltuete.com>
In-Reply-To: <576641a3-8b9c-fedb-67a6-a5c61a52f654@bruelltuete.com>
References:  <576641a3-8b9c-fedb-67a6-a5c61a52f654@bruelltuete.com>

On 15/05/2023 22:16, Johannes Totz wrote:
> Hi all,
> 
> I'm poking cpufreq's hwpstate_amd to see what I can tune re performance 
> vs power vs heat trade-off.

Here are some patches, if anyone is interested:

https://reviews.freebsd.org/D40139
Adds a tunable for cpufreq/hwpstate to get the P-state info from the 
CPU's MSR instead of acpi_perf.
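
As a rough illustration of what reading P-state info from the MSR 
involves (a sketch, not the code from the review): the frequency can be 
derived from two fields of PStateDef. The bit positions are my 
understanding of the family 17h layout and should be checked against 
AMD's PPR for the part in question; the macro and function names are 
made up for this example.

```c
#include <stdint.h>

/* Assumed family 17h (Zen/Zen 2) PStateDef fields; verify against the PPR. */
#define	PSTATE_EN(msr)		(((msr) >> 63) & 0x1)	/* entry valid */
#define	PSTATE_FID(msr)		((msr) & 0xff)		/* CpuFid, bits 7:0 */
#define	PSTATE_DFS_ID(msr)	(((msr) >> 8) & 0x3f)	/* CpuDfsId, bits 13:8 */

/*
 * Core frequency in MHz for one PStateDef value, or -1 if the entry is
 * disabled or the divisor field is zero.
 */
static int
pstate_freq_mhz(uint64_t msr)
{
	uint64_t fid = PSTATE_FID(msr);
	uint64_t did = PSTATE_DFS_ID(msr);

	if (!PSTATE_EN(msr) || did == 0)
		return (-1);
	/* 200 MHz reference, multiplied by FID and divided by DID. */
	return ((int)((200 * fid) / did));
}
```

With the PStateDef value quoted further down (0x8000000049120890), this 
gives FID 0x90 and DID 8, i.e. 3600 MHz, which matches the first entry 
in dev.cpu.0.freq_levels.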

https://reviews.freebsd.org/D40158
Adds another tunable that allows overriding the default (or 
BIOS-configured?) P-state configuration, e.g. for over- or 
underclocking and over- or undervolting.

https://reviews.freebsd.org/D40140
Adds power calculation if P-state info comes from MSR. This was missing 
until now but is really just cosmetic.
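
For reference, a minimal sketch of how such a power figure can be 
derived from PStateDef, assuming the family 17h field layout (CpuVid in 
bits 21:14, IddValue in bits 29:22, IddDiv in bits 31:30) and a VID 
step of 6.25 mV below 1.55 V. This is an illustration with made-up 
names, not the code from the review; check the PPR before relying on 
the bit positions.

```c
#include <stdint.h>

/* Assumed family 17h PStateDef fields; verify against the PPR. */
#define	PSTATE_VID(msr)		(((msr) >> 14) & 0xff)	/* CpuVid */
#define	PSTATE_IDD_VALUE(msr)	(((msr) >> 22) & 0xff)	/* IddValue */
#define	PSTATE_IDD_DIV(msr)	(((msr) >> 30) & 0x3)	/* IddDiv */

/* Estimated power in mW, or -1 for the reserved IddDiv encoding. */
static int
pstate_power_mw(uint64_t msr)
{
	int64_t uv, ma;

	/* Voltage in microvolts: 1.55 V minus 6.25 mV per VID step. */
	uv = 1550000 - 6250 * (int64_t)PSTATE_VID(msr);
	if (uv < 0)
		uv = 0;		/* very high VID codes mean "off" */

	/* Current in mA: IddValue is in units of 1 A, 0.1 A or 0.01 A. */
	switch (PSTATE_IDD_DIV(msr)) {
	case 0:
		ma = (int64_t)PSTATE_IDD_VALUE(msr) * 1000;
		break;
	case 1:
		ma = (int64_t)PSTATE_IDD_VALUE(msr) * 100;
		break;
	case 2:
		ma = (int64_t)PSTATE_IDD_VALUE(msr) * 10;
		break;
	default:
		return (-1);
	}

	/* uV * mA is nW; scale down to mW. */
	return ((int)(uv * ma / 1000000));
}
```

For 0x8000000049120890 this works out to VID 72 (1.10 V), IddValue 36 
with IddDiv 1 (3.6 A), so 3960 mW. That is the same number 
dev.cpu.0.freq_levels reports next to 3600.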

These do not solve the mystery below though :(
And fwiw, C-state power saving is really effective. Messing with the 
P-states does not do much while idle; the difference is measurable only 
when the CPU is busy.
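
One thing that can probably be ruled out from the values quoted below 
is the limit register. A small sketch of decoding PStateCurLim, 
assuming the usual layout (CurPstateLimit in bits 2:0, PstateMaxVal in 
bits 6:4); the helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed PStateCurLim fields; verify against the PPR. */
#define	PSTATE_CUR_LIMIT(msr)	((int)((msr) & 0x7))		/* bits 2:0 */
#define	PSTATE_MAX_VAL(msr)	((int)(((msr) >> 4) & 0x7))	/* bits 6:4 */

/* Number of P-states the CPU advertises (P0 .. PstateMaxVal). */
static int
pstate_count(uint64_t limit_msr)
{
	return (PSTATE_MAX_VAL(limit_msr) + 1);
}

/*
 * Non-zero if the requested P-state index lies between the current
 * limit (highest-performance state allowed) and the max value
 * (lowest-performance state).
 */
static int
pstate_request_allowed(uint64_t limit_msr, int requested)
{
	return (requested >= PSTATE_CUR_LIMIT(limit_msr) &&
	    requested <= PSTATE_MAX_VAL(limit_msr));
}
```

With PStateCurLim = 0x20 the limit is P0 and the max is P2, so there 
are three P-states (same count as in freq_levels) and requests for P0, 
P1 and P2 should all be within range.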

> I'm struggling with the P-state behaviour though.
> The code looks really straightforward: 
> https://github.com/freebsd/freebsd-src/blob/main/sys/x86/cpufreq/hwpstate_amd.c#L172
> 
> But enabling hwpstate_verify, it looks like P-state transitions never go 
> as requested.
> For this, I'm not running powerd.
> In addition to the existing verify code, I've sprinkled in a few more 
> printfs.
> 
> PStateCurLim (aka MSR_AMD_10H_11H_LIMIT = 0x20) and PStateDef (aka 
> MSR_AMD_10H_11H_CONFIG = e.g. 0x8000000049120890) all look reasonable.
> 
> 
> $ sysctl dev.cpu.0
> dev.cpu.0.freq_levels: 3600/3960 2800/2800 2200/1980
> dev.cpu.0.freq: 2800
> 
> $ sysctl dev.cpu.0.freq=3600
> dev.cpu.0.freq: 2800 -> 3600
> 
> $ cat /var/log/messages
> [...extra printf debugging...]
> kernel: hwpstate0: setting P0-state on cpu0
> kernel: hwpstate0: setting P1(2) -> P0 on cpu1
> [...same for all the other cpus...]
> kernel: hwpstate0: setting P1(2) -> P0 on cpu15
> 
> 
> This shows that cpufreq thought we were at P1 and wanted to transition 
> to P0. But actually, the CPU was in P2 (the 2 in brackets).
> 
> We want to go from P0 to P2...
> 
> 
> $ sysctl dev.cpu.0.freq=2200
> dev.cpu.0.freq: 3600 -> 2200
> 
> $ cat /var/log/messages
> kernel: hwpstate0: setting P2-state on cpu0
> kernel: hwpstate0: setting P0(1) -> P2 on cpu1
> 
> 
> ...but CPU was in P1 at that time.
> 
> Wanting to go from P2 back to P1...
> 
> 
> $ sysctl dev.cpu.0.freq=2800
> dev.cpu.0.freq: 2200 -> 2800
> 
> $ cat /var/log/messages
> kernel: hwpstate0: setting P1-state on cpu0
> kernel: hwpstate0: setting P2(2) -> P1 on cpu1
> 
> 
> ...shows that this time the CPU really was in P2 (yeay). But it did not 
> transition to P1, it stayed in P2 (not shown in the log).
> 
> 
> So the question is: what else could be interfering with the P-state?
> 
> 
> thanks,
> 
> Johannes



