Date: Fri, 19 May 2023 01:29:29 +0100
From: Johannes Totz <jo@bruelltuete.com>
To: FreeBSD Hackers <freebsd-hackers@FreeBSD.org>
Subject: Re: cpufreq & hwpstate_amd & Zen 2
Message-ID: <d58be9b0-c4a5-1a57-1286-a095d745c996@bruelltuete.com>
In-Reply-To: <576641a3-8b9c-fedb-67a6-a5c61a52f654@bruelltuete.com>
References: <576641a3-8b9c-fedb-67a6-a5c61a52f654@bruelltuete.com>
On 15/05/2023 22:16, Johannes Totz wrote:
> Hi all,
>
> I'm poking cpufreq's hwpstate_amd to see what I can tune re performance
> vs power vs heat trade-off.

Here are some patches, if anyone is interested:

https://reviews.freebsd.org/D40139
Adds a tunable for cpufreq/hwpstate to get the P-state info from the
CPU's MSR instead of acpi_perf.

https://reviews.freebsd.org/D40158
Adds another tunable that allows overriding the default (or
BIOS-configured?) P-state configuration. Stuff like over- or
underclocking and -volting.

https://reviews.freebsd.org/D40140
Adds power calculation if P-state info comes from MSR. This was missing
until now but is really just cosmetic.

These do not solve the mystery below though :(

And fwiw, C-state power saving is really effective. Messing with the
P-states does not do much while idle, it's measurable only when the CPU
is busy.

> I'm struggling with the P-state behaviour though.
> The code looks really straight-forward:
> https://github.com/freebsd/freebsd-src/blob/main/sys/x86/cpufreq/hwpstate_amd.c#L172
>
> But enabling hwpstate_verify, it looks like P-state transitions never go
> as requested.
> For this, I'm not running powerd.
> In addition to the existing verify code, I've sprinkled in a few more
> printfs.
>
> PStateCurLim (aka MSR_AMD_10H_11H_LIMIT = 0x20) and PStateDef (aka
> MSR_AMD_10H_11H_CONFIG = eg 0x8000000049120890) look all reasonable.
>
>
> $ sysctl dev.cpu.0
> dev.cpu.0.freq_levels: 3600/3960 2800/2800 2200/1980
> dev.cpu.0.freq: 2800
>
> $ sysctl dev.cpu.0.freq=3600
> dev.cpu.0.freq: 2800 -> 3600
>
> $ cat /var/log/messages
> [...extra printf debugging...]
> kernel: hwpstate0: setting P0-state on cpu0
> kernel: hwpstate0: setting P1(2) -> P0 on cpu1
> [...same for all the other cpus...]
> kernel: hwpstate0: setting P1(2) -> P0 on cpu15
>
>
> This shows that cpufreq thought we were at P1 and wanted to transition
> to P0. But actually, the CPU was in P2 (the 2 in brackets).
>
> We want to go from P0 to P2...
>
>
> $ sysctl dev.cpu.0.freq=2200
> dev.cpu.0.freq: 3600 -> 2200
>
> $ cat /var/log/messages
> kernel: hwpstate0: setting P2-state on cpu0
> kernel: hwpstate0: setting P0(1) -> P2 on cpu1
>
>
> ...but CPU was in P1 at that time.
>
> Wanting to go from P2 back to P1...
>
>
> $ sysctl dev.cpu.0.freq=2800
> dev.cpu.0.freq: 2200 -> 2800
>
> $ cat /var/log/messages
> kernel: hwpstate0: setting P1-state on cpu0
> kernel: hwpstate0: setting P2(2) -> P1 on cpu1
>
>
> ...shows that this time the CPU really was in P2 (yeay). But it did not
> transition to P1, it stayed in P2 (not shown in the log).
>
>
> So question is: what else could be interfering with P-state?
>
>
> thanks,
>
> Johannes
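
For anyone wanting to sanity-check the MSR values quoted above by hand, here is a minimal userland sketch of the decoding. It is not the driver's code. The bit layout (Fid in bits 7:0, Did in 13:8, Vid in 21:14, IddValue in 29:22, IddDiv in 31:30, enable in bit 63) and the formulas (200 * Fid / Did MHz, 1.55 V minus 6.25 mV per VID step) are assumptions taken from AMD's Family 17h documentation, and all function and struct names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded PStateDef fields. Layout is an assumption based on AMD's
 * Family 17h documentation; it is not spelled out in this thread. */
struct pstate_def {
	int      enabled;   /* bit 63: PstateEn */
	unsigned fid;       /* bits 7:0: frequency ID */
	unsigned did;       /* bits 13:8: divisor ID */
	unsigned vid;       /* bits 21:14: voltage ID */
	unsigned idd_value; /* bits 29:22: expected current, scaled */
	unsigned idd_div;   /* bits 31:30: current divisor selector */
};

struct pstate_def pstate_decode(uint64_t msr)
{
	struct pstate_def p;

	p.enabled   = (int)((msr >> 63) & 0x1);
	p.fid       = (unsigned)(msr & 0xff);
	p.did       = (unsigned)((msr >> 8) & 0x3f);
	p.vid       = (unsigned)((msr >> 14) & 0xff);
	p.idd_value = (unsigned)((msr >> 22) & 0xff);
	p.idd_div   = (unsigned)((msr >> 30) & 0x3);
	return p;
}

/* Core frequency in MHz, assumed to be 200 * Fid / Did. */
unsigned pstate_freq_mhz(const struct pstate_def *p)
{
	if (p->did == 0)
		return 0; /* avoid dividing by zero on an unused entry */
	return 200u * p->fid / p->did;
}

/* Estimated power in mW, assuming 1.55 V - 6.25 mV * Vid and a current
 * of IddValue / 10^IddDiv amperes. Returns 0 if not computable. */
unsigned pstate_power_mw(const struct pstate_def *p)
{
	static const unsigned divisor[4] = { 1, 10, 100, 0 }; /* 3: reserved */
	uint64_t microvolts, milliamps;

	if (divisor[p->idd_div] == 0 || 6250u * p->vid >= 1550000u)
		return 0;
	microvolts = 1550000u - 6250u * p->vid;
	milliamps = (uint64_t)p->idd_value * 1000u / divisor[p->idd_div];
	return (unsigned)(microvolts * milliamps / 1000000u);
}

/* PStateCurLim fields, assumed: bits 6:4 highest P-state number,
 * bits 2:0 current P-state limit. */
unsigned pstate_max_val(uint64_t msr)   { return (unsigned)((msr >> 4) & 0x7); }
unsigned pstate_cur_limit(uint64_t msr) { return (unsigned)(msr & 0x7); }

/* Self-check against the values quoted in this thread. */
void pstate_selfcheck(void)
{
	struct pstate_def p = pstate_decode(0x8000000049120890ULL);

	assert(p.enabled);
	assert(pstate_freq_mhz(&p) == 3600);
	assert(pstate_power_mw(&p) == 3960);
	assert(pstate_max_val(0x20) == 2);   /* P0..P2 exist */
	assert(pstate_cur_limit(0x20) == 0); /* P0 is allowed */
}
```

Under those assumptions, 0x8000000049120890 decodes to Fid 144, Did 8, which is 3600 MHz, and to 1.10 V at 3.6 A, which is 3960 mW. That matches the first entry of dev.cpu.0.freq_levels (3600/3960). A PStateCurLim of 0x20 decodes to three P-states (P0 to P2) with P0 permitted, so at least by this reading the limit register is not what holds the CPU back.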