Date: Thu, 05 Aug 2021 17:35:35 +0000
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 257641] hwpmc/libpmc needs to gain a notion of big.LITTLE
Message-ID: <bug-257641-227@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257641

Bug ID: 257641
Summary: hwpmc/libpmc needs to gain a notion of big.LITTLE
Product: Base System
Version: Unspecified
Hardware: Any
OS: Any
Status: New
Severity: Affects Only Me
Priority: ---
Component: kern
Assignee: bugs@FreeBSD.org
Reporter: mhorne@freebsd.org

Some systems that FreeBSD supports contain a heterogeneous collection of CPUs.
This is present in ARM's big.LITTLE chips, such as the rockpro64, and will be
a feature of some next-generation x86 chips as well [1][2]. The PMC stack was
written in a time before these heterogeneous systems, and thus the assumption
of homogeneous support for performance monitoring capabilities among all cores
in the system is ingrained. This is stated explicitly in the hwpmc(4) man page
under IMPLEMENTATION NOTES.

In the case of the rockpro64/RK3399, it contains four Cortex-A53 cores and two
larger Cortex-A72 cores. There is some overlap of supported performance events
between the two types, but some events are unique to each. This poses problems
that hwpmc is not currently equipped to deal with.

The first problem to solve is CPU reporting. There are two ways this is
communicated from the kernel to libpmc: via the kern.hwpmc.cpuid sysctl and
the PMC_OP_GETCPUINFO operation on the hwpmc syscall. Neither of these methods
makes a distinction between different CPUs in the system, so the value received
by userspace basically depends on which CPU does the initialization of the
hwpmc module. This somehow needs to become a per-CPU value, in order to
properly detect which events are supported on a given core.

Assuming this is solved, the basic high-level behaviour will depend on the type
of PMC being allocated:

System-scope PMCs:

Allocating a system-scope counter with e.g. pmcstat -s <event> will attempt to
allocate the event on every CPU in the system. If the allocation fails for any
CPU, the command will not proceed with any measurement. This has reasonable
behaviour on a heterogeneous system, where the user needs to either pick an
event that is compatible with all CPUs, or use the -c flag to qualify the
selected CPUs.

Process-scope PMCs:

Allocating a process-scope counter is slightly more problematic. Suppose a PMC
counter is allocated on CPU A, where the target process is running and the
requested event is supported. If the process is migrated to CPU B, which
differs from A, then attempting to resume the hardware counter could start
measuring an entirely different event, if the programmed value is valid at all.

I see two possible ways to solve this: don't allow PMC-enabled processes
(curproc->p_flag & P_HWPMC) to migrate outside of their PMC-compatible cluster,
OR, have libpmc call cpuset(3) for the process, and bind it to compatible CPUs
for the duration of the measurement. I have not thought through either of these
approaches in detail, but both require building some list of "PMC-compatible"
CPU groups/clusters in the kernel.

[1] https://www.cnx-software.com/2021/07/10/intel-alder-lake-hybrid-mobile-processor-family-to-range-from-5w-to-55w-tdp/
[2] https://www.tomshardware.com/news/amd-patent-hybrid-cpu-rival-intel-raptor-lake-cpu

-- 
You are receiving this mail because:
You are the assignee for the bug.
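As an illustration of the "PMC-compatible" CPU groups mentioned in the process-scope discussion above, here is a minimal sketch of the grouping logic. It is written in Python purely to show the idea and is not hwpmc or libpmc code. It assumes a per-CPU identifier is already available (which is exactly what the report says is missing today); the `CPU_IDS` table, its identifier strings, the CPU numbering, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical per-CPU identifiers, modelled on the RK3399 layout described
# in the report (four Cortex-A53 cores, two Cortex-A72 cores). The strings
# and the CPU numbering are assumptions made for this example; they are not
# real kern.hwpmc.cpuid values.
CPU_IDS = {
    0: "cortex-a53",
    1: "cortex-a53",
    2: "cortex-a53",
    3: "cortex-a53",
    4: "cortex-a72",
    5: "cortex-a72",
}


def compatible_groups(cpu_ids):
    """Group CPU numbers that report the same (hypothetical) PMC identifier."""
    groups = defaultdict(list)
    for cpu, ident in sorted(cpu_ids.items()):
        groups[ident].append(cpu)
    return dict(groups)


def allowed_cpus(cpu_ids, alloc_cpu):
    """Return the CPUs that share a PMC identifier with alloc_cpu.

    A process-scope PMC allocated on alloc_cpu would need its target
    process confined to this set, by whichever mechanism is chosen.
    """
    ident = cpu_ids[alloc_cpu]
    return [cpu for cpu, other in sorted(cpu_ids.items()) if other == ident]
```

Under these assumptions, a counter allocated while the process runs on CPU 4 would restrict the process to the list returned by `allowed_cpus(CPU_IDS, 4)`. Either proposed approach (blocking migration in the kernel, or binding from libpmc) could consume a grouping of this kind.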