Date: Fri, 14 Mar 2008 11:02:43 +0530 From: "Joseph Koshy" <joseph.koshy@gmail.com> To: "John Baldwin" <jhb@freebsd.org> Cc: freebsd-arch@freebsd.org Subject: Re: [PATCH] hwpmc(4) changes to use 'mp_maxid' instead of 'mp_ncpus'. Message-ID: <84dead720803132232k15c3aad7pe59875f0c84e0c27@mail.gmail.com> In-Reply-To: <200803131516.12284.jhb@freebsd.org> References: <20080313180805.GA83406@dragon.NUXI.org> <200803131516.12284.jhb@freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, Mar 14, 2008 at 12:46 AM, John Baldwin <jhb@freebsd.org> wrote: > On Thursday 13 March 2008 02:08:05 pm David O'Brien wrote: > > Hi folks, > > Some folks at Juniper have submitted these changes to hwpmc(4). > > I am sending them here for public review. > > > > Their thoughts are: > > The mp_ncpus refers to the count of the active CPU's. Where as > > mp_maxid refers to the count of all the cpus on the SMP. Using > > mp_ncpus in the cpu_id range-check of hwpmc module would lead to the > > assumption that all the active CPU's in the SMP are not interleaved. > > But for running on some platforms, the active and inactive cpus could > > be interleaved making hwpmc not work for the cpus whose cpu_id is > > greater than the active-cpu count. jhb> This is correct, but you need to handle CPUs that are absent. It might be jhb> sufficient to update pmc_cpu_is_disabled() in kern_pmc.c to check jhb> CPU_ABSENT(cpu) and claim the CPU is disabled if it is absent, but I'm not jhb> sure that will catch everything as that seems aimed at handling having a jhb> non-absent CPU halted (such as disabling HTT on i386). That is inline with the feedback (and sample patch to kern_pmc.c) that I had sent in to O'Brien. But there are other problems with the patch at various levels, probably not obvious to someone who is just looking at the kernel code. First, the relevance. My understanding is that these changes are for a proprietary SMP platform that uses non-mainstream (Tier3 or Tier4) CPUs. It so happens that Juniper decided to numbers CPUs 'sparsely' in their kernel variant and that is the motivation for this patch. IMO, as a policy, code changes for exotic hardware need to be maintained by vendors of said exotic hardware and not dumped on volunteers. Second, when I designed the PMCTools API I didn't consider that CPU numbers could be 'sparse'. [They aren't sparsely allocated on the i386/amd64---the code I looked at when I was designing PmcTools.] So there are assumptions sprinkled throughout userland that that the integers 0..hw.ncpus can select a valid CPU. While all that can be tracked down and changed, and documentation updated, it is still work that I would prefer to defer until there is a chance that someone in the general public can use it. I do need to prioritize how I spend my volunteer hours. Third, IFF we as a project are going to support 'sparse CPU numbering, I would like to see the form that takes before making changes to HWPMC and tools. For example: - How will userland and in-kernel modules find out which CPUs are physically present? Would there be a bitmask on the lines of today's machdep.hlt_cpus that we could query? Could we make the 'all_cpus' bitmask visible to userland? What happens when we start supporting systems with more than 32 processors? - Will sysctl hw.ncpus represent the count of present CPUs or will it represent the maximum CPU id? - How will userland distinguish between absent CPUs those that could be temporarily administratively disabled? - Are we going to support 'transient' CPUs [that come and go]? Why would we want sparse CPU numbering otherwise? Nit: 'mp_maxid' appears to be an index, not a count as claimed above. If support for sparse CPU numbering is something useful, I feel the correct sequence should be to discuss it here, add sparse CPU numbering to the base i386/amd64 kernels (say) first and then propagate the feature to auxiliary code like HWPMC and userland. Changing HWPMC and its userland before the base kernel itself changes does not seem to be the right thing to do. Regards, Koshy
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?84dead720803132232k15c3aad7pe59875f0c84e0c27>