Date:      Mon, 2 Apr 2007 16:42:06 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        Marcel Moolenaar <xcllnt@mac.com>
Cc:        Marcel Moolenaar <marcel@freebsd.org>, Perforce Change Reviews <perforce@freebsd.org>
Subject:   Re: PERFORCE change 117140 for review
Message-ID:  <200704021642.06901.jhb@freebsd.org>
In-Reply-To: <645BFA2D-3FC3-4AAB-ADCC-8D18431688E9@mac.com>
References:  <200704012152.l31LqHuB022635@repoman.freebsd.org> <200704021155.21453.jhb@freebsd.org> <645BFA2D-3FC3-4AAB-ADCC-8D18431688E9@mac.com>

On Monday 02 April 2007 01:10:55 pm Marcel Moolenaar wrote:
> 
> On Apr 2, 2007, at 8:55 AM, John Baldwin wrote:
> 
> >> For example, to analyze machine checks on pluto1, I disabled CPU0
> >> and CPU1 in succession to see if one of the CPUs was the cause of the
> >> MC. As such, CPU1 had to be the BSP when CPU0 was disabled. Luckily
> >> pluto1 is only a dual-CPU machine, so that disabling a CPU also stops
> >> SMP operation :-)
> >
> > FreeBSD CPU ID's != firmware CPU IDs.
> 
> It is important, and not only for identification, that the logical  
> CPU ID
> used by FreeBSD is the same as used by the firmware (if at all  
> possible).
> A different ID only causes confusion, especially when the firmware draws
> its IDs from the same domain. What is called CPU4 within FreeBSD may not
> be called CPU4 in the firmware, even though CPU4 may exist.

Given cores vs. hyperthreads, etc., it would seem that what FreeBSD considers
a CPU should be a purely logical object, not required to be tied to whatever
IDs the firmware uses.  BTW, thanks for ignoring the discussion about this on
arch over the past several years. :-/

> >   CPU ID's != local APIC ID's on x86 for example, nor
> > are they identical to the CPU indices in the hwprb on Alpha.
> 
> They are also not the same on ia64. In fact, on ia64 the APIC ID
> consists of 2 elements. The ACPI ID is exactly the kind of ID we
> want...

Then ia64 likely is not shutting down properly:

/*
 * Shutdown the system cleanly to prepare for reboot, halt, or power off.
 */
static void
boot(int howto)
{
	static int first_buf_printf = 1;

#if defined(SMP)
	/*
	 * Bind us to CPU 0 so that all shutdown code runs there.  Some
	 * systems don't shutdown properly (i.e., ACPI power off) if we
	 * run on another processor.
	 */
	mtx_lock_spin(&sched_lock);
	sched_bind(curthread, 0);
	mtx_unlock_spin(&sched_lock);
	KASSERT(PCPU_GET(cpuid) == 0, ("boot: not running on cpu 0"));
#endif

> >   FreeBSD CPU
> > ID's tend to not be sparse for example, because they are completely  
> > separate
> > from firmware IDs.
> 
> This statement is flawed. Sparseness is unavoidable when CPUs can be
> hot-plugged. While we should have a CPU ID that maps trivially to a
> bit field for masking purposes, there's no reason not to allow sparse IDs
> to a certain extent.

I said "tend to", not "required to".  We had a discussion about this when 
mp_maxid was settled upon for UMA rather than using MAXCPU.

> Both ACPI and Open Firmware have CPU IDs that map trivially to a mask
> and they tend to be dense. I see no reason to not use the firmware IDs
> as FreeBSD's notion of CPU ID. In fact, I see reasons not to create
> our own IDs. One such reason is the added overhead of mapping from one
> to the other during runtime.

ACPI ID's may be dense, but they don't have to be.  They could start at 2 
billion if they wanted to (they are UInt32 and are only small right now due 
to the current whims of BIOS writers), and then they wouldn't fit into the 
current cpumask.  I see no problem with requiring the MD code to determine 
which CPUs exist from firmware/BIOS/etc. and then map that into a logical ID 
space that fits into that arch's cpumask_t while keeping cpumask_t simple if 
possible.
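
As a rough sketch of the mapping I mean (userland C, not kernel code;
MAXCPU_SKETCH, cpumask_sketch_t, cpu_add(), cpu_lookup(), and cpu_mask_of()
are names made up for illustration, and the 32-bit mask width is an
assumption): MD code walks whatever the firmware reports and hands out dense
logical IDs in discovery order.

```c
#include <assert.h>
#include <stdint.h>

#define MAXCPU_SKETCH 32                /* assumed width of the logical mask */
typedef uint32_t cpumask_sketch_t;      /* stand-in for the arch's cpumask_t */

/* logical ID -> firmware ID (e.g. a UInt32 ACPI processor ID) */
static uint32_t logical_to_fw[MAXCPU_SKETCH];
static int ncpus_sketch;                /* logical IDs handed out so far */

/*
 * Called once per CPU found during firmware enumeration.  Returns the
 * next dense logical ID, or -1 if the mask has no room left.
 */
static int
cpu_add(uint32_t fw_id)
{
	if (ncpus_sketch >= MAXCPU_SKETCH)
		return (-1);
	logical_to_fw[ncpus_sketch] = fw_id;
	return (ncpus_sketch++);
}

/* Firmware ID -> logical ID.  Linear scan; only needed on slow paths. */
static int
cpu_lookup(uint32_t fw_id)
{
	for (int i = 0; i < ncpus_sketch; i++)
		if (logical_to_fw[i] == fw_id)
			return (i);
	return (-1);
}

/* Mask bit for a logical ID; the mask stays a plain integer. */
static cpumask_sketch_t
cpu_mask_of(int logical_id)
{
	assert(logical_id >= 0 && logical_id < MAXCPU_SKETCH);
	return ((cpumask_sketch_t)1 << logical_id);
}
```

With this, a firmware that numbered its CPUs 2000000000 and 2000000004 would
still end up as logical CPUs 0 and 1, and the translation back to the
firmware ID is a single array index.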

> I think that testing for CPU0 when we really want to know if we're
> running on the BSP is also flawed. On ia64 there's typically 1 BSP
> that is used to boot the machine, but each cell in a NUMA system
> has a monarch CPU that serves the purpose of the BSP for that cell.
> This means that some tests that check for the BSP may need to be
> changed to check for the monarch instead. Since there can obviously
> be only 1 CPU0, there will (ipso facto) be BSP-like processors with
> an ID != 0. It's therefore better not to assume "special powers" for
> CPU0 and instead check the PCPU for flags that correspond 1-on-1
> with such powers.

I'm not a huge fan of assuming CPU 0 is BSP, but that is the current model 
under which FreeBSD operates.  I'm not sure what should happen for shutdown 
in NUMA systems for example as in the patch above.  If you have an 
alternative design, feel free to present it on arch@.  Until then, it would 
probably be best to at least conform to what is there now.
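
For reference, the flag-based check described in the quoted text could look
roughly like the following.  This is only an illustration in userland C and
not existing FreeBSD code: struct pcpu_sketch, pc_flags, CPU_FLAG_BSP, and
find_bsp() are all hypothetical names.

```c
#include <stddef.h>

#define CPU_FLAG_BSP 0x1u       /* hypothetical "this CPU is the BSP" flag */

/* Hypothetical stand-in for the per-CPU data. */
struct pcpu_sketch {
	int      pc_cpuid;      /* logical CPU ID */
	unsigned pc_flags;      /* per-CPU capability flags */
};

/*
 * Return the logical ID of the first CPU flagged as BSP, or -1 if no
 * CPU carries the flag.  Callers would bind to this ID instead of
 * hard-coding 0.
 */
static int
find_bsp(const struct pcpu_sketch *cpus, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (cpus[i].pc_flags & CPU_FLAG_BSP)
			return (cpus[i].pc_cpuid);
	return (-1);
}
```

In the pluto1 case quoted at the top, where CPU0 is disabled and CPU1 boots
the machine, only the entry for CPU1 would carry the flag.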

-- 
John Baldwin


