Subject: Re: svn commit: r299746 - in head/sys: cddl/dev/dtrace cddl/dev/dtrace/amd64 cddl/dev/dtrace/i386 cddl/dev/dtrace/powerpc conf dev/acpica dev/hwpmc dev/hyperv/vmbus dev/xen/control geom/eli kern net sy...
From: Julian Elischer <julian@freebsd.org>
To: John Baldwin, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Date: Mon, 16 May 2016 13:25:56 +0800
Message-ID: <0539c5c3-dce8-1659-d26e-ef136f256f10@freebsd.org>
In-Reply-To: <201605141822.u4EIMqkx090971@repo.freebsd.org>

On 15/05/2016 2:22 AM, John Baldwin wrote:
> Author: jhb
> Date: Sat May 14 18:22:52 2016
> New Revision: 299746
> URL: https://svnweb.freebsd.org/changeset/base/299746
>
> Log:
>   Add an EARLY_AP_STARTUP option to start APs earlier during boot.
>
>   Currently, Application Processors (non-boot CPUs) are started by
>   MD code at SI_SUB_CPU, but they are kept waiting in a "pen" until
>   SI_SUB_SMP at which point they are released to run kernel threads.
>   SI_SUB_SMP is one of the last SYSINIT levels, so APs don't enter
>   the scheduler and start running threads until fairly late in the
>   boot.
>
>   This change moves SI_SUB_SMP up to just before software interrupt
>   threads are created allowing the APs to start executing kernel
>   threads much sooner (before any devices are probed). This allows
>   several initialization routines that need to perform initialization
>   on all CPUs to now perform that initialization in one step rather
>   than having to defer the AP initialization to a second SYSINIT run
>   at SI_SUB_SMP. It also permits all CPUs to be available for
>   handling interrupts before any devices are probed.
>
>   This last feature fixes a problem with interrupt vector exhaustion.
>   Specifically, in the old model all device interrupts were routed
>   onto the boot CPU during boot. Later after the APs were released at
>   SI_SUB_SMP, interrupts were redistributed across all CPUs.
>
>   However, several drivers for multiqueue hardware allocate N interrupts
>   per CPU in the system. In a system with many CPUs, just a few drivers
>   doing this could exhaust the available pool of interrupt vectors on
>   the boot CPU as each driver was allocating N * mp_ncpu vectors on the
>   boot CPU. Now, drivers will allocate interrupts on their desired CPUs
>   during boot meaning that only N interrupts are allocated from the boot
>   CPU instead of N * mp_ncpu.
>
>   Some other bits of code can also be simplified as smp_started is
>   now true much earlier and will now always be true for these bits of
>   code. This removes the need to treat the single-CPU boot environment
>   as a special case.
>
>   As a transition aid, the new behavior is available under a new kernel
>   option (EARLY_AP_STARTUP). This will allow the option to be turned off
>   if need be during initial testing. I plan to enable this on x86 by
>   default in a followup commit in the next few days and to have all
>   platforms moved over before 11.0. Once the transition is complete,
>   the option will be removed along with the !EARLY_AP_STARTUP code.
>
>   These changes have only been tested on x86. Other platform maintainers
>   are encouraged to port their architectures over as well. The main
>   things to check for are any uses of smp_started in MD code that can be
>   simplified and SI_SUB_SMP SYSINITs in MD code that can be removed in
>   the EARLY_AP_STARTUP case (e.g. the interrupt shuffling).
>
>   PR:           kern/199321
>   Reviewed by:  markj, gnn, kib
>   Sponsored by: Netflix
>
> Modified:
>   head/sys/cddl/dev/dtrace/amd64/dtrace_subr.c
>   head/sys/cddl/dev/dtrace/dtrace_load.c
>   head/sys/cddl/dev/dtrace/i386/dtrace_subr.c
>   head/sys/cddl/dev/dtrace/powerpc/dtrace_subr.c
>   head/sys/conf/NOTES
>   head/sys/conf/options
>   head/sys/dev/acpica/acpi.c
>   head/sys/dev/acpica/acpi_cpu.c
>   head/sys/dev/hwpmc/hwpmc_mod.c
>   head/sys/dev/hyperv/vmbus/hv_vmbus_drv_freebsd.c
>   head/sys/dev/xen/control/control.c
>   head/sys/geom/eli/g_eli.c
>   head/sys/kern/kern_clock.c
>   head/sys/kern/kern_clocksource.c
>   head/sys/kern/kern_cpu.c
>   head/sys/net/netisr.c
>   head/sys/sys/kernel.h
>   head/sys/x86/isa/clock.c
>   head/sys/x86/x86/intr_machdep.c
>   head/sys/x86/x86/local_apic.c
>   head/sys/x86/x86/mca.c
>   head/sys/x86/x86/mp_x86.c
>
> Modified: head/sys/cddl/dev/dtrace/amd64/dtrace_subr.c
> ==============================================================================
> --- head/sys/cddl/dev/dtrace/amd64/dtrace_subr.c  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/cddl/dev/dtrace/amd64/dtrace_subr.c  Sat May 14 18:22:52 2016  (r299746)
> @@ -246,6 +246,26 @@ static uint64_t nsec_scale;
>  /* See below for the explanation of this macro. */
>  #define SCALE_SHIFT     28
>
> +static void
> +dtrace_gethrtime_init_cpu(void *arg)
> +{
> +    uintptr_t cpu = (uintptr_t) arg;
> +
> +    if (cpu == curcpu)
> +        tgt_cpu_tsc = rdtsc();
> +    else
> +        hst_cpu_tsc = rdtsc();
> +}
> +
> +#ifdef EARLY_AP_STARTUP
> +static void
> +dtrace_gethrtime_init(void *arg)
> +{
> +    struct pcpu *pc;
> +    uint64_t tsc_f;
> +    cpuset_t map;
> +    int i;
> +#else
>  /*
>   * Get the frequency and scale factor as early as possible so that they can be
>   * used for boot-time tracing.
> @@ -254,6 +274,7 @@ static void
>  dtrace_gethrtime_init_early(void *arg)
>  {
>      uint64_t tsc_f;
> +#endif
>
>      /*
>       * Get TSC frequency known at this moment.
> @@ -282,27 +303,18 @@ dtrace_gethrtime_init_early(void *arg)
>       * (terahertz) values;
>       */
>      nsec_scale = ((uint64_t)NANOSEC << SCALE_SHIFT) / tsc_f;
> +#ifndef EARLY_AP_STARTUP
>  }
>  SYSINIT(dtrace_gethrtime_init_early, SI_SUB_CPU, SI_ORDER_ANY,
>      dtrace_gethrtime_init_early, NULL);
>
>  static void
> -dtrace_gethrtime_init_cpu(void *arg)
> -{
> -    uintptr_t cpu = (uintptr_t) arg;
> -
> -    if (cpu == curcpu)
> -        tgt_cpu_tsc = rdtsc();
> -    else
> -        hst_cpu_tsc = rdtsc();
> -}
> -
> -static void
>  dtrace_gethrtime_init(void *arg)
>  {
>      struct pcpu *pc;
>      cpuset_t map;
>      int i;
> +#endif
>
>      /* The current CPU is the reference one. */
>      sched_pin();
> @@ -323,8 +335,13 @@ dtrace_gethrtime_init(void *arg)
>      }
>      sched_unpin();
>  }
> +#ifdef EARLY_AP_STARTUP
> +SYSINIT(dtrace_gethrtime_init, SI_SUB_DTRACE, SI_ORDER_ANY,
> +    dtrace_gethrtime_init, NULL);
> +#else
>  SYSINIT(dtrace_gethrtime_init, SI_SUB_SMP, SI_ORDER_ANY, dtrace_gethrtime_init,
>      NULL);
> +#endif
>
>  /*
>   * DTrace needs a high resolution time function which can
>
> Modified: head/sys/cddl/dev/dtrace/dtrace_load.c
> ==============================================================================
> --- head/sys/cddl/dev/dtrace/dtrace_load.c  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/cddl/dev/dtrace/dtrace_load.c  Sat May 14 18:22:52 2016  (r299746)
> @@ -22,6 +22,7 @@
>   *
>   */
>
> +#ifndef EARLY_AP_STARTUP
>  static void
>  dtrace_ap_start(void *dummy)
>  {
> @@ -41,11 +42,15 @@ dtrace_ap_start(void *dummy)
>  }
>
>  SYSINIT(dtrace_ap_start, SI_SUB_SMP, SI_ORDER_ANY, dtrace_ap_start, NULL);
> +#endif
>
>  static void
>  dtrace_load(void *dummy)
>  {
>      dtrace_provider_id_t id;
> +#ifdef EARLY_AP_STARTUP
> +    int i;
> +#endif
>
>      /* Hook into the trap handler.
>       */
>      dtrace_trap_func = dtrace_trap;
> @@ -142,8 +147,14 @@ dtrace_load(void *dummy)
>
>      mutex_enter(&cpu_lock);
>
> +#ifdef EARLY_AP_STARTUP
> +    CPU_FOREACH(i) {
> +        (void) dtrace_cpu_setup(CPU_CONFIG, i);
> +    }
> +#else
>      /* Setup the boot CPU */
>      (void) dtrace_cpu_setup(CPU_CONFIG, 0);
> +#endif
>
>      mutex_exit(&cpu_lock);
>
>
> Modified: head/sys/cddl/dev/dtrace/i386/dtrace_subr.c
> ==============================================================================
> --- head/sys/cddl/dev/dtrace/i386/dtrace_subr.c  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/cddl/dev/dtrace/i386/dtrace_subr.c  Sat May 14 18:22:52 2016  (r299746)
> @@ -248,6 +248,26 @@ static uint64_t nsec_scale;
>  /* See below for the explanation of this macro. */
>  #define SCALE_SHIFT     28
>
> +static void
> +dtrace_gethrtime_init_cpu(void *arg)
> +{
> +    uintptr_t cpu = (uintptr_t) arg;
> +
> +    if (cpu == curcpu)
> +        tgt_cpu_tsc = rdtsc();
> +    else
> +        hst_cpu_tsc = rdtsc();
> +}
> +
> +#ifdef EARLY_AP_STARTUP
> +static void
> +dtrace_gethrtime_init(void *arg)
> +{
> +    struct pcpu *pc;
> +    uint64_t tsc_f;
> +    cpuset_t map;
> +    int i;
> +#else
>  /*
>   * Get the frequency and scale factor as early as possible so that they can be
>   * used for boot-time tracing.
> @@ -256,6 +276,7 @@ static void
>  dtrace_gethrtime_init_early(void *arg)
>  {
>      uint64_t tsc_f;
> +#endif
>
>      /*
>       * Get TSC frequency known at this moment.
> @@ -284,27 +305,18 @@ dtrace_gethrtime_init_early(void *arg)
>       * (terahertz) values;
>       */
>      nsec_scale = ((uint64_t)NANOSEC << SCALE_SHIFT) / tsc_f;
> +#ifndef EARLY_AP_STARTUP
>  }
>  SYSINIT(dtrace_gethrtime_init_early, SI_SUB_CPU, SI_ORDER_ANY,
>      dtrace_gethrtime_init_early, NULL);
>
>  static void
> -dtrace_gethrtime_init_cpu(void *arg)
> -{
> -    uintptr_t cpu = (uintptr_t) arg;
> -
> -    if (cpu == curcpu)
> -        tgt_cpu_tsc = rdtsc();
> -    else
> -        hst_cpu_tsc = rdtsc();
> -}
> -
> -static void
>  dtrace_gethrtime_init(void *arg)
>  {
>      cpuset_t map;
>      struct pcpu *pc;
>      int i;
> +#endif
>
>      /* The current CPU is the reference one. */
>      sched_pin();
> @@ -325,8 +337,13 @@ dtrace_gethrtime_init(void *arg)
>      }
>      sched_unpin();
>  }
> +#ifdef EARLY_AP_STARTUP
> +SYSINIT(dtrace_gethrtime_init, SI_SUB_DTRACE, SI_ORDER_ANY,
> +    dtrace_gethrtime_init, NULL);
> +#else
>  SYSINIT(dtrace_gethrtime_init, SI_SUB_SMP, SI_ORDER_ANY, dtrace_gethrtime_init,
>      NULL);
> +#endif
>
>  /*
>   * DTrace needs a high resolution time function which can
>
> Modified: head/sys/cddl/dev/dtrace/powerpc/dtrace_subr.c
> ==============================================================================
> --- head/sys/cddl/dev/dtrace/powerpc/dtrace_subr.c  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/cddl/dev/dtrace/powerpc/dtrace_subr.c  Sat May 14 18:22:52 2016  (r299746)
> @@ -218,8 +218,13 @@ dtrace_gethrtime_init(void *arg)
>      }
>      sched_unpin();
>  }
> -
> -SYSINIT(dtrace_gethrtime_init, SI_SUB_SMP, SI_ORDER_ANY, dtrace_gethrtime_init, NULL);
> +#ifdef EARLY_AP_STARTUP
> +SYSINIT(dtrace_gethrtime_init, SI_SUB_DTRACE, SI_ORDER_ANY,
> +    dtrace_gethrtime_init, NULL);
> +#else
> +SYSINIT(dtrace_gethrtime_init, SI_SUB_SMP, SI_ORDER_ANY, dtrace_gethrtime_init,
> +    NULL);
> +#endif
>
>  /*
>   * DTrace needs a high resolution time function which can
>
> Modified: head/sys/conf/NOTES
> ==============================================================================
> --- head/sys/conf/NOTES  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/conf/NOTES  Sat May 14 18:22:52 2016  (r299746)
> @@ -223,6 +223,12 @@ options SCHED_STATS
>  # Mandatory:
>  options     SMP         # Symmetric MultiProcessor Kernel
>
> +# EARLY_AP_STARTUP releases the Application Processors earlier in the
> +# kernel startup process (before devices are probed) rather than at the
> +# end.  This is a temporary option for use during the transition from
> +# late to early AP startup.
> +options     EARLY_AP_STARTUP
> +
>  # MAXCPU defines the maximum number of CPUs that can boot in the system.
>  # A default value should be already present, for every architecture.
>  options     MAXCPU=32
>
> Modified: head/sys/conf/options
> ==============================================================================
> --- head/sys/conf/options  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/conf/options  Sat May 14 18:22:52 2016  (r299746)
> @@ -620,6 +620,7 @@ DEBUG_MEMGUARD opt_vm.h
>  DEBUG_REDZONE   opt_vm.h
>
>  # Standard SMP options
> +EARLY_AP_STARTUP    opt_global.h
>  SMP     opt_global.h
>
>  # Size of the kernel message buffer
>
> Modified: head/sys/dev/acpica/acpi.c
> ==============================================================================
> --- head/sys/dev/acpica/acpi.c  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/dev/acpica/acpi.c  Sat May 14 18:22:52 2016  (r299746)
> @@ -2856,11 +2856,18 @@ acpi_EnterSleepState(struct acpi_softc *
>      stop_all_proc();
>      EVENTHANDLER_INVOKE(power_suspend);
>
> +#ifdef EARLY_AP_STARTUP
> +    MPASS(mp_ncpus == 1 || smp_started);
> +    thread_lock(curthread);
> +    sched_bind(curthread, 0);
> +    thread_unlock(curthread);
> +#else
>      if (smp_started) {
>          thread_lock(curthread);
>          sched_bind(curthread, 0);
>          thread_unlock(curthread);
>      }
> +#endif
>
>      /*
>       * Be sure to hold Giant across DEVICE_SUSPEND/RESUME since non-MPSAFE
> @@ -2991,11 +2998,17 @@ backout:
>
>      mtx_unlock(&Giant);
>
> +#ifdef EARLY_AP_STARTUP
> +    thread_lock(curthread);
> +    sched_unbind(curthread);
> +    thread_unlock(curthread);
> +#else
>      if (smp_started) {
>          thread_lock(curthread);
>          sched_unbind(curthread);
>          thread_unlock(curthread);
>      }
> +#endif
>
>      resume_all_proc();
>
>
> Modified: head/sys/dev/acpica/acpi_cpu.c
> ==============================================================================
> --- head/sys/dev/acpica/acpi_cpu.c  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/dev/acpica/acpi_cpu.c  Sat May 14 18:22:52 2016  (r299746)
> @@ -439,8 +439,12 @@ acpi_cpu_postattach(void *unused __unuse
>      free(devices, M_TEMP);
>
>      if (attached) {
> +#ifdef EARLY_AP_STARTUP
> +        acpi_cpu_startup(NULL);
> +#else
>          /* Queue post cpu-probing task handler */
>          AcpiOsExecute(OSL_NOTIFY_HANDLER, acpi_cpu_startup, NULL);
> +#endif
>      }
>  }
>
> Modified: head/sys/dev/hwpmc/hwpmc_mod.c
> ==============================================================================
> --- head/sys/dev/hwpmc/hwpmc_mod.c  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/dev/hwpmc/hwpmc_mod.c  Sat May 14 18:22:52 2016  (r299746)
> @@ -334,7 +334,11 @@ static moduledata_t pmc_mod = {
>      &pmc_syscall_mod
>  };
>
> +#ifdef EARLY_AP_STARTUP
> +DECLARE_MODULE(pmc, pmc_mod, SI_SUB_SYSCALLS, SI_ORDER_ANY);
> +#else
>  DECLARE_MODULE(pmc, pmc_mod, SI_SUB_SMP, SI_ORDER_ANY);
> +#endif
>  MODULE_VERSION(pmc, PMC_VERSION);
>
>  #ifdef HWPMC_DEBUG
>
> Modified: head/sys/dev/hyperv/vmbus/hv_vmbus_drv_freebsd.c
> ==============================================================================
> --- head/sys/dev/hyperv/vmbus/hv_vmbus_drv_freebsd.c  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/dev/hyperv/vmbus/hv_vmbus_drv_freebsd.c  Sat May 14 18:22:52 2016  (r299746)
> @@ -519,6 +519,7 @@ vmbus_attach(device_t dev)
>      device_printf(dev, "VMBUS: attach dev: %p\n", dev);
>      vmbus_devp = dev;
>
> +#ifndef EARLY_AP_STARTUP
>      /*
>       * If the system has already booted and thread
>       * scheduling is possible indicated by the global
> @@ -526,6 +527,7 @@ vmbus_attach(device_t dev)
>       * initialization directly.
>       */
>      if (!cold)
> +#endif
>          vmbus_bus_init();
>
>      bus_generic_probe(dev);
> @@ -538,6 +540,7 @@ vmbus_init(void)
>      if (vm_guest != VM_GUEST_HV)
>          return;
>
> +#ifndef EARLY_AP_STARTUP
>      /*
>       * If the system has already booted and thread
>       * scheduling is possible, as indicated by the
> @@ -545,6 +548,7 @@ vmbus_init(void)
>       * initialization directly.
>       */
>      if (!cold)
> +#endif
>          vmbus_bus_init();
>  }
>
> @@ -611,6 +615,9 @@ vmbus_modevent(module_t mod, int what, v
>      switch (what) {
>
>      case MOD_LOAD:
> +#ifdef EARLY_AP_STARTUP
> +        vmbus_init();
> +#endif
>          vmbus_mod_load();
>          break;
>      case MOD_UNLOAD:
> @@ -649,6 +656,7 @@ DRIVER_MODULE(vmbus, acpi, vmbus_driver,
>  MODULE_DEPEND(vmbus, acpi, 1, 1, 1);
>  MODULE_VERSION(vmbus, 1);
>
> +#ifndef EARLY_AP_STARTUP
>  /* We want to be started after SMP is initialized */
>  SYSINIT(vmb_init, SI_SUB_SMP + 1, SI_ORDER_FIRST, vmbus_init, NULL);
> -
> +#endif
>
> Modified: head/sys/dev/xen/control/control.c
> ==============================================================================
> --- head/sys/dev/xen/control/control.c  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/dev/xen/control/control.c  Sat May 14 18:22:52 2016  (r299746)
> @@ -202,11 +202,18 @@ xctrl_suspend()
>      stop_all_proc();
>      EVENTHANDLER_INVOKE(power_suspend);
>
> +#ifdef EARLY_AP_STARTUP
> +    MPASS(mp_ncpus == 1 || smp_started);
> +    thread_lock(curthread);
> +    sched_bind(curthread, 0);
> +    thread_unlock(curthread);
> +#else
>      if (smp_started) {
>          thread_lock(curthread);
>          sched_bind(curthread, 0);
>          thread_unlock(curthread);
>      }
> +#endif
>      KASSERT((PCPU_GET(cpuid) == 0), ("Not running on CPU#0"));
>
>      /*
> @@ -227,6 +234,17 @@ xctrl_suspend()
>      }
>
>  #ifdef SMP
> +#ifdef EARLY_AP_STARTUP
> +    /*
> +     * Suspend other CPUs. This prevents IPIs while we
> +     * are resuming, and will allow us to reset per-cpu
> +     * vcpu_info on resume.
> +     */
> +    cpu_suspend_map = all_cpus;
> +    CPU_CLR(PCPU_GET(cpuid), &cpu_suspend_map);
> +    if (!CPU_EMPTY(&cpu_suspend_map))
> +        suspend_cpus(cpu_suspend_map);
> +#else
>      CPU_ZERO(&cpu_suspend_map);    /* silence gcc */
>      if (smp_started) {
>          /*
> @@ -240,6 +258,7 @@ xctrl_suspend()
>          suspend_cpus(cpu_suspend_map);
>      }
>  #endif
> +#endif
>
>      /*
>       * Prevent any races with evtchn_interrupt() handler.
> @@ -285,11 +304,17 @@ xctrl_suspend()
>      timecounter->tc_get_timecount(timecounter);
>      inittodr(time_second);
>
> +#ifdef EARLY_AP_STARTUP
> +    thread_lock(curthread);
> +    sched_unbind(curthread);
> +    thread_unlock(curthread);
> +#else
>      if (smp_started) {
>          thread_lock(curthread);
>          sched_unbind(curthread);
>          thread_unlock(curthread);
>      }
> +#endif
>
>      resume_all_proc();
>
>
> Modified: head/sys/geom/eli/g_eli.c
> ==============================================================================
> --- head/sys/geom/eli/g_eli.c  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/geom/eli/g_eli.c  Sat May 14 18:22:52 2016  (r299746)
> @@ -479,7 +479,9 @@ g_eli_worker(void *arg)
>
>      wr = arg;
>      sc = wr->w_softc;
> -#ifdef SMP
> +#ifdef EARLY_AP_STARTUP
> +    MPASS(!sc->sc_cpubind || smp_started);
> +#elif defined(SMP)
>      /* Before sched_bind() to a CPU, wait for all CPUs to go on-line. */
>      if (sc->sc_cpubind) {
>          while (!smp_started)
>
> Modified: head/sys/kern/kern_clock.c
> ==============================================================================
> --- head/sys/kern/kern_clock.c  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/kern/kern_clock.c  Sat May 14 18:22:52 2016  (r299746)
> @@ -391,6 +391,10 @@ static void
>  initclocks(dummy)
>      void *dummy;
>  {
> +#ifdef EARLY_AP_STARTUP
> +    struct proc *p;
> +    struct thread *td;
> +#endif
>      register int i;
>
>      /*
> @@ -415,6 +419,35 @@ initclocks(dummy)
>       * sign problems sooner.
>       */
>      ticks = INT_MAX - (hz * 10 * 60);
> +
> +#ifdef EARLY_AP_STARTUP
> +    /*
> +     * Fixup the tick counts in any blocked or sleeping threads to
> +     * account for the jump above.
> +     */
> +    sx_slock(&allproc_lock);
> +    FOREACH_PROC_IN_SYSTEM(p) {
> +        PROC_LOCK(p);
> +        if (p->p_state == PRS_NEW) {
> +            PROC_UNLOCK(p);
> +            continue;
> +        }
> +        FOREACH_THREAD_IN_PROC(p, td) {
> +            thread_lock(td);
> +            if (TD_ON_LOCK(td)) {
> +                MPASS(td->td_blktick == 0);
> +                td->td_blktick = ticks;
> +            }
> +            if (TD_ON_SLEEPQ(td)) {
> +                MPASS(td->td_slptick == 0);
> +                td->td_slptick = ticks;
> +            }
> +            thread_unlock(td);
> +        }
> +        PROC_UNLOCK(p);
> +    }
> +    sx_sunlock(&allproc_lock);
> +#endif
>  }
>
>  /*
>
> Modified: head/sys/kern/kern_clocksource.c
> ==============================================================================
> --- head/sys/kern/kern_clocksource.c  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/kern/kern_clocksource.c  Sat May 14 18:22:52 2016  (r299746)
> @@ -322,9 +322,16 @@ timercb(struct eventtimer *et, void *arg
>          curcpu, (int)(now >> 32), (u_int)(now & 0xffffffff));
>
>  #ifdef SMP
> +#ifdef EARLY_AP_STARTUP
> +    MPASS(mp_ncpus == 1 || smp_started);
> +#endif
>      /* Prepare broadcasting to other CPUs for non-per-CPU timers.
>       */
>      bcast = 0;
> +#ifdef EARLY_AP_STARTUP
> +    if ((et->et_flags & ET_FLAGS_PERCPU) == 0) {
> +#else
>      if ((et->et_flags & ET_FLAGS_PERCPU) == 0 && smp_started) {
> +#endif
>          CPU_FOREACH(cpu) {
>              state = DPCPU_ID_PTR(cpu, timerstate);
>              ET_HW_LOCK(state);
> @@ -485,12 +492,17 @@ configtimer(int start)
>              nexttick = next;
>          else
>              nexttick = -1;
> +#ifdef EARLY_AP_STARTUP
> +        MPASS(mp_ncpus == 1 || smp_started);
> +#endif
>          CPU_FOREACH(cpu) {
>              state = DPCPU_ID_PTR(cpu, timerstate);
>              state->now = now;
> +#ifndef EARLY_AP_STARTUP
>              if (!smp_started && cpu != CPU_FIRST())
>                  state->nextevent = SBT_MAX;
>              else
> +#endif
>                  state->nextevent = next;
>              if (periodic)
>                  state->nexttick = next;
> @@ -513,8 +525,13 @@ configtimer(int start)
>          }
>          ET_HW_UNLOCK(DPCPU_PTR(timerstate));
>  #ifdef SMP
> +#ifdef EARLY_AP_STARTUP
> +    /* If timer is global we are done. */
> +    if ((timer->et_flags & ET_FLAGS_PERCPU) == 0) {
> +#else
>      /* If timer is global or there is no other CPUs yet - we are done. */
>      if ((timer->et_flags & ET_FLAGS_PERCPU) == 0 || !smp_started) {
> +#endif
>          critical_exit();
>          return;
>      }
>
> Modified: head/sys/kern/kern_cpu.c
> ==============================================================================
> --- head/sys/kern/kern_cpu.c  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/kern/kern_cpu.c  Sat May 14 18:22:52 2016  (r299746)
> @@ -259,6 +259,9 @@ cf_set_method(device_t dev, const struct
>      CF_MTX_LOCK(&sc->lock);
>
>  #ifdef SMP
> +#ifdef EARLY_AP_STARTUP
> +    MPASS(mp_ncpus == 1 || smp_started);
> +#else
>      /*
>       * If still booting and secondary CPUs not started yet, don't allow
>       * changing the frequency until they're online.  This is because we
> @@ -271,6 +274,7 @@ cf_set_method(device_t dev, const struct
>          error = ENXIO;
>          goto out;
>      }
> +#endif
>  #endif /* SMP */
>
>      /*
>
> Modified: head/sys/net/netisr.c
> ==============================================================================
> --- head/sys/net/netisr.c  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/net/netisr.c  Sat May 14 18:22:52 2016  (r299746)
> @@ -1119,6 +1119,10 @@ netisr_start_swi(u_int cpuid, struct pcp
>  static void
>  netisr_init(void *arg)
>  {
> +#ifdef EARLY_AP_STARTUP
> +    struct pcpu *pc;
> +#endif
> +
>      KASSERT(curcpu == 0, ("%s: not on CPU 0", __func__));
>
>      NETISR_LOCK_INIT();
> @@ -1149,10 +1153,20 @@ netisr_init(void *arg)
>          netisr_bindthreads = 0;
>      }
>  #endif
> +
> +#ifdef EARLY_AP_STARTUP
> +    STAILQ_FOREACH(pc, &cpuhead, pc_allcpu) {
> +        if (nws_count >= netisr_maxthreads)
> +            break;
> +        netisr_start_swi(pc->pc_cpuid, pc);
> +    }
> +#else
>      netisr_start_swi(curcpu, pcpu_find(curcpu));
> +#endif
>  }
>  SYSINIT(netisr_init, SI_SUB_SOFTINTR, SI_ORDER_FIRST, netisr_init, NULL);
>
> +#ifndef EARLY_AP_STARTUP
>  /*
>   * Start worker threads for additional CPUs. No attempt to gracefully handle
>   * work reassignment, we don't yet support dynamic reconfiguration.
> @@ -1172,6 +1186,7 @@ netisr_start(void *arg)
>      }
>  }
>  SYSINIT(netisr_start, SI_SUB_SMP, SI_ORDER_MIDDLE, netisr_start, NULL);
> +#endif
>
>  /*
>   * Sysctl monitoring for netisr: query a list of registered protocols.
>
> Modified: head/sys/sys/kernel.h
> ==============================================================================
> --- head/sys/sys/kernel.h  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/sys/kernel.h  Sat May 14 18:22:52 2016  (r299746)
> @@ -118,7 +118,10 @@ enum sysinit_sub_id {
>      SI_SUB_SCHED_IDLE       = 0x2600000,    /* required idle procs */
>      SI_SUB_MBUF             = 0x2700000,    /* mbuf subsystem */
>      SI_SUB_INTR             = 0x2800000,    /* interrupt threads */
> -    SI_SUB_SOFTINTR         = 0x2800001,    /* start soft interrupt thread */
> +#ifdef EARLY_AP_STARTUP
> +    SI_SUB_SMP              = 0x2900000,    /* start the APs*/
> +#endif
> +    SI_SUB_SOFTINTR         = 0x2A00000,    /* start soft interrupt thread */
>      SI_SUB_DEVFS            = 0x2F00000,    /* devfs ready for devices */
>      SI_SUB_INIT_IF          = 0x3000000,    /* prep for net interfaces */
>      SI_SUB_NETGRAPH         = 0x3010000,    /* Let Netgraph initialize */
> @@ -154,7 +157,9 @@ enum sysinit_sub_id {
>      SI_SUB_KTHREAD_BUF      = 0xea00000,    /* buffer daemon*/
>      SI_SUB_KTHREAD_UPDATE   = 0xec00000,    /* update daemon*/
>      SI_SUB_KTHREAD_IDLE     = 0xee00000,    /* idle procs*/
> +#ifndef EARLY_AP_STARTUP
>      SI_SUB_SMP              = 0xf000000,    /* start the APs*/
> +#endif
>      SI_SUB_RACCTD           = 0xf100000,    /* start racctd*/
>      SI_SUB_LAST             = 0xfffffff     /* final initialization */
>  };
>
> Modified: head/sys/x86/isa/clock.c
> ==============================================================================
> --- head/sys/x86/isa/clock.c  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/x86/isa/clock.c  Sat May 14 18:22:52 2016  (r299746)
> @@ -475,8 +475,27 @@ startrtclock()
>  void
>  cpu_initclocks(void)
>  {
> +#ifdef EARLY_AP_STARTUP
> +    struct thread *td;
> +    int i;
>
> +    td = curthread;
>      cpu_initclocks_bsp();
> +    CPU_FOREACH(i) {
> +        if (i == 0)
> +            continue;
> +        thread_lock(td);
> +        sched_bind(td, i);
> +        thread_unlock(td);
> +        cpu_initclocks_ap();
> +    }
> +    thread_lock(td);
> +    if (sched_is_bound(td))
> +        sched_unbind(td);
> +    thread_unlock(td);
> +#else
> +    cpu_initclocks_bsp();
> +#endif
>  }
>
>  static int
>
> Modified: head/sys/x86/x86/intr_machdep.c
> ==============================================================================
> --- head/sys/x86/x86/intr_machdep.c  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/x86/x86/intr_machdep.c  Sat May 14 18:22:52 2016  (r299746)
> @@ -77,7 +77,7 @@ static struct mtx intr_table_lock;
>  static struct mtx intrcnt_lock;
>  static TAILQ_HEAD(pics_head, pic) pics;
>
> -#ifdef SMP
> +#if defined(SMP) && !defined(EARLY_AP_STARTUP)
>  static int assign_cpu;
>  #endif
>
> @@ -320,11 +320,16 @@ intr_assign_cpu(void *arg, int cpu)
>      struct intsrc *isrc;
>      int error;
>
> +#ifdef EARLY_AP_STARTUP
> +    MPASS(mp_ncpus == 1 || smp_started);
> +    if (cpu != NOCPU) {
> +#else
>      /*
>       * Don't do anything during early boot.  We will pick up the
>       * assignment once the APs are started.
>       */
>      if (assign_cpu && cpu != NOCPU) {
> +#endif
>          isrc = arg;
>          mtx_lock(&intr_table_lock);
>          error = isrc->is_pic->pic_assign_cpu(isrc, cpu_apic_ids[cpu]);
> @@ -502,9 +507,13 @@ intr_next_cpu(void)
>  {
>      u_int apic_id;
>
> +#ifdef EARLY_AP_STARTUP
> +    MPASS(mp_ncpus == 1 || smp_started);
> +#else
>      /* Leave all interrupts on the BSP during boot. */
>      if (!assign_cpu)
>          return (PCPU_GET(apic_id));
> +#endif
>
>      mtx_lock_spin(&icu_lock);
>      apic_id = cpu_apic_ids[current_cpu];
> @@ -546,6 +555,7 @@ intr_add_cpu(u_int cpu)
>      CPU_SET(cpu, &intr_cpus);
>  }
>
> +#ifndef EARLY_AP_STARTUP
>  /*
>   * Distribute all the interrupt sources among the available CPUs once the
>   * AP's have been launched.
> @@ -586,6 +596,7 @@ intr_shuffle_irqs(void *arg __unused)
>  }
>  SYSINIT(intr_shuffle_irqs, SI_SUB_SMP, SI_ORDER_SECOND, intr_shuffle_irqs,
>      NULL);
> +#endif
>  #else
>  /*
>   * Always route interrupts to the current processor in the UP case.
>
> Modified: head/sys/x86/x86/local_apic.c
> ==============================================================================
> --- head/sys/x86/x86/local_apic.c  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/x86/x86/local_apic.c  Sat May 14 18:22:52 2016  (r299746)
> @@ -749,6 +749,10 @@ native_lapic_enable_pmc(void)
>
>      lvts[APIC_LVT_PMC].lvt_masked = 0;
>
> +#ifdef EARLY_AP_STARTUP
> +    MPASS(mp_ncpus == 1 || smp_started);
> +    smp_rendezvous(NULL, lapic_update_pmc, NULL, NULL);
> +#else
>  #ifdef SMP
>      /*
>       * If hwpmc was loaded at boot time then the APs may not be
> @@ -760,6 +764,7 @@ native_lapic_enable_pmc(void)
>      else
>  #endif
>          lapic_update_pmc(NULL);
> +#endif
>      return (1);
>  #else
>      return (0);
>
> Modified: head/sys/x86/x86/mca.c
> ==============================================================================
> --- head/sys/x86/x86/mca.c  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/x86/x86/mca.c  Sat May 14 18:22:52 2016  (r299746)
> @@ -726,7 +726,11 @@ mca_startup(void *dummy)
>
>      callout_reset(&mca_timer, mca_ticks * hz, mca_periodic_scan, NULL);
>  }
> +#ifdef EARLY_AP_STARTUP
> +SYSINIT(mca_startup, SI_SUB_KICK_SCHEDULER, SI_ORDER_ANY, mca_startup, NULL);
> +#else
>  SYSINIT(mca_startup, SI_SUB_SMP, SI_ORDER_ANY, mca_startup, NULL);
> +#endif
>
>  #ifdef DEV_APIC
>  static void
>
> Modified: head/sys/x86/x86/mp_x86.c
> ==============================================================================
> --- head/sys/x86/x86/mp_x86.c  Sat May 14 18:02:47 2016  (r299745)
> +++ head/sys/x86/x86/mp_x86.c  Sat May 14 18:22:52 2016  (r299746)
> @@ -933,8 +933,10 @@ init_secondary_tail(void)
>      while (atomic_load_acq_int(&smp_started) == 0)
>          ia32_pause();
>
> +#ifndef EARLY_AP_STARTUP
>      /* Start per-CPU event timers. */
>      cpu_initclocks_ap();
> +#endif
>
>      sched_throw(NULL);

John,

This feels as though it should be settable with a tunable variable.
Can you think of a good way to do this other than having two SYSINIT entries and letting the tunable "enable" the right one? There is no tunable/SYSINIT interaction otherwise.