Date: Tue, 16 Feb 2016 12:50:22 -0800
From: John Baldwin <jhb@freebsd.org>
To: arch@freebsd.org
Subject: Starting APs earlier during boot
Message-ID: <1730061.8Ii36ORVKt@ralph.baldwin.cx>

Currently the kernel bootstraps the non-boot processors fairly early in the SI_SUB_CPU SYSINIT. The APs then spin waiting to be "released". We currently release the APs as one of the last steps at SI_SUB_SMP. On the one hand this removes much of the need for synchronization while SYSINITs are running, since SYSINITs basically assume they are single-threaded. However, it also enforces some odd quirks. Several places that deal with per-CPU resources have to split initialization up so that the BSP init happens in one SYSINIT and the initialization of the APs happens in a second SYSINIT at SI_SUB_SMP.

Another issue that is becoming more prominent on x86 (and probably will also affect other platforms if it doesn't already) is that to support working interrupts for interrupt config hooks we bind all interrupts to the BSP during boot and only distribute them among other CPUs near the end at SI_SUB_SMP. This is especially problematic with drivers for modern hardware allocating num(CPUs) interrupts (hoping to use one per CPU). On x86 we have about 190 IDT vectors available for device interrupts on each CPU, so in theory we should be able to tolerate a lot of drivers doing this (e.g. 60 drivers could allocate 3 interrupts for every CPU and we should still be fine). However, if you have, say, 32 cores in a system, then you can only handle about 5 drivers doing this before you run out of vectors on CPU 0.

Longer term we would also like to eventually have most drivers attach in the same environment during boot as during post-boot. Right now post-boot is quite different as all CPUs are running, interrupts work, etc. One of the goals of multipass support for new-bus is to help us get there by probing enough hardware to get timers working and starting the scheduler before probing the rest of the devices. That goal isn't quite realized yet. However, we can run a slightly simpler version of our scheduler before timers are working.
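
As a rough check of the IDT vector arithmetic above, here is a minimal sketch in C. It takes the approximate figure of 190 device vectors quoted above as a per-CPU budget; the macro and function names are made up for illustration and are not kernel interfaces.

```c
#include <stdio.h>

/* Approximate figure quoted in this message, not an exact kernel constant. */
#define IDT_DEVICE_VECTORS 190

/*
 * Number of drivers that fit while every interrupt is bound to the BSP:
 * each driver consumes irqs_per_cpu * ncpus vectors on CPU 0.
 */
int
max_drivers_bsp_only(int ncpus, int irqs_per_cpu)
{
	return (IDT_DEVICE_VECTORS / (ncpus * irqs_per_cpu));
}

/*
 * Number of drivers that fit once interrupts are spread out so that
 * each CPU only services its own: each driver consumes irqs_per_cpu
 * vectors on every CPU.
 */
int
max_drivers_distributed(int irqs_per_cpu)
{
	return (IDT_DEVICE_VECTORS / irqs_per_cpu);
}

/* Print both estimates for a given configuration. */
void
print_vector_budget(int ncpus, int irqs_per_cpu)
{
	printf("%d CPUs, %d irq(s) per CPU per driver: "
	    "%d drivers (BSP only), %d drivers (distributed)\n",
	    ncpus, irqs_per_cpu,
	    max_drivers_bsp_only(ncpus, irqs_per_cpu),
	    max_drivers_distributed(irqs_per_cpu));
}
```

With 32 CPUs and one interrupt per CPU per driver, max_drivers_bsp_only(32, 1) gives 5, while max_drivers_distributed(3) gives 63, which covers the 60-driver example.
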
In fact, sleep/wakeup work just fine fairly early (we allocate the necessary structures at SI_SUB_KMEM, which is before the APs are even started). Once idle threads are created and ready we could in theory let the APs start up and run other threads. You just don't have working timeouts. OTOH, you can sort of simulate timeouts if you modify the scheduler to yield the CPU instead of blocking the thread for a sleep with a timeout. The effect would be for threads that do sleeps with a timeout to fall back to polling before timers are working. In practice, all of the early kernel threads use sleeps without timeouts when idle so this doesn't really matter.

I've implemented these changes and tested them for x86. For x86 at least, AP startup needed some bits of the interrupt infrastructure in place, so I moved SI_SUB_SMP up to after SI_SUB_INTR but before SI_SUB_SOFTINTR. I modified the *sleep() and cv_*wait*() routines to not always bail if cold is true. Instead, sleeps without a timeout are permitted to sleep "normally". Sleeps with a timeout drop their interlock and yield the CPU (but remain runnable). Since APs are now fully running, interrupts are routed to all CPUs from the get-go, removing the need for the post-boot shuffle. This also resolves the issue of running out of IDT vectors on the boot CPU.

I believe that adapting other platforms to this change should be relatively simple, but we should do that before committing the full patch. I do think that some parts of the patch (such as the changes to the sleep routines, and using SI_SUB_LAST instead of SI_SUB_SMP as a catch-all SYSINIT) can be committed now without breaking anything. However, I'd like feedback on the general idea, and if it is acceptable I'd like to coordinate testing with other platforms so this can go into the tree.

The current changes are in the 'ap_startup' branch at github/bsdjhb/freebsd.
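
The sleep behavior described above can be summarized as a small decision function. This is only a sketch of the policy, not the code in the branch: the enum, the function name, and the convention that a timeout of 0 means "no timeout" are assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical names, used only to illustrate the policy. */
enum early_sleep_action {
	SLEEP_BLOCK,	/* sleep "normally" until wakeup (or timer expiry) */
	SLEEP_YIELD	/* drop the interlock, yield the CPU, stay runnable */
};

/*
 * Decide how a sleep request is handled.
 *
 * cold: true while timers are not yet working (early boot).
 * timo: requested timeout; 0 is assumed to mean "no timeout".
 */
enum early_sleep_action
early_sleep_policy(bool cold, int timo)
{
	/*
	 * Before timers work, a timed sleep cannot be woken by its
	 * timeout, so the thread polls by yielding instead of blocking.
	 */
	if (cold && timo != 0)
		return (SLEEP_YIELD);

	/*
	 * Untimed sleeps only need wakeup(), which works early, and
	 * once cold is false every sleep can block as usual.
	 */
	return (SLEEP_BLOCK);
}
```

Under this sketch only the combination of cold being true and a non-zero timeout changes behavior, which matches the observation that early kernel threads sleeping without timeouts are unaffected.
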
You can view them here:

https://github.com/bsdjhb/freebsd/compare/master...bsdjhb:ap_startup

--
John Baldwin