Date: Fri, 18 Mar 2016 19:02:42 -0700
From: "K. Macy" <kmacy@freebsd.org>
To: John Baldwin <jhb@freebsd.org>
Cc: "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>, "arch@freebsd.org" <arch@freebsd.org>
Subject: Re: Starting APs earlier during boot
Message-ID: <CAHM0Q_O0ePon7=x_R-mWadKEcDHGALzkBKFxR=sLkN74jESyDQ@mail.gmail.com>
In-Reply-To: <CAHM0Q_P_DZg9XpuP0ZRj0nBFhPxUK1VZ8SuEuaB4r9wmEnqJ2Q@mail.gmail.com>
References: <1730061.8Ii36ORVKt@ralph.baldwin.cx> <2980696.6AEyEjetGn@ralph.baldwin.cx> <CAHM0Q_P_DZg9XpuP0ZRj0nBFhPxUK1VZ8SuEuaB4r9wmEnqJ2Q@mail.gmail.com>

On Fri, Mar 18, 2016 at 12:37 PM, K. Macy <kmacy@freebsd.org> wrote:
> So none of these changes have been committed yet?
>
> I'm hitting hangs in USB on boot with recent HEAD and, without having
> investigated, had thought this might be what exposed the problem.

Never mind. It's yet another ZFS namespace deadlock.

-M

>
> On Friday, March 18, 2016, John Baldwin <jhb@freebsd.org> wrote:
>>
>> On Tuesday, February 16, 2016 12:50:22 PM John Baldwin wrote:
>> > Currently the kernel bootstraps the non-boot processors fairly early
>> > in the SI_SUB_CPU SYSINIT. The APs then spin waiting to be
>> > "released". We currently release the APs as one of the last steps at
>> > SI_SUB_SMP. On the one hand this removes much of the need for
>> > synchronization while SYSINITs are running since SYSINITs basically
>> > assume they are single-threaded. However, it also enforces some odd
>> > quirks. Several places that deal with per-CPU resources have to split
>> > initialization up so that the BSP init happens in one SYSINIT and the
>> > initialization of the APs happens in a second SYSINIT at SI_SUB_SMP.
>> >
>> > Another issue that is becoming more prominent on x86 (and probably
>> > will also affect other platforms if it isn't already) is that to
>> > support working interrupts for interrupt config hooks we bind all
>> > interrupts to the BSP during boot and only distribute them among
>> > other CPUs near the end at SI_SUB_SMP. This is especially problematic
>> > with drivers for modern hardware allocating num(CPUs) interrupts
>> > (hoping to use one per CPU). On x86 we have about 190 IDT vectors
>> > available for device interrupts, so in theory we should be able to
>> > tolerate a lot of drivers doing this (e.g. 60 drivers could allocate
>> > 3 interrupts for every CPU and we should still be fine). However, if
>> > you have, say, 32 cores in a system, then you can only handle about 5
>> > drivers doing this before you run out of vectors on CPU 0.
>> >
>> > Longer term we would also like to eventually have most drivers attach
>> > in the same environment during boot as during post-boot. Right now
>> > post-boot is quite different as all CPUs are running, interrupts
>> > work, etc. One of the goals of multipass support for new-bus is to
>> > help us get there by probing enough hardware to get timers working
>> > and starting the scheduler before probing the rest of the devices.
>> > That goal isn't quite realized yet.
>> >
>> > However, we can run a slightly simpler version of our scheduler
>> > before timers are working. In fact, sleep/wakeup work just fine
>> > fairly early (we allocate the necessary structures at SI_SUB_KMEM
>> > which is before the APs are even started). Once idle threads are
>> > created and ready we could in theory let the APs start up and run
>> > other threads. You just don't have working timeouts. OTOH, you can
>> > sort of simulate timeouts if you modify the scheduler to yield the
>> > CPU instead of blocking the thread for a sleep with a timeout. The
>> > effect would be for threads that do sleeps with a timeout to fall
>> > back to polling before timers are working. In practice, all of the
>> > early kernel threads use sleeps without timeouts when idle so this
>> > doesn't really matter.
>>
>> After some more testing, I've simplified the early scheduler a bit. It
>> no longer tries to simulate timeouts by just keeping the thread
>> runnable. Instead, a sleep with a timeout just panics. However, it does
>> still permit infinite sleeps. Some code that uses a timeout really
>> wants a timeout (note that pause() has a hack to fall back to DELAY()
>> internally if cold is true for this reason). Instead, my feeling is
>> that any kthreads that need timeouts to work need to defer their
>> startup until SI_SUB_KICK_SCHEDULER.
>>
>> > However, I'd like feedback on the general idea and if it is
>> > acceptable I'd like to coordinate testing with other platforms so
>> > this can go into the tree.
>>
>> I don't think I've seen any objections? This does need more testing. I
>> will update the patch to add a new EARLY_AP_STARTUP kernel option so
>> this can be committed (but not yet enabled) allowing for easier testing
>> (and allowing other platforms to catch up to x86).
>>
>> > The current changes are in the 'ap_startup' branch at
>> > github/bsdjhb/freebsd.
>> > You can view them here:
>> >
>> > https://github.com/bsdjhb/freebsd/compare/master...bsdjhb:ap_startup
>>
>> --
>> John Baldwin
>> _______________________________________________
>> freebsd-arch@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-arch
>> To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"
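
As a rough check of the IDT vector arithmetic quoted above, the two cases can be worked through in a few lines of C. This is only a sketch of the numbers in the message, not kernel code: the figure of 190 is the approximate count given in the thread, and the names DEVICE_VECTORS_PER_CPU, max_drivers_bsp_only() and max_drivers_distributed() are hypothetical helpers made up for illustration.

```c
#include <assert.h>

/*
 * Approximate number of IDT vectors usable for device interrupts on
 * one x86 CPU, as quoted in the thread ("about 190"). Assumed value.
 */
#define DEVICE_VECTORS_PER_CPU 190

/*
 * Hypothetical helper: how many drivers fit when every interrupt is
 * bound to the BSP, as happens during boot today. Each driver asks for
 * per_cpu_irqs interrupts per CPU, so it consumes
 * ncpus * per_cpu_irqs vectors on CPU 0 alone.
 */
static int
max_drivers_bsp_only(int ncpus, int per_cpu_irqs)
{
	assert(ncpus > 0 && per_cpu_irqs > 0);
	return (DEVICE_VECTORS_PER_CPU / (ncpus * per_cpu_irqs));
}

/*
 * Hypothetical helper: the same budget once interrupts are spread out,
 * so each CPU only carries its own per_cpu_irqs share of every driver.
 */
static int
max_drivers_distributed(int per_cpu_irqs)
{
	assert(per_cpu_irqs > 0);
	return (DEVICE_VECTORS_PER_CPU / per_cpu_irqs);
}
```

Under these assumptions, max_drivers_distributed(3) comes out above the 60 drivers mentioned in the message, while max_drivers_bsp_only(32, 1) gives the "about 5" drivers that fit on CPU 0 of a 32-core system.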
