From: John Baldwin
To: "K. Macy", "arch@freebsd.org"
Subject: Re: Starting APs earlier during boot
Date: Mon, 21 Mar 2016 15:34:40 -0700
Message-ID: <4566552.97FFaNSfpg@ralph.baldwin.cx>

On Friday, March 18, 2016 12:37:24 PM K. Macy wrote:
> So none of these changes have been committed yet?
>
> I'm hitting hangs in USB on boot with recent HEAD and, without having
> investigated, had thought this might be what exposed the problem.
>
> Thanks.

I've committed some cosmetic ones (e.g., moving some SYSINITs from
SI_SUB_SMP to SI_SUB_LAST), but nothing that should change actual
behavior yet.

> -M
>
> On Friday, March 18, 2016, John Baldwin wrote:
>
> > On Tuesday, February 16, 2016 12:50:22 PM John Baldwin wrote:
> > > Currently the kernel bootstraps the non-boot processors fairly early
> > > in the SI_SUB_CPU SYSINIT. The APs then spin waiting to be
> > > "released". We currently release the APs as one of the last steps at
> > > SI_SUB_SMP. On the one hand this removes much of the need for
> > > synchronization while SYSINITs are running since SYSINITs basically
> > > assume they are single-threaded. However, it also enforces some odd
> > > quirks.
> > > Several places that deal with per-CPU resources have to split
> > > initialization up so that the BSP init happens in one SYSINIT and
> > > the initialization of the APs happens in a second SYSINIT at
> > > SI_SUB_SMP.
> > >
> > > Another issue that is becoming more prominent on x86 (and probably
> > > will also affect other platforms if it isn't already) is that to
> > > support working interrupts for interrupt config hooks we bind all
> > > interrupts to the BSP during boot and only distribute them among
> > > other CPUs near the end at SI_SUB_SMP. This is especially
> > > problematic with drivers for modern hardware allocating num(CPUs)
> > > interrupts (hoping to use one per CPU). On x86 we have about 190
> > > IDT vectors available for device interrupts, so in theory we should
> > > be able to tolerate a lot of drivers doing this (e.g. 60 drivers
> > > could allocate 3 interrupts for every CPU and we should still be
> > > fine). However, if you have, say, 32 cores in a system, then you
> > > can only handle about 5 drivers doing this before you run out of
> > > vectors on CPU 0.
> > >
> > > Longer term we would also like to eventually have most drivers
> > > attach in the same environment during boot as during post-boot.
> > > Right now post-boot is quite different as all CPUs are running,
> > > interrupts work, etc. One of the goals of multipass support for
> > > new-bus is to help us get there by probing enough hardware to get
> > > timers working and starting the scheduler before probing the rest
> > > of the devices. That goal isn't quite realized yet.
> > >
> > > However, we can run a slightly simpler version of our scheduler
> > > before timers are working. In fact, sleep/wakeup work just fine
> > > fairly early (we allocate the necessary structures at SI_SUB_KMEM
> > > which is before the APs are even started). Once idle threads are
> > > created and ready we could in theory let the APs start up and run
> > > other threads.
> > > You just don't have working timeouts. OTOH, you can sort of
> > > simulate timeouts if you modify the scheduler to yield the CPU
> > > instead of blocking the thread for a sleep with a timeout. The
> > > effect would be for threads that do sleeps with a timeout to fall
> > > back to polling before timers are working. In practice, all of the
> > > early kernel threads use sleeps without timeouts when idle so this
> > > doesn't really matter.
> >
> > After some more testing, I've simplified the early scheduler a bit.
> > It no longer tries to simulate timeouts by just keeping the thread
> > runnable. Instead, a sleep with a timeout just panics. However, it
> > does still permit sleeps without a timeout (infinite sleeps). Some
> > code that uses a timeout really wants a timeout (note that pause()
> > has a hack to fall back to DELAY() internally if cold is true for
> > this reason). Instead, my feeling is that any kthreads that need
> > timeouts to work need to defer their startup until
> > SI_SUB_KICK_SCHEDULER.
> >
> > > However, I'd like feedback on the general idea and if it is
> > > acceptable I'd like to coordinate testing with other platforms so
> > > this can go into the tree.
> >
> > I don't think I've seen any objections? This does need more testing.
> > I will update the patch to add a new EARLY_AP_STARTUP kernel option
> > so this can be committed (but not yet enabled) allowing for easier
> > testing (and allowing other platforms to catch up to x86).
> >
> > > The current changes are in the 'ap_startup' branch at
> > > github/bsdjhb/freebsd.
> > > You can view them here:
> > >
> > > https://github.com/bsdjhb/freebsd/compare/master...bsdjhb:ap_startup
> >
> > --
> > John Baldwin

-- 
John Baldwin
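
For reference, the vector budget described in the quoted February
message can be checked with a little arithmetic. What follows is only a
minimal sketch in C, not code from the ap_startup branch: the helper
names are hypothetical, and the constant is the approximate "about 190"
figure quoted above, not an exact kernel value.

```c
#include <assert.h>

/*
 * Assumption taken from the message above: roughly 190 IDT vectors per
 * CPU are usable for device interrupts on x86.
 */
#define VECTORS_PER_CPU	190

/*
 * Hypothetical helper: how many drivers fit while every interrupt is
 * still bound to the BSP, if each driver allocates irqs_per_cpu
 * interrupts for each of ncpus CPUs. All of them land on one CPU.
 */
int
drivers_fit_on_bsp(int ncpus, int irqs_per_cpu)
{
	assert(ncpus > 0 && irqs_per_cpu > 0);
	return (VECTORS_PER_CPU / (ncpus * irqs_per_cpu));
}

/*
 * Hypothetical helper: how many drivers fit once interrupts are spread
 * out, so each CPU only hosts irqs_per_cpu vectors per driver.
 */
int
drivers_fit_distributed(int irqs_per_cpu)
{
	assert(irqs_per_cpu > 0);
	return (VECTORS_PER_CPU / irqs_per_cpu);
}
```

With 32 CPUs and one interrupt per CPU, drivers_fit_on_bsp(32, 1)
evaluates to 5, which matches the "about 5 drivers" estimate, while
drivers_fit_distributed(3) evaluates to 63, consistent with the "60
drivers could allocate 3 interrupts for every CPU" example.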