Date: Fri, 2 Dec 2016 13:39:44 +0800 From: Sepherosa Ziehau <sepherosa@gmail.com> To: John Baldwin <jhb@freebsd.org> Cc: "freebsd-current@freebsd.org" <current@freebsd.org> Subject: Re: Please test EARLY_AP_STARTUP Message-ID: <CAMOc5cyNnDc2aaB3M0be_4a6a5VmL-k=drTqBFyo2qU2gC3qqw@mail.gmail.com> In-Reply-To: <CAMOc5cyGLOcfNp%2Bwn3o16FCEggci2wyr6eCN==npfnUU1ThZJA@mail.gmail.com> References: <7005233.xZtqgRZ2t6@ralph.baldwin.cx> <CAMOc5cz_KzkJnh-QDTWhpr%2BF0nOPcp9YF3YLseUJ=qQQvW_-EA@mail.gmail.com> <CAMOc5cxkRK5dw4R2gHhvOX4rnR%2BadKCm0XcaReZL5s=Mw-%2Baaw@mail.gmail.com> <2914745.0k996KCSkq@ralph.baldwin.cx> <CAMOc5cyGLOcfNp%2Bwn3o16FCEggci2wyr6eCN==npfnUU1ThZJA@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, Dec 2, 2016 at 12:53 PM, Sepherosa Ziehau <sepherosa@gmail.com> wrote: > On Fri, Dec 2, 2016 at 1:49 AM, John Baldwin <jhb@freebsd.org> wrote: >> On Thursday, December 01, 2016 01:53:29 PM Sepherosa Ziehau wrote: >>> On Wed, Nov 30, 2016 at 9:59 AM, Sepherosa Ziehau <sepherosa@gmail.com> wrote: >>> >>> After fdc is disabled and hyperv/storvsc is fixed, it seems to boot >>> >>> fine, except a long delay (28~30seconds) here: >>> >>> .... >>> >>> Timecounters tick every 1.000 msec >>> >>> ----- >>> >>> 28 ~ 30 seconds delay >>> >>> ----- >>> >>> vlan: initialized, using hash tables with chaining >>> >>> .... >>> >>> >>> >>> I have the bootverbose dmesg here: >>> >>> https://people.freebsd.org/~sephe/dmesg_earlyap.txt >>> >>> >>> >>> I booted 10 times, only one boot does not suffer this 30 seconds >>> >>> delay. It sounds like some races to me. Any hints? >>> >> >>> >> It is likely a race as we start running things sooner now, yes. Can you >>> >> break into DDB during the hang and see what thread0 is waiting on? If >>> >> it is in the interrupt hooks you can use 'show conifhk' in DDB to see the >>> >> list of pending interrupt hooks. That provides a list of candidate drivers >>> >> to inspect (e.g. stack traces of relevant kthreads) for what is actually >>> >> waiting (and what it is waiting on) >>> > >>> > Just tried, but I failed to break into DDB during the 30 seconds >>> > delay. DDB was entered after the 30 seconds delay, though I press the >>> > break key when the delay started. >>> >>> I tried add VERBOSE_SYSINIT option in order to get a rough location of >>> this delay, but the system boots just fine w/ VERBOSE_SYSINIT option, >>> sigh. >> >> You could add KTR_PROC tracing and use 'show ktr' in DDB when you break in after the >> 30 second delay to see what it was doing during the delay perhaps? > > I have narrowed it down by patching the VERBOSE_SYSINIT: the > kthread_start(&deadlkres_kd) introduces the 30 seconds delay, i.e. > SYSINIT(deadlkres, SI_SUB_CLOCKS, SI_ORDER_ANY, kthread_start, > &deadlkres_kd) blocks for 30 seconds. I commented out the DEADLKRES option, now the delay happens randomly, sometimes even before a VERBOSE_SYSINIT entry logging. But the delay always happens after inittimecounter(0), and before the SI_SUB_INT_CONFIG_HOOKS. I didn't notice useful information from DDB 'show ktr', once it thawed. Anything I should be aware of in the KTR_PROC ktrlog? Thanks, sephe -- Tomorrow Will Never Die
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAMOc5cyNnDc2aaB3M0be_4a6a5VmL-k=drTqBFyo2qU2gC3qqw>