Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 2 Dec 2016 13:39:44 +0800
From:      Sepherosa Ziehau <sepherosa@gmail.com>
To:        John Baldwin <jhb@freebsd.org>
Cc:        "freebsd-current@freebsd.org" <current@freebsd.org>
Subject:   Re: Please test EARLY_AP_STARTUP
Message-ID:  <CAMOc5cyNnDc2aaB3M0be_4a6a5VmL-k=drTqBFyo2qU2gC3qqw@mail.gmail.com>
In-Reply-To: <CAMOc5cyGLOcfNp%2Bwn3o16FCEggci2wyr6eCN==npfnUU1ThZJA@mail.gmail.com>
References:  <7005233.xZtqgRZ2t6@ralph.baldwin.cx> <CAMOc5cz_KzkJnh-QDTWhpr%2BF0nOPcp9YF3YLseUJ=qQQvW_-EA@mail.gmail.com> <CAMOc5cxkRK5dw4R2gHhvOX4rnR%2BadKCm0XcaReZL5s=Mw-%2Baaw@mail.gmail.com> <2914745.0k996KCSkq@ralph.baldwin.cx> <CAMOc5cyGLOcfNp%2Bwn3o16FCEggci2wyr6eCN==npfnUU1ThZJA@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, Dec 2, 2016 at 12:53 PM, Sepherosa Ziehau <sepherosa@gmail.com> wrote:
> On Fri, Dec 2, 2016 at 1:49 AM, John Baldwin <jhb@freebsd.org> wrote:
>> On Thursday, December 01, 2016 01:53:29 PM Sepherosa Ziehau wrote:
>>> On Wed, Nov 30, 2016 at 9:59 AM, Sepherosa Ziehau <sepherosa@gmail.com> wrote:
>>> >>> After fdc is disabled and hyperv/storvsc is fixed, it seems to boot
>>> >>> fine, except a long delay (28~30seconds) here:
>>> >>> ....
>>> >>> Timecounters tick every 1.000 msec
>>> >>> -----
>>> >>> 28 ~ 30 seconds delay
>>> >>> -----
>>> >>> vlan: initialized, using hash tables with chaining
>>> >>> ....
>>> >>>
>>> >>> I have the bootverbose dmesg here:
>>> >>> https://people.freebsd.org/~sephe/dmesg_earlyap.txt
>>> >>>
>>> >>> I booted 10 times, only one boot does not suffer this 30 seconds
>>> >>> delay.  It sounds like some races to me.  Any hints?
>>> >>
>>> >> It is likely a race as we start running things sooner now, yes.  Can you
>>> >> break into DDB during the hang and see what thread0 is waiting on?  If
>>> >> it is in the interrupt hooks you can use 'show conifhk' in DDB to see the
>>> >> list of pending interrupt hooks.  That provides a list of candidate drivers
>>> >> to inspect (e.g. stack traces of relevant kthreads) for what is actually
>>> >> waiting (and what it is waiting on)
>>> >
>>> > Just tried, but I failed to break into DDB during the 30 seconds
>>> > delay.  DDB was entered after the 30 seconds delay, though I press the
>>> > break key when the delay started.
>>>
>>> I tried add VERBOSE_SYSINIT option in order to get a rough location of
>>> this delay, but the system boots just fine w/ VERBOSE_SYSINIT option,
>>> sigh.
>>
>> You could add KTR_PROC tracing and use 'show ktr' in DDB when you break in after the
>> 30 second delay to see what it was doing during the delay perhaps?
>
> I have narrowed it down by patching the VERBOSE_SYSINIT: the
> kthread_start(&deadlkres_kd) introduces the 30 seconds delay, i.e.
> SYSINIT(deadlkres, SI_SUB_CLOCKS, SI_ORDER_ANY, kthread_start,
> &deadlkres_kd) blocks for 30 seconds.

I commented out the DEADLKRES option, now the delay happens randomly,
sometimes even before a VERBOSE_SYSINIT entry logging.  But the delay
always happens after inittimecounter(0), and before the
SI_SUB_INT_CONFIG_HOOKS.

I didn't notice useful information from DDB 'show ktr', once it
thawed.  Anything I should be aware of in the KTR_PROC ktrlog?

Thanks,
sephe

-- 
Tomorrow Will Never Die



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAMOc5cyNnDc2aaB3M0be_4a6a5VmL-k=drTqBFyo2qU2gC3qqw>