From owner-freebsd-current@freebsd.org  Tue Jan 17 18:04:51 2017
Return-Path: <owner-freebsd-current@freebsd.org>
Delivered-To: freebsd-current@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 31906CB45BC
 for <freebsd-current@mailman.ysv.freebsd.org>;
 Tue, 17 Jan 2017 18:04:51 +0000 (UTC) (envelope-from hps@selasky.org)
Received: from mail.turbocat.net (turbocat.net [88.99.82.50])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id D6E071860;
 Tue, 17 Jan 2017 18:04:50 +0000 (UTC) (envelope-from hps@selasky.org)
Received: from hps2016.home.selasky.org (unknown [62.141.129.119])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by mail.turbocat.net (Postfix) with ESMTPSA id BEF311FE025;
 Tue, 17 Jan 2017 19:04:32 +0100 (CET)
Subject: Re: Strange issue after early AP startup
To: John Baldwin <jhb@freebsd.org>
References: <b9c53237-4b1a-a140-f692-bf5837060b18@selasky.org>
 <3466870.vFI8cjabAc@ralph.baldwin.cx>
 <264c5dd8-d82d-3326-7de5-96b0bb96ac01@selasky.org>
 <2215603.KuBd8pM5Pm@ralph.baldwin.cx>
Cc: FreeBSD Current <freebsd-current@freebsd.org>,
 Konstantin Belousov <kib@freebsd.org>
From: Hans Petter Selasky <hps@selasky.org>
Message-ID: <3cbe6454-82cc-0592-4ee6-3c1552b19f9a@selasky.org>
Date: Tue, 17 Jan 2017 19:04:15 +0100
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101
 Thunderbird/45.4.0
MIME-Version: 1.0
In-Reply-To: <2215603.KuBd8pM5Pm@ralph.baldwin.cx>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
 <freebsd-current.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-current>, 
 <mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current/>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
 <mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 17 Jan 2017 18:04:51 -0000

On 01/17/17 16:50, John Baldwin wrote:
 > On Monday, January 16, 2017 10:10:16 PM Hans Petter Selasky wrote:
 >> On 01/16/17 20:31, John Baldwin wrote:
 >>> On Monday, January 16, 2017 04:51:42 PM Hans Petter Selasky wrote:
 >>>> Hi,
 >>>>
 >>>> When booting I observe an additional 30-second delay after this print:
 >>>>
 >>>>> Timecounters tick every 1.000 msec
 >>>>
 >>>> ~30 second delay and boot continues like normal.
 >>>>
 >>>> Checking "vmstat -i" reveals that some timers have been running loose.
 >>>>
 >>>>> cpu0:timer                         44300        442
 >>>>> cpu1:timer                         40561        404
 >>>>> cpu3:timer                      48462822     483058
 >>>>> cpu2:timer                      48477898     483209
 >>>>
 >>>> Trying to add delays and/or prints around the Timecounters printout
 >>>> makes the issue go away. Any ideas for debugging?
 >>>
 >>> I have generally used KTR tracing to trace what is happening during
 >>> boot to debug EARLY_AP_STARTUP issues.
 >>>
 >>
 >> Hi John,
 >>
 >> What happens is that getnextcpuevent(0) keeps on returning
 >> "state->nextcall" which is in the past for CPU #2 and #3 on my box.
 >>
 >> In "cpu_new_callout()" there is a check if "bt >= state->nextcall",
 >> which I suspect is true, so "state->nextcall" never gets set to real
 >> minimum sbintime.
 >>
 >> The attached patch fixes the problem for me, but I'm not 100% sure 
if it
 >> is correct.
 >

Hi,

 > I think we want to be honoring any currently scheduled callouts.

The problem here is that we might be changing the clocksource, then 
sbinuptime() will change too, so I think the value should be reset by 
configtimer() and then corrected at the next call to callout_process().

 > You could
 > do that by setting it to 'cc_firstevent' of the associated CPU, but in
 > practice 'state->nextcall' should already be set to that (it is 
initalized
 > to SBT_MAX in cpu_initclocks_bsp() and is then only set to other 
values due
 > to cpu_new_callout()).  Keep in mind that configtimer() is not just 
called
 > from boot, but is also invoked when starting/stopping the profiling 
timer.
 >

 > However, when setting 'nextevent' (which is used to schedule the next 
timer
 > interrupt), we should be honoring the existing 'nextcall' if it is sooner
 > than the next hardclock.

Does this matter for the first tick? How often is configtimer() called?

 > (One odd thing is that even in your case the first call to 
handleevents(),
 > the 'now => state->nextcallout' check in handleevents() should be true
 > which resets both nextcall and nextcallopt and invokes 
callout_process().)

Let me take you through the failure path, by code inspection:

1) configtimer() is called and we init nextcall and nextcallopt:

 >                 next = now + timerperiod;
...
 >                       state->nextcall = next;
 >                       state->nextcallopt = next;

2) Any callout_reset() calls cpu_new_callout():

 >          */
 >         state->nextcallopt = bt_opt;
 >         if (bt >= state->nextcall)
We follow this path, because "bt" is surely based on sbinuptime() and is 
greater or equal to state->nextcall. Note that state->nextcallopt is 
updated to bt_opt, which is in the future.
 >                 goto done;
 >         state->nextcall = bt;

3) getnextcpuevent(0) is called by the fast timercb() to setup the next 
event:

 >         state = DPCPU_PTR(timerstate);
 >         /* Handle hardclock() events, skipping some if CPU is idle. */
 >         event = state->nexthard;
...
 >         /* Handle callout events. */
 >         if (event > state->nextcall)
We then go looping into this path, because state->nextcall is still 
equal to "next" as in step 1) which is now in the past, until "now >= 
state->nextcallopt" inside handleevents(), which clears this condition.
 >                 event = state->nextcall;
...
 >         return (event);

--HPS