From owner-freebsd-current@freebsd.org Tue Jan 17 17:32:41 2017 Return-Path: Delivered-To: freebsd-current@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2410ECB407C for ; Tue, 17 Jan 2017 17:32:41 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from mail.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id F40DB1568; Tue, 17 Jan 2017 17:32:40 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from ralph.baldwin.cx (c-73-231-226-104.hsd1.ca.comcast.net [73.231.226.104]) by mail.baldwin.cx (Postfix) with ESMTPSA id 0901010A7B9; Tue, 17 Jan 2017 12:32:40 -0500 (EST) From: John Baldwin To: Hans Petter Selasky Cc: FreeBSD Current , Konstantin Belousov Subject: Re: Strange issue after early AP startup Date: Tue, 17 Jan 2017 07:50:31 -0800 Message-ID: <2215603.KuBd8pM5Pm@ralph.baldwin.cx> User-Agent: KMail/4.14.10 (FreeBSD/11.0-STABLE; KDE/4.14.10; amd64; ; ) In-Reply-To: <264c5dd8-d82d-3326-7de5-96b0bb96ac01@selasky.org> References: <3466870.vFI8cjabAc@ralph.baldwin.cx> <264c5dd8-d82d-3326-7de5-96b0bb96ac01@selasky.org> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.4.3 (mail.baldwin.cx); Tue, 17 Jan 2017 12:32:40 -0500 (EST) X-Virus-Scanned: clamav-milter 0.99.2 at mail.baldwin.cx X-Virus-Status: Clean X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 Jan 2017 17:32:41 -0000 On Monday, January 16, 2017 10:10:16 PM Hans Petter Selasky wrote: > On 01/16/17 20:31, John Baldwin wrote: > > On Monday, January 16, 2017 04:51:42 PM Hans Petter Selasky wrote: > >> Hi, > >> > >> When booting I observe an additional 30-second delay after this print: > >> > >>> Timecounters tick every 1.000 msec > >> > >> ~30 second delay and boot continues like normal. > >> > >> Checking "vmstat -i" reveals that some timers have been running loose. > >> > >>> cpu0:timer 44300 442 > >>> cpu1:timer 40561 404 > >>> cpu3:timer 48462822 483058 > >>> cpu2:timer 48477898 483209 > >> > >> Trying to add delays and/or prints around the Timecounters printout > >> makes the issue go away. Any ideas for debugging? > > > > I have generally used KTR tracing to trace what is happening during > > boot to debug EARLY_AP_STARTUP issues. > > > > Hi John, > > What happens is that getnextcpuevent(0) keeps on returning > "state->nextcall" which is in the past for CPU #2 and #3 on my box. > > In "cpu_new_callout()" there is a check if "bt >= state->nextcall", > which I suspect is true, so "state->nextcall" never gets set to real > minimum sbintime. > > The attached patch fixes the problem for me, but I'm not 100% sure if it > is correct. I think we want to be honoring any currently scheduled callouts. You could do that by setting it to 'cc_firstevent' of the associated CPU, but in practice 'state->nextcall' should already be set to that (it is initalized to SBT_MAX in cpu_initclocks_bsp() and is then only set to other values due to cpu_new_callout()). Keep in mind that configtimer() is not just called from boot, but is also invoked when starting/stopping the profiling timer. However, when setting 'nextevent' (which is used to schedule the next timer interrupt), we should be honoring the existing 'nextcall' if it is sooner than the next hardclock. (One odd thing is that even in your case the first call to handleevents(), the 'now => state->nextcallout' check in handleevents() should be true which resets both nextcall and nextcallopt and invokes callout_process().) Here is a suggestion that attempts what I described in the first paragraph. If you still get hangs it would be good to break into DDB and capture the output of 'show clocksource'. Index: kern_clocksource.c =================================================================== --- kern_clocksource.c (revision 312301) +++ kern_clocksource.c (working copy) @@ -503,7 +503,12 @@ configtimer(int start) state->nextevent = SBT_MAX; else #endif + if (next < state->nextcall) state->nextevent = next; + else if (state->nextcall < now) + state->nextevent = now; + else + state->nextevent = state->nextcall; if (periodic) state->nexttick = next; else @@ -511,8 +516,6 @@ configtimer(int start) state->nexthard = next; state->nextstat = next; state->nextprof = next; - state->nextcall = next; - state->nextcallopt = next; hardclock_sync(cpu); } busy = 0; -- John Baldwin