From owner-freebsd-current@freebsd.org Mon Jan 16 21:10:52 2017 Return-Path: Delivered-To: freebsd-current@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7E5FFCB33C0 for ; Mon, 16 Jan 2017 21:10:52 +0000 (UTC) (envelope-from hps@selasky.org) Received: from mail.turbocat.net (turbocat.net [IPv6:2a01:4f8:c17:6c4b::2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4731E1036; Mon, 16 Jan 2017 21:10:52 +0000 (UTC) (envelope-from hps@selasky.org) Received: from hps2016.home.selasky.org (unknown [62.141.129.119]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.turbocat.net (Postfix) with ESMTPSA id 014E81FE025; Mon, 16 Jan 2017 22:10:34 +0100 (CET) Subject: Re: Strange issue after early AP startup To: John Baldwin References: <3466870.vFI8cjabAc@ralph.baldwin.cx> Cc: FreeBSD Current , Konstantin Belousov From: Hans Petter Selasky Message-ID: <264c5dd8-d82d-3326-7de5-96b0bb96ac01@selasky.org> Date: Mon, 16 Jan 2017 22:10:16 +0100 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0 MIME-Version: 1.0 In-Reply-To: <3466870.vFI8cjabAc@ralph.baldwin.cx> Content-Type: multipart/mixed; boundary="------------61B84EA364FFEDA0AB60FEEC" X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 16 Jan 2017 21:10:52 -0000 This is a multi-part message in MIME format. --------------61B84EA364FFEDA0AB60FEEC Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit On 01/16/17 20:31, John Baldwin wrote: > On Monday, January 16, 2017 04:51:42 PM Hans Petter Selasky wrote: >> Hi, >> >> When booting I observe an additional 30-second delay after this print: >> >>> Timecounters tick every 1.000 msec >> >> ~30 second delay and boot continues like normal. >> >> Checking "vmstat -i" reveals that some timers have been running loose. >> >>> cpu0:timer 44300 442 >>> cpu1:timer 40561 404 >>> cpu3:timer 48462822 483058 >>> cpu2:timer 48477898 483209 >> >> Trying to add delays and/or prints around the Timecounters printout >> makes the issue go away. Any ideas for debugging? > > I have generally used KTR tracing to trace what is happening during > boot to debug EARLY_AP_STARTUP issues. > Hi John, What happens is that getnextcpuevent(0) keeps on returning "state->nextcall" which is in the past for CPU #2 and #3 on my box. In "cpu_new_callout()" there is a check if "bt >= state->nextcall", which I suspect is true, so "state->nextcall" never gets set to real minimum sbintime. The attached patch fixes the problem for me, but I'm not 100% sure if it is correct. --HPS --------------61B84EA364FFEDA0AB60FEEC Content-Type: text/x-patch; name="timer_init_fix.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="timer_init_fix.diff" diff --git a/sys/kern/kern_clocksource.c b/sys/kern/kern_clocksource.c index 7f7769d..454a130 100644 --- a/sys/kern/kern_clocksource.c +++ b/sys/kern/kern_clocksource.c @@ -511,7 +511,7 @@ configtimer(int start) state->nexthard = next; state->nextstat = next; state->nextprof = next; - state->nextcall = next; + state->nextcall = SBT_MAX; state->nextcallopt = next; hardclock_sync(cpu); } --------------61B84EA364FFEDA0AB60FEEC--