From owner-freebsd-current@freebsd.org Tue Jan 17 18:04:51 2017 Return-Path: Delivered-To: freebsd-current@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 31906CB45BC for ; Tue, 17 Jan 2017 18:04:51 +0000 (UTC) (envelope-from hps@selasky.org) Received: from mail.turbocat.net (turbocat.net [88.99.82.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D6E071860; Tue, 17 Jan 2017 18:04:50 +0000 (UTC) (envelope-from hps@selasky.org) Received: from hps2016.home.selasky.org (unknown [62.141.129.119]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.turbocat.net (Postfix) with ESMTPSA id BEF311FE025; Tue, 17 Jan 2017 19:04:32 +0100 (CET) Subject: Re: Strange issue after early AP startup To: John Baldwin References: <3466870.vFI8cjabAc@ralph.baldwin.cx> <264c5dd8-d82d-3326-7de5-96b0bb96ac01@selasky.org> <2215603.KuBd8pM5Pm@ralph.baldwin.cx> Cc: FreeBSD Current , Konstantin Belousov From: Hans Petter Selasky Message-ID: <3cbe6454-82cc-0592-4ee6-3c1552b19f9a@selasky.org> Date: Tue, 17 Jan 2017 19:04:15 +0100 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0 MIME-Version: 1.0 In-Reply-To: <2215603.KuBd8pM5Pm@ralph.baldwin.cx> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 Jan 2017 18:04:51 -0000 On 01/17/17 16:50, John Baldwin wrote: > On Monday, January 16, 2017 10:10:16 PM Hans Petter Selasky wrote: >> On 01/16/17 20:31, John Baldwin wrote: >>> On Monday, January 16, 2017 04:51:42 PM Hans Petter Selasky wrote: >>>> Hi, >>>> >>>> When booting I observe an additional 30-second delay after this print: >>>> >>>>> Timecounters tick every 1.000 msec >>>> >>>> ~30 second delay and boot continues like normal. >>>> >>>> Checking "vmstat -i" reveals that some timers have been running loose. >>>> >>>>> cpu0:timer 44300 442 >>>>> cpu1:timer 40561 404 >>>>> cpu3:timer 48462822 483058 >>>>> cpu2:timer 48477898 483209 >>>> >>>> Trying to add delays and/or prints around the Timecounters printout >>>> makes the issue go away. Any ideas for debugging? >>> >>> I have generally used KTR tracing to trace what is happening during >>> boot to debug EARLY_AP_STARTUP issues. >>> >> >> Hi John, >> >> What happens is that getnextcpuevent(0) keeps on returning >> "state->nextcall" which is in the past for CPU #2 and #3 on my box. >> >> In "cpu_new_callout()" there is a check if "bt >= state->nextcall", >> which I suspect is true, so "state->nextcall" never gets set to real >> minimum sbintime. >> >> The attached patch fixes the problem for me, but I'm not 100% sure if it >> is correct. > Hi, > I think we want to be honoring any currently scheduled callouts. The problem here is that we might be changing the clocksource, then sbinuptime() will change too, so I think the value should be reset by configtimer() and then corrected at the next call to callout_process(). > You could > do that by setting it to 'cc_firstevent' of the associated CPU, but in > practice 'state->nextcall' should already be set to that (it is initalized > to SBT_MAX in cpu_initclocks_bsp() and is then only set to other values due > to cpu_new_callout()). Keep in mind that configtimer() is not just called > from boot, but is also invoked when starting/stopping the profiling timer. > > However, when setting 'nextevent' (which is used to schedule the next timer > interrupt), we should be honoring the existing 'nextcall' if it is sooner > than the next hardclock. Does this matter for the first tick? How often is configtimer() called? > (One odd thing is that even in your case the first call to handleevents(), > the 'now => state->nextcallout' check in handleevents() should be true > which resets both nextcall and nextcallopt and invokes callout_process().) Let me take you through the failure path, by code inspection: 1) configtimer() is called and we init nextcall and nextcallopt: > next = now + timerperiod; ... > state->nextcall = next; > state->nextcallopt = next; 2) Any callout_reset() calls cpu_new_callout(): > */ > state->nextcallopt = bt_opt; > if (bt >= state->nextcall) We follow this path, because "bt" is surely based on sbinuptime() and is greater or equal to state->nextcall. Note that state->nextcallopt is updated to bt_opt, which is in the future. > goto done; > state->nextcall = bt; 3) getnextcpuevent(0) is called by the fast timercb() to setup the next event: > state = DPCPU_PTR(timerstate); > /* Handle hardclock() events, skipping some if CPU is idle. */ > event = state->nexthard; ... > /* Handle callout events. */ > if (event > state->nextcall) We then go looping into this path, because state->nextcall is still equal to "next" as in step 1) which is now in the past, until "now >= state->nextcallopt" inside handleevents(), which clears this condition. > event = state->nextcall; ... > return (event); --HPS