From owner-freebsd-current@freebsd.org Tue Jan 17 19:32:04 2017 Return-Path: Delivered-To: freebsd-current@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 563A8CB4529 for ; Tue, 17 Jan 2017 19:32:04 +0000 (UTC) (envelope-from hps@selasky.org) Received: from mail.turbocat.net (turbocat.net [IPv6:2a01:4f8:c17:6c4b::2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 07A7A109E; Tue, 17 Jan 2017 19:32:04 +0000 (UTC) (envelope-from hps@selasky.org) Received: from hps2016.home.selasky.org (unknown [62.141.129.119]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.turbocat.net (Postfix) with ESMTPSA id 4D9FB1FE025; Tue, 17 Jan 2017 20:31:46 +0100 (CET) Subject: Re: Strange issue after early AP startup To: John Baldwin References: <2215603.KuBd8pM5Pm@ralph.baldwin.cx> <3cbe6454-82cc-0592-4ee6-3c1552b19f9a@selasky.org> <4212167.Wq8tLU1ohU@ralph.baldwin.cx> Cc: FreeBSD Current , Konstantin Belousov From: Hans Petter Selasky Message-ID: Date: Tue, 17 Jan 2017 20:31:28 +0100 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0 MIME-Version: 1.0 In-Reply-To: <4212167.Wq8tLU1ohU@ralph.baldwin.cx> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 Jan 2017 19:32:04 -0000 Hi, On 01/17/17 20:00, John Baldwin wrote: >> >> Does this matter for the first tick? How often is configtimer() called? > > As I said, it is called at runtime when profclock is started / stopped, not > just at boot. Those changes at runtime probably have existing callouts > active and your change will not process any callouts until the next hardclock > tick fires (but only because you are setting nextcallopt to the bogus > 'next' value). >> > (One odd thing is that even in your case the first call to >> handleevents(), >> > the 'now => state->nextcallout' check in handleevents() should be true >> > which resets both nextcall and nextcallopt and invokes >> callout_process().) >> >> Let me take you through the failure path, by code inspection: > > I would really appreciate it if you could add traces to find out what actually > happens rather than what seems to happen by looking at the code. :-/ The problem is that once you add some prints, the problem goes away. Maybe I should try to set hz to 100 or 25 ??? > > 0) cpu_initclocks_bsp() is called and init's nextcall and nexcallopt to SBT_MAX > similar to your change. If no callout is scheduled before configtimer() > then they remain set to SBT_MAX. Your current patch happens to trigger a > (bogus) call to callout_process() on the first hardclock() because it > sets nextcallopt to 'next' even though no callout is actually scheduled to > fire at time 'next'. > >> 1) configtimer() is called and we init nextcall and nextcallopt: >> >> > next = now + timerperiod; >> ... >> > state->nextcall = next; >> > state->nextcallopt = next; > > These both say "the next callout() should fire at 'next' which is the time of > the next hardclock()", even though there may be no callouts scheduled (in which > case both of these fields should be set to SBT_MAX from the call to > cpu_initclocks_bsp(), or there may be callouts scheduled in which case 'nextcall' > and 'nextcallopt' will reflect the time that those callouts are already > scheduled for and this overwrites that). I see there are some callouts scheduled by SYSINITs, before the first configtimer(), like NFS_TIMERINIT in nfs_init(). These are setup using "dummy_timecounter" which means any nextcall values before the first configtimer should be discarded. > >> 2) Any callout_reset() calls cpu_new_callout(): >> >> > */ >> > state->nextcallopt = bt_opt; >> > if (bt >= state->nextcall) >> We follow this path, because "bt" is surely based on sbinuptime() and is >> greater or equal to state->nextcall. Note that state->nextcallopt is >> updated to bt_opt, which is in the future. > > Note, my patch should _leave_ nextcall at SBT_MAX (from cpu_initclocks_bsp()) > unless there was already an earlier call to callout_reset(). Yes, there are calls to callout_reset(). See for example NFS_TIMERINIT, like mentioned above. > IOW, it > should be a NOP for the purposes of this branch compared with your change. > (You could add a warning to print out if 'nextcall' != SBT_MAX during boot > and see if it fires for example.) > >> > goto done; >> > state->nextcall = bt; >> >> 3) getnextcpuevent(0) is called by the fast timercb() to setup the next >> event: >> >> > state = DPCPU_PTR(timerstate); >> > /* Handle hardclock() events, skipping some if CPU is idle. */ >> > event = state->nexthard; >> ... >> > /* Handle callout events. */ >> > if (event > state->nextcall) >> We then go looping into this path, because state->nextcall is still >> equal to "next" as in step 1) which is now in the past, until "now >= >> state->nextcallopt" inside handleevents(), which clears this condition. >> > event = state->nextcall; >> ... >> > return (event); > > I'm curious if there is a callout_reset() that has set 'nextcall' to a time > that is effectively before 'now'. > > Maybe add a printf like this: > > Index: kern_clocksource.c > =================================================================== > --- kern_clocksource.c (revision 312301) > +++ kern_clocksource.c (working copy) > @@ -498,12 +498,18 @@ configtimer(int start) > CPU_FOREACH(cpu) { > state = DPCPU_ID_PTR(cpu, timerstate); > state->now = now; > + printf("%s: CPU %d: now %jd nextcall %jd nextcallopt %jd next %jd\n", __func__, cpu, state->nextcall, state->nextcall, next); > #ifndef EARLY_AP_STARTUP > if (!smp_started && cpu != CPU_FIRST()) > state->nextevent = SBT_MAX; > else > #endif > > In particular what I am worried about with your patch is that for > post-boot calls to configtimer() you will delay any previously-scheduled > callouts until the next hardclock. > I understand. Would a solution be to refactor callout_process(), to accept the PCPU_GET(CPUID) as an argument and be executed for all CPUs by configtimer(), instead of trying to guess state->nextcall and state->nextcallopt in configtimer() ? --HPS