From owner-freebsd-current@freebsd.org Tue Jan 17 21:39:02 2017 Return-Path: Delivered-To: freebsd-current@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 92011CB5A0D for ; Tue, 17 Jan 2017 21:39:02 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from mail.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4F23A19FB; Tue, 17 Jan 2017 21:39:02 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from ralph.baldwin.cx (c-73-231-226-104.hsd1.ca.comcast.net [73.231.226.104]) by mail.baldwin.cx (Postfix) with ESMTPSA id 79E6710A7B9; Tue, 17 Jan 2017 16:39:00 -0500 (EST) From: John Baldwin To: Hans Petter Selasky Cc: FreeBSD Current , Konstantin Belousov Subject: Re: Strange issue after early AP startup Date: Tue, 17 Jan 2017 13:38:54 -0800 Message-ID: <2180052.M81hFKGVpQ@ralph.baldwin.cx> User-Agent: KMail/4.14.10 (FreeBSD/11.0-STABLE; KDE/4.14.10; amd64; ; ) In-Reply-To: References: <4212167.Wq8tLU1ohU@ralph.baldwin.cx> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.4.3 (mail.baldwin.cx); Tue, 17 Jan 2017 16:39:00 -0500 (EST) X-Virus-Scanned: clamav-milter 0.99.2 at mail.baldwin.cx X-Virus-Status: Clean X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 Jan 2017 21:39:02 -0000 On Tuesday, January 17, 2017 08:31:28 PM Hans Petter Selasky wrote: > Hi, > > On 01/17/17 20:00, John Baldwin wrote: > >> > >> Does this matter for the first tick? How often is configtimer() called? > > > > As I said, it is called at runtime when profclock is started / stopped, not > > just at boot. Those changes at runtime probably have existing callouts > > active and your change will not process any callouts until the next hardclock > > tick fires (but only because you are setting nextcallopt to the bogus > > 'next' value). > > > > >> > (One odd thing is that even in your case the first call to > >> handleevents(), > >> > the 'now => state->nextcallout' check in handleevents() should be true > >> > which resets both nextcall and nextcallopt and invokes > >> callout_process().) > >> > >> Let me take you through the failure path, by code inspection: > > > > I would really appreciate it if you could add traces to find out what actually > > happens rather than what seems to happen by looking at the code. :-/ > > The problem is that once you add some prints, the problem goes away. > Maybe I should try to set hz to 100 or 25 ??? Maybe use KTR instead and then have a kdb_enter() later in boot so you can use 'show ktr' in DDB. (KTR has less overhead than printfs so might not disrupt the timing as badly.) > > > > 0) cpu_initclocks_bsp() is called and init's nextcall and nexcallopt to SBT_MAX > > similar to your change. If no callout is scheduled before configtimer() > > then they remain set to SBT_MAX. Your current patch happens to trigger a > > (bogus) call to callout_process() on the first hardclock() because it > > sets nextcallopt to 'next' even though no callout is actually scheduled to > > fire at time 'next'. > > > >> 1) configtimer() is called and we init nextcall and nextcallopt: > >> > >> > next = now + timerperiod; > >> ... > >> > state->nextcall = next; > >> > state->nextcallopt = next; > > > > These both say "the next callout() should fire at 'next' which is the time of > > the next hardclock()", even though there may be no callouts scheduled (in which > > case both of these fields should be set to SBT_MAX from the call to > > cpu_initclocks_bsp(), or there may be callouts scheduled in which case 'nextcall' > > and 'nextcallopt' will reflect the time that those callouts are already > > scheduled for and this overwrites that). > > I see there are some callouts scheduled by SYSINITs, before the first > configtimer(), like NFS_TIMERINIT in nfs_init(). These are setup using > "dummy_timecounter" which means any nextcall values before the first > configtimer should be discarded. Hmm, I actually tested early callouts by having callouts scheduled for 1, 4, and 8 seconds right after callouts were initialized and they worked correctly (albeit using lapic timer as the eventtimer and TSC as the timecounter). By the time they were called though, sbinuptime() was not returning dummy values, but real ones (probably because TSC gets added as a timecounter in SI_SUB_CPU). The patch I used for testing is still in my work branch here: https://github.com/freebsd/freebsd/compare/master...bsdjhb:early_callout So in at least some cases you don't have to discard nextcall during boot. However, if TSC isn't available you might not get a timecounter until later in boot during device probe in which case you would get dummy timecounter, and then the nextcall/nextcallopt values aren't great. Hmmm, I wonder if just bumping nextcall to be 'now' in case it is less than 'now' would be sufficient. I think my previous patch still looped even though it might have set 'next_event' correctly because the 'nextcall' value was still too small. That is: Index: kern_clocksource.c =================================================================== --- kern_clocksource.c (revision 312301) +++ kern_clocksource.c (working copy) @@ -498,12 +498,18 @@ configtimer(int start) CPU_FOREACH(cpu) { state = DPCPU_ID_PTR(cpu, timerstate); state->now = now; + printf("%s: CPU %d: now %jd nextcall %jd nextcallopt %jd next %jd\n", __func__, cpu, state->nextcall, state->nextcall, next); + if (state->nextcall < now) + state->nextcall = now; #ifndef EARLY_AP_STARTUP if (!smp_started && cpu != CPU_FIRST()) state->nextevent = SBT_MAX; else #endif + if (next < state->nextcall) state->nextevent = next; + else + state->nextevent = state->nextcall; if (periodic) state->nexttick = next; else @@ -511,8 +517,6 @@ configtimer(int start) state->nexthard = next; state->nextstat = next; state->nextprof = next; - state->nextcall = next; - state->nextcallopt = next; hardclock_sync(cpu); } busy = 0; -- John Baldwin