From owner-svn-src-all@freebsd.org Sat Jan 13 12:28:29 2018 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 45065E61EFE; Sat, 13 Jan 2018 12:28:29 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail105.syd.optusnet.com.au (mail105.syd.optusnet.com.au [211.29.132.249]) by mx1.freebsd.org (Postfix) with ESMTP id C2E67838B3; Sat, 13 Jan 2018 12:28:28 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail105.syd.optusnet.com.au (Postfix) with ESMTPS id DCBF4104B035; Sat, 13 Jan 2018 22:56:29 +1100 (AEDT) Date: Sat, 13 Jan 2018 22:56:29 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Warner Losh cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r327767 - in head/sys: conf i386/bios i386/conf i386/isa In-Reply-To: <201801101459.w0AExJWM055025@repo.freebsd.org> Message-ID: <20180113203848.L2336@besplex.bde.org> References: <201801101459.w0AExJWM055025@repo.freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=YbvN30Zf c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=kWmvSgFVWL--hNsEXKIA:9 a=FGfX17d0zrhrSsFC:21 a=CjuIK1q_8ugA:10 X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Jan 2018 12:28:29 -0000 On Wed, 10 Jan 2018, Warner Losh wrote: > Log: > Retire pmtimer driver. Move time fixing into apm driver. Move > Iwasaki-san's copyright over. Remove FIXME code that couldn't possibly > work. Call tc_settime() with our estimate of the delta we've been > alseep (the one we print) to adjust the time. Not sure what to do > about callouts, so keep the small #ifdef in place there. The FIXME code might have worked before timecounters, but never worked in any committed version. You seem to have updated it correctly for timecounters in this commit, but then broke it by removing it in a later commit (see another reply). This commit obfuscates the fix by combining moving of the file with changing it. svn is also very broken -- it doesn't show history of the file under its old name. This is more broken than usual -- since the file was merged into a different file, svn doesn't show history of the file under its new name either. The callout fixup code is more broken. It never compiled in any committed version, since it uses data structures and logic that went away before it was committed. The option for it (APM_FIXUP_CALLTODO) is so broken that it is not a supported option or in LINT, so it is not normally noticed that the code under it never compiled. Long callouts are especially broken. E.g., nanosleep(1 hour) at least used to sleep for 2 hours if the system is suspended for 1 hour during the sleep. I haven't noticed any fixes for this. Callouts mostly use relative timeouts, and relative timeouts just don't work across suspensions. Actually, I have noticed some fixes for nanosleep(). Another bug in it was that it is specified to sleep in the time specified by the clock id CLOCK_REALTIME, but used to sleep in the time specified by the clock id CLOCK_MONOTONIC (the latter should be as close as possible to the former except for leap seconds adjustments, but is quite broken -- it doesn't count suspended time, and non-leap seconds settings of CLOCK_REALTIME also make CLOCK_MONOTONIC useless for determining physical time differences. nanosleep() now sleeps on the correct clock id, and clock_nanosleep() is now implemented. This is complicated, and I haven't checked if the fix works. It looks to simple to work. To work, it needs to do things like adjust timers whenever the timecounter jumps (mainly for resume and leap seconds adjustments). APM_FIXUP_CALLTODO only tries to handle the easier case of resume. > Modified: head/sys/conf/files.i386 > ============================================================================== > --- head/sys/conf/files.i386 Wed Jan 10 14:58:58 2018 (r327766) > +++ head/sys/conf/files.i386 Wed Jan 10 14:59:19 2018 (r327767) > @@ -520,7 +520,6 @@ i386/ibcs2/ibcs2_util.c optional ibcs2 > i386/ibcs2/ibcs2_xenix.c optional ibcs2 > i386/ibcs2/ibcs2_xenix_sysent.c optional ibcs2 > i386/ibcs2/imgact_coff.c optional ibcs2 > -i386/isa/pmtimer.c optional pmtimer > i386/isa/prof_machdep.c optional profiling-routine > i386/linux/imgact_linux.c optional compat_linux > i386/linux/linux_dummy.c optional compat_linux pmtimer used to be needed all systems with acpi except amd64, because the acpi fixup of timecounters was bogusly ifdefed for amd64 only (or was not done by acpi in very old versions?). The pmtimer code for this fixup is still better tha the acpi code, except the FIXME made it unused and the next commit dumbs it down to worse than the acpi code (the acpi code is at least simpler). > Modified: head/sys/i386/bios/apm.c > ============================================================================== > --- head/sys/i386/bios/apm.c Wed Jan 10 14:58:58 2018 (r327766) > +++ head/sys/i386/bios/apm.c Wed Jan 10 14:59:19 2018 (r327767) > Modified: head/sys/i386/bios/apm.c > ============================================================================== > --- head/sys/i386/bios/apm.c Wed Jan 10 14:58:58 2018 (r327766) > +++ head/sys/i386/bios/apm.c Wed Jan 10 14:59:19 2018 (r327767) > @@ -205,7 +232,7 @@ static int > apm_driver_version(int version) > { > struct apm_softc *sc = &apm_softc; > - > + > sc->bios.r.eax = (APM_BIOS << 8) | APM_DRVVERSION; > sc->bios.r.ebx = 0x0; > sc->bios.r.ecx = version; Unrelated style fixes make the important changes harder to see. This patch was to remove trailing whitespace, but some mail program broke it by removing the whitespace in the source lines. [... further mangled patches for whitespace not shown] > @@ -331,12 +358,10 @@ apm_battery_low(void) > static struct apmhook * > apm_add_hook(struct apmhook **list, struct apmhook *ah) > { > - int s; > struct apmhook *p, *prev; > > APM_DPRINT("Add hook \"%s\"\n", ah->ah_name); > > - s = splhigh(); > if (ah == NULL) > panic("illegal apm_hook!"); > prev = NULL; > @@ -351,30 +376,25 @@ apm_add_hook(struct apmhook **list, struct apmhook *ah > ah->ah_next = prev->ah_next; > prev->ah_next = ah; > } > - splx(s); > return ah; > } This commit also removes spl*() with no mention of this in the log message. In FreeBSD-4 where spl's were not null, splhigh() was probably wrong, but it was much better than the splsoftclock() used in pmtimer.c. Hard clock interrupts should be disabled, and my fixes for this change softclock to statclock + XXX. It happens that splstatclock() == splclock() == splhigh(), so these splhigh()s were good enough. Now the locking is obscure. I think it is just Giant, and that is not enough. The spl's should not be removed until the locking is done properly. The splsoftclock()'s were removed even earlier in pmtimer.c, without doing any proper locking of course. > static void > apm_del_hook(struct apmhook **list, struct apmhook *ah) > { > - int s; > struct apmhook *p, *prev; > > - s = splhigh(); > prev = NULL; > for (p = *list; p != NULL; prev = p, p = p->ah_next) > if (p == ah) > goto deleteit; > panic("Tried to delete unregistered apm_hook."); > - goto nosuchnode; > + return; > deleteit: > if (prev != NULL) > prev->ah_next = p->ah_next; > else > *list = p->ah_next; > -nosuchnode: > - splx(s); > } Locking for device removal is especially necessary and badly done. New-bus forces lots of Giant locking which is probably not enough. > @@ -1047,6 +1067,53 @@ apm_processevent(void) > } while (apm_event != PMEV_NOEVENT); > } > > +static struct timeval suspend_time; > +static struct timeval diff_time; > + > +static int > +apm_rtc_suspend(void *arg __unused) > +{ > + > + microtime(&diff_time); > + inittodr(0); > + microtime(&suspend_time); > + timevalsub(&diff_time, &suspend_time); > + return (0); > +} This calculates the difference for use later, using a hackish method: - this used to be bogusly locked by - microtime(&diff_time) gives the current time in a good way - start hacking with inittodr(0). inittodr() reads the time from the RTC and adds the nasty global timezone offset to it. Without the latter, the new time set by inittodr() would be off by up to 12 hours. With the latter, it might still be off by 1 hour for a DST adjustment, depending on races between this and settimeofday(2) and adjkerntz(8) and certain racy sysctls, and it it is usually off by many seconds due to different drift of the RTC and the current time. The purpose of this function is to determine the drift at suspend time so as to add it back at resume time. This also handles timezone offsets even if inittodr() doesn't add them, and might reduce races for adjusting the timezone offset for DST. > + > +static int > +apm_rtc_resume(void *arg __unused) > +{ > + u_int second, minute, hour; > + struct timeval resume_time, tmp_time; > + struct timespec ts; > + > + /* modified for adjkerntz */ Cryptic comment from old code. This is related to the timezone offset, but I don't see how adjkerntz benefits in practice from old modifications made here. It only obviously benefits in theory from smaller races. > + timer_restore(); /* restore the all timers */ Old banal comment. timer_restore() function only exists on i386, and only restores 2 isa timers (i8254 and rtc), so moving its code out of isa is a layering violation. > + inittodr(0); /* adjust time to RTC */ Banal comment. Also wrong when the RTC time is local. > + microtime(&resume_time); > + getmicrotime(&tmp_time); The previous line is bad old code. We want to do arithmetic using resume_time and should copy it to tmp_time. Instead, we use the getmicrotime() mistake to fetch a truncated copy of a time slightly later than resume_time. > + timevaladd(&tmp_time, &diff_time); diff_time is the difference at suspend time and this adds it to resume_time (modulo the truncation). This gives correct-as-possible resume_time. > + /* Calculate the delta time suspended */ > + timevalsub(&resume_time, &suspend_time); > + > + second = ts.tv_sec = resume_time.tv_sec; > + ts.tv_nsec = 0; > + tc_setclock(&ts); This used to be better, except tc_setclock() was 'time = tmp_time' under FIXME. The calculation of the time suspended is convolved with fixing up the timecounter, and there is a comment about the relatively simple calculation of the time suspended and none about the convulutions or the more delicate fixup. > + > +#ifdef PMTIMER_FIXUP_CALLTODO > + /* Fixup the calltodo list with the delta time. */ > + adjust_timeout_calltodo(&resume_time); > +#endif /* PMTIMER_FIXUP_CALLTODO */ This has several style bugs together with its option being unusable. It further abuses resume_time. resume_time was initially actually the resume time before the fixup. The fixed up resume time is now in ts. We calculated this using tmp_time to keep the initial resume_time alive for using in another caclulation. Then the other calculation abuses resume_time for the delta time needed by the uncallable fixup. The comment unobfuscates this a but by describing the abused resume_time as the delta time. > + hour = second / 3600; > + second %= 3600; > + minute = second / 60; > + second %= 60; > + log(LOG_NOTICE, "wakeup from sleeping state (slept %02d:%02d:%02d)\n", > + hour, minute, second); > + return (0); With the FIXME version, almost everything was unused except for printing this msssage. This message is useful for debugging, but acpi never bothered doing it. > +} In the version with the FIXME, the careful fixups done by pmtimer reduce to just inittodr(0) plus printing the delta. In this version, the timecounter fixup actually works so the careful setup for this is undone, but things are convoluted with the unreachale callout fixup and printing the delta. acpi just does 2 timecounter warmups followed by inittodr(time_second + scp->acpi_sleep_delay). Here the arg is differently wrong. An arg of 0 says that we have no idea of the time and the raw RTC time should be used. time_second is the time in seconds before the system is fully resumed. This is probably the time in seconds when the system was suspended (since interrupts didn't run during suspension and if clock interrupts occurred before here, they would do nothing good. acpi_sleep_delay is just a forced delay during suspension. If it works at all, then updates of time_second have halted before the delay, and the arg is our best guess of the actual suspension time. IIRC, the arg is only used in the unlikely event that reading the RTC fails. pmtimer doesn't bother passing a nonzero arg. Bugs here are masked by other bugs: - at least x86 RTC-reading code is sloppy and only reads the RTC to the nearest second. So we expect an error of a second or 2 after suspend/ resume. This is too large for ntpd to fix well, so the best that can happen (without a resume script that repeats boot time ntpd initialization) is that ntpd is running and quickly notices that the offset is too large and steps the clock to fix it. Backwards steps are very bad, so the resume code should try to set to an earlier time. - when nptd is running, the current time may have been written to the RTC quite often. This probably keeps the timecounter fixup less than a few seconds although x86 RTC-writing code is sloppy and gives another source of errors. - if an error related to the timezone offset occurs or if the timecounter fixup is not applied, then the enormous error from this might be a feature. Hopefully ntpd is running and recovers faster if the error is enormous. (I think it actually doesn't do this right. It has a limit like 900 seconds and can step more than that to act a but like ntpdate when invoked with certain options, but you normally want it to do only 1 large step. Restarting it with these options after every resume is best.) Bruce