Date: Fri, 22 Mar 2019 00:53:26 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: trasz@freebsd.org
Cc: bugs@freebsd.org
Subject: Re: [Bug 236702] KERN_UPTIME not updated after resume
Message-ID: <20190321233434.Y1924@besplex.bde.org>
In-Reply-To: <bug-236702-227@https.bugs.freebsd.org/bugzilla/>
References: <bug-236702-227@https.bugs.freebsd.org/bugzilla/>

On Thu, 21 Mar 2019 bugzilla-noreply@freebsd.org wrote:

> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236702
>
> Bug ID: 236702
> Summary: KERN_UPTIME not updated after resume
> ...
> Reporter: trasz@FreeBSD.org
>
> It seems the time returned by clock_gettime(CLOCK_UPTIME) doesn't get updated
> on resume - in other words, when you reboot, then suspend to ram, then wait an
> hour and resume, the uptime, as shown by uptime(1), won't account for that
> hour. This is different from eg OSX.

This is a longstanding bug.

CLOCK_UPTIME shouldn't exist (it is one of 3 unportable aliases for the
POSIX CLOCK_MONOTONIC clock id). It is actually documented to be broken in
clock_gettime(2) ("it ... increments monotonically in SI seconds __while
the machine is running__") and is only an alias since CLOCK_MONOTONIC is
broken enough to give the same misbehaviour, so each reduces to an alias of
the other.

CLOCK_MONOTONIC is not documented to be broken in clock_gettime(2), except
that the documentation is fuzzy enough to allow anything ("it Increments in
SI seconds"). Of course, that is impossible; it can only increment as if in
some approximation to SI seconds, and in practice the approximation is very
variable (it has a large error while the machine is not running, and large
steps to catch up, and normally tiny frequency adjustments while the machine
is running, and sometimes not so tiny frequency adjustments, and tiny and
not so tiny steps...).

POSIX specifies a little more, but not enough about accuracy ("...
represents __the__ amount of time (in seconds and nanoseconds) since an
unspecified point in the past... This point does not change after system
start-up time"). Of course __the__ amount is too precise to specify (it
depends on the frame of reference, and it is unclear what this is for a
timer a significant fraction of 1 light nanosecond away from the memory used
to store the result).
In practice, errors of more like 1 millisecond are common due to long-term
drift at variable rates of 1-10 ppm.

Anyway, it is clear that CLOCK_MONOTONIC is not allowed to stop while the
machine is not running. It must act as if it was never stopped, by being
stepped forwards on resume before it is read. Otherwise, it acts as if the
reference point moved.

Stepping of CLOCK_REALTIME has a related bug. This is implemented by moving
the reference point. The reference point is kern.boottime. Moving it
maintains the bogus invariant that realtime == uptime + boottime. Since
uptime is not updated on resume but realtime usually is updated on resume,
boottime must be updated bogusly on resume to maintain this invariant.

Moving boottime is correct in one case: when the initial realtime is wrong,
which it often is because the hardware clock used to set the initial time is
inaccurate. Then when ntpd or the user steps the real time to fix this, the
boot time should be stepped too. After that, the boot time is as correct as
possible, so it should not be changed. Micro-adjustments by ntpd don't move
it, but steps by ntpd or the user do move it. The steps may be of a few
seconds to fix up for drift, or many hours to fix up on resume.

Errors in the hardware clock include:
- it being only accurate to the nearest second. This gives an error of up
  to 0.5 seconds at best. ntpd should consider 0.5 seconds to be a large
  error and step to catch up.
- drift of the clock while the machine was powered off. 1 second/day is
  typical.
- the hardware clock being on local time gives an error of the timezone
  difference. This is normally fixed up by adjkerntz stepping the real time.

Timeouts are also slightly broken across suspend-resume. They are relative,
so they don't count time spent suspended. This is one way to prevent a
thundering herd of timeouts occurring on resume. But I think the thundering
herd should occur. Just rate limit it a bit.

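The two behaviours discussed above can be compared in a toy model. This is
only a minimal sketch under stated assumptions: the struct and the toy_*
functions are hypothetical names for illustration, not kernel identifiers,
and timeouts are assumed to be stored as deadlines on the uptime scale. It
shows that stepping the real time while uptime stands still forces boottime
to absorb the whole suspend interval, whereas stepping uptime on resume
leaves boottime alone and makes already-due timeouts expire at once (the
thundering herd).

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_SEC INT64_C(1000000000)

/* Hypothetical model state, all in nanoseconds. */
struct toy_clock {
	int64_t uptime_ns;   /* what CLOCK_UPTIME / CLOCK_MONOTONIC reports */
	int64_t boottime_ns; /* what kern.boottime reports */
};

/* The invariant from the text: realtime == uptime + boottime. */
static int64_t
toy_realtime(const struct toy_clock *c)
{
	return (c->uptime_ns + c->boottime_ns);
}

/* Time passing while the machine is running: only uptime advances. */
static void
toy_run(struct toy_clock *c, int64_t ns)
{
	assert(ns >= 0);
	c->uptime_ns += ns;
}

/*
 * Behaviour described in the text: the real time is stepped, uptime is
 * left alone, so boottime moves to keep the invariant.
 */
static void
toy_step_realtime(struct toy_clock *c, int64_t new_realtime_ns)
{
	c->boottime_ns = new_realtime_ns - c->uptime_ns;
}

/*
 * Behaviour argued for in the text: on resume, step uptime forwards by
 * the time spent suspended; boottime is not touched.
 */
static void
toy_resume_stepping_uptime(struct toy_clock *c, int64_t suspended_ns)
{
	assert(suspended_ns >= 0);
	c->uptime_ns += suspended_ns;
}

/* Time left on a timeout whose deadline is kept on the uptime scale. */
static int64_t
toy_timeout_remaining(const struct toy_clock *c, int64_t deadline_uptime_ns)
{
	int64_t left = deadline_uptime_ns - c->uptime_ns;

	return (left > 0 ? left : 0);
}
```

For example, start with boottime at 1000 s and run for 600 s, then suspend
for an hour. With toy_step_realtime() uptime stays at 600 s and boottime
moves to 4600 s; with toy_resume_stepping_uptime() uptime becomes 4200 s and
boottime stays at 1000 s. Both report the same real time, but a timeout due
at 700 s of uptime still has 100 s to run in the first case and is already
due in the second.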
FreeBSD still has the bogus unsupported option APM_FIXUP_CALLTODO related to
timeouts across suspend-resume. This stopped compiling slightly before it
was committed. It has rotted for 21 years since then. It is supposed to
give the thundering herd. It doesn't attempt to give rate limiting.

Fixing up the real time on resume doesn't work very well either. On x86,
the best that can happen is approximately:
- use the hardware clock as at boot time but without the timezone difference
  error, and usually with a smaller error from drift while not running, to
  step the real time. The error is typically 1 second.
- ntpd should be restarted in /etc/rc.resume to finish the stepping, but I've
  never seen this done. It should consider the 1 second large and do a step
  and not slew.
- if ntpd is not used, then at least use ntpdate (or timed, or a
  wristwatch :-)) to check and fix up the clock.
- if ntpd is running, then it will not be as aggressive as when it started
  up. If you are lucky, then the error will be large enough for ntpd to
  step.
- while the real time is being fixed up, it is more unreliable than usual.
  Resume usually takes several seconds and almost everything should wait for
  it to complete, but the system doesn't take much care with this AFAIK.

Bruce
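
To put a number on why a 1 second error after resume is better stepped than
slewed, here is a minimal sketch of the arithmetic. The two constants are
assumptions taken from ntpd's documented defaults (a 128 ms step threshold
and a 500 ppm maximum slew rate), not from this message, and the function
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed ntpd defaults; neither value is stated in the message itself. */
#define NTPD_STEP_THRESHOLD_US	INT64_C(128000)	/* 128 ms */
#define NTPD_MAX_SLEW_PPM	INT64_C(500)	/* 500 us corrected per second */

enum fixup { FIXUP_SLEW, FIXUP_STEP };

static int64_t
abs64(int64_t v)
{
	return (v < 0 ? -v : v);
}

/* Step when the offset exceeds the threshold, otherwise slew. */
static enum fixup
choose_fixup(int64_t offset_us, int64_t threshold_us)
{
	return (abs64(offset_us) > threshold_us ? FIXUP_STEP : FIXUP_SLEW);
}

/*
 * Seconds of running time needed to slew away an offset at a given rate.
 * 1 ppm corrects 1 microsecond per second, so this is offset / rate,
 * rounded up.
 */
static int64_t
slew_seconds(int64_t offset_us, int64_t slew_ppm)
{
	assert(slew_ppm > 0);
	return ((abs64(offset_us) + slew_ppm - 1) / slew_ppm);
}
```

Under these assumptions, choose_fixup(1000000, NTPD_STEP_THRESHOLD_US) picks
a step, and slew_seconds(1000000, NTPD_MAX_SLEW_PPM) shows that slewing the
same 1 second away would take 2000 seconds, during which the real time stays
wrong.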