Date: Thu, 8 May 2014 09:37:37 -0600
From: Alan Somers <asomers@freebsd.org>
To: Bruce Evans <brde@optusnet.com.au>
Cc: "svn-src-head@freebsd.org" <svn-src-head@freebsd.org>, "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>, "src-committers@freebsd.org" <src-committers@freebsd.org>, Alan Somers <asomers@freebsd.org>, Jilles Tjoelker <jilles@stack.nl>
Subject: Re: svn commit: r265472 - head/bin/dd
Message-ID: <CAOtMX2iMCXqfXCi=a32m2f4aubeDTeBhYwq%2B9eZst64J6QzoEg@mail.gmail.com>
In-Reply-To: <20140508111443.S1000@besplex.bde.org>
References: <201405062206.s46M6dxW060155@svn.freebsd.org>
 <20140507113345.B923@besplex.bde.org>
 <CAOtMX2h_%2B1G18Nv5JvDE0H7_TZ96p81JotOwhq1Jm-dOOeahPw@mail.gmail.com>
 <20140507202623.GA14233@stack.nl>
 <20140508111443.S1000@besplex.bde.org>

On Wed, May 7, 2014 at 9:39 PM, Bruce Evans <brde@optusnet.com.au> wrote:
> On Wed, 7 May 2014, Jilles Tjoelker wrote:
>
>> On Wed, May 07, 2014 at 12:10:31PM -0600, Alan Somers wrote:
>>>
>>> On Tue, May 6, 2014 at 9:47 PM, Bruce Evans <brde@optusnet.com.au> wrote:
>>>>
>>>> On Tue, 6 May 2014, Alan Somers wrote:
>>>>>
>>>>> ...
>>>>>
>>>>> The solution is to use clock_gettime(2) with CLOCK_MONOTONIC_PRECISE as the
>>>>> clock_id. That clock advances steadily, regardless of changes to the system
>>>>> clock.
>>>>> ...
>>>>> +#include <sysexits.h>
>>
>>>> Use of <sysexits.h> is a style bug. It is not used in BSD or KNF code
>>>> like dd used to be.
>>
>>> sysexits.h is recommended by the err(3) man page. Is that
>>> recommendation meant to apply selectively, or is it obsolete, or is
>>> some sort of edit war being waged by man page authors?
>
> Bug in the err(3) man page. Sort of an edit war. Just 2 FreeBSD
> committers liked sysexits and used it in their code and added a
> recommendation to use it in some man pages. But it has negative
> advantages, and normal BSD programs don't use it. It has been
> edited in and out of style(9).
>
>> The recommendation for <sysexits.h> was incompletely removed, yes.
>
> It is still in err(3), and sysexits(3) still justifies itself by
> pointing to partly-removed words in style(9).
>
> err(3) is the last place that should recommend using sysexits. err()
> gives a nice way of encouraging text descriptions for all exits.
> With text descriptions, there is almost no need for cryptic numeric
> exit codes. Only sets of programs that communicate a little status
> in the exit code should use sysexits (or perhaps their own exit
> codes, or certain standard exit codes like 126 or 127 for xargs and
> some other utilities). Some of the uses of the standard exit codes
> are even.
> I don't know of any utility except possibly sendmail that
> documents that it uses sysexits enough for its exit codes to be
> useful for more than a binary success/fail decision. Certainly not
> dd after these changes. If its use of sysexits were documented,
> then the documentation would say "dd uses sysexits to report 3 errors
> that can't happen; otherwise, it uses the normal 2-state exit codes
> (there is a macro for them. It expands to the concise but
> grammatically challenged "exits 0 on success, and >0 if an error
> occurs". Here ">0" standardises the usual sloppiness of not
> distinguishing codes between 1 and 127).
>
> sysexits(3) now says:
>
> % DESCRIPTION
> %      According to style(9), it is not a good practice to call exit(3) with
> %      arbitrary values to indicate a failure condition when ending a program.
> %      Instead, the pre-defined exit codes from sysexits should be used, so the
> %      caller of the process can get a rough estimation about the failure class
> %      without looking up the source code.
>
> but style(9) now says:
>
> %      Exits should be 0 on success, or 1 on failure.
> %
> %              exit(0);        /*
> %                               * Avoid obvious comments such as
> %                               * "Exit 0 on success."
> %                               */
> %      }
>
> The latter is not what I asked for either. In previous discussion
> of this, I think we agreed to at least mention EXIT_SUCCESS and
> EXIT_FAILURE, and possibly deprecate sysexits.
>
> This is a weakened version of the 4.4BSD style rules, which say:
>
> %      /*
> %       * Exits should be 0 on success, and 1 on failure. Don't denote
> %       * all the possible exit points, using the integers 1 through 300.
> %       */
> %      exit(0);    /* Avoid obvious comments such as "Exit 0 on success." */
>
> The main point of this is to disallow cryptic undocumented exit statuses.
> Recommending sysexits almost reverses this. It gives cryptic undocumented
> error statuses that are not even easy to decrypt for programs.
> Programs can look up sysexits, but without documentation there is no
> guarantee that the encoding is according to sysexits. Actually
> documenting use of sysexits would make it even more painful to use.
>
>>> [snip]
>>>>>
>>>>> -       st.start = tv.tv_sec + tv.tv_usec * 1e-6;
>>>>> +       if (clock_gettime(CLOCK_MONOTONIC_PRECISE, &tv))
>>>>> +               err(EX_OSERR, "clock_gettime");
>>>
>>> [snip]
>>>>>
>>>>> +       st.start = tv.tv_sec + tv.tv_nsec * 1.0e-9;
>>>>> }
>>
>> The floating point addition starts losing precision after 8388608
>> seconds (slightly more than 97 days, a plausible uptime for a server).
>> It is better to subtract the timespecs to avoid this issue.
>
> No, it is better to use floating point for results that only need to
> be approximate. Especially when the inputs are approximate and the
> final approximation doesn't need to be very accurate.
>
> Floating point is good for all timespec and timeval calculations,
> except in the kernel where it is unavailable. timespecs and timevals
> are mostly used for timeouts, and the kernel isn't very careful about
> exact timeouts. Short timeouts have inherent large inaccuracy due
> to interrupt granularity and latency. Long timeouts can be relatively
> more accurate, but only if the kernel is careful about them. It is
> only careful in some places.

No, Jilles is right. The problem isn't that dd uses doubles; it's
that dd converts longs to doubles _before_ subtracting the values.
That causes rounding if the tv_sec values are large. If the
implementation of CLOCK_MONOTONIC ever changed to measure time since
the Epoch, or something similar, then the rounding error would be
extremely significant. Better to subtract the timespecs, then convert
to double.

>
>> With microseconds, the precision of a double is sufficient for 272
>> years, so that calculation is probably acceptable.
>
> dd actually uses double, but float would be plenty. systat uses a
> mixture of float and double.
> double throughout is better because
> using the smaller type float tends to give negative optimizations.
> devstat uses long double. That's really silly for statistics.
> On some arches, it is no different from double (so nothing can
> depend on extra precision from it). On sparc64, it is a negative
> optimization by a factor of hundreds.
>
>>> [snip]
>>> Even if nanosecond resolution isn't useful, monotonicity is. Nobody
>>> should be using a nonmonotonic clock just to measure durations. I
>>> started an audit of all of FreeBSD to look for other programs that use
>>> gettimeofday to measure durations. I haven't finished, but I've
>>> already found a lot, including xz, ping, hastd, fetch, systat, powerd,
>>> and others. I don't have time to fix them, though. Would you be
>>> interested, or do you know anyone else who would?
>>
>> I have a local patch for time(1).
>>
>> Whether the monotonic clock is right also depends on how long the
>> durations typically are. For very long durations, users might refer to
>> wall clocks and CLOCK_REALTIME may be more appropriate.
>
> Yes, monotonic clocks are often best, but there are many bugs in this
> area. The most relevant one is perhaps that CLOCK_MONOTONIC is only
> monotonic. It is unclear if standards require it to have any relation
> to actual time. In practice in FreeBSD, it gives the actual time that
> the system is up and is not suspended. It is thus especially unusable
> for setting alarm clocks in the morning since suspension overnight is
> more likely than at other times. Alarm clocks need to use real time
> anyway. nanosleep() is almost unusable for setting alarm clocks due
> to this problem, its bugs, and other reasons:
> - nanosleep() is specified to sleep on real time, but in FreeBSD it sleeps
>   on monotonic time. clock_nanosleep() is specified to sleep on a
>   specified clock id, but is not implemented in FreeBSD.
> - I don't see any way to use the broken nanosleep() for setting realtime
>   alarms except to take short sleeps and check the real time on waking
>   up. Kernel timer code does things like this internally, but not
>   very accurately, and for nanosleep() its sleeps are not short enough
>   to work and it checks the wrong clock id on waking up.
> - nanosleep() takes a relative time, so even a nanosleep() that sleeps
>   on the correct clock id would be hard to use with an overnight timeout.
>   You would have to know about daylight savings adjustments and either
>   compensate for them up front or wake up an hour or 2 early to check
>   for a switch.
> - there are some POSIX realtime functions that support sleeping on an
>   arbitrary clock id, and also support sleeping until an absolute
>   time. These are supported in FreeBSD. I haven't actually used them.
>   They are sloppy in different ways than older FreeBSD timer code (and
>   not as up to date with the change to sbintime_t). They seem to be
>   unaware of daylight savings and not use short enough sleeps to work
>   across switches.
> - nanosleep() is specified to sleep in realtime. Actually more
>   specifically, to use CLOCK_REALTIME for its clock id. But its interval
>   is relative, so it is unclear even what this means.
>
> Taking averages over days has similar problems. They should probably
> use the monotonic system up time, not the system up time less the
> system suspension time. Due to the bug of not counting suspension
> time, using the real time clock is probably better. It may jump by
> up to about 1 hour across daylight savings switches, but that won't
> take it backwards, but the monotonic clock may fail to advance by
> much more than 1 hour.
>
> POSIX doesn't actually allow the monotonic clock to fail to advance across
> suspensions or for other reasons. From an old draft:
>
> % 6679 MON If the Monotonic Clock option is supported, all implementations shall support a clock_id of
> % 6680     CLOCK_MONOTONIC defined in <time.h>.
> %          This clock represents the monotonic clock for the
> % 6681     system. For this clock, the value returned by clock_gettime( ) represents the amount of time (in
> % 6682     seconds and nanoseconds) since an unspecified point in the past (for example, system start-up
> % 6683     time, or the Epoch). This point does not change after system start-up time. The value of the
>
> Here "amount of time" is fuzzy, but clearly it should be in physical time
> and as accurate as possible.
>
> FreeBSD's implementation also breaks the "unspecified point in the past"
> by frobbing it to implement the real time. It is only unspecified in
> POSIX. In FreeBSD, you can see it using sysctl kern.boottime and
> indirectly using uptime(1). uptime (that is, w), has been changed to
> use CLOCK_UPTIME, and that gives some of the long-term timing bugs
> mentioned above. Suppose for example that the system booted at 1:00 am
> on a certain day. The boot time is whatever it is, and shouldn't
> change. It serves as the "unspecified point in the past". It is not
> affected by DST switches or by micro-adjustments using adjtime() or ntpd.
> However, suppose the clock drifts by 1 second and the real time is fixed
> up by stepping the clock. The real time becomes correct, but the monotonic
> time remains off by 1 second. This is implemented by stepping the boot
> time to 1:01 am or 0:59 am. The boot time becomes wrong too. CLOCK_UPTIME
> is the same as CLOCK_MONOTONIC, so it is also off by 1 second. This can
> be seen in uptime(1) output. The errors may accumulate.
>
> Of course, the monotonic clock cannot be stepped backwards. Stepping
> it forward wouldn't break it much more than leaving it off by 1 second
> forever. However, the only reasonably correct implementation is to
> micro-adjust it until it catches up with any steps in the realtime
> clock. Only do this for small adjustments. After suspension, it
> should be stepped forwards by a large amount.
>
> I think bad things happen to the boot time after suspension too. The
> real time must be stepped forward by a large amount, and doing that
> steps the boot time by a large amount.
>
> Similarly for booting if the realtime is initially local. It is
> stepped to make it UTC. This is confusing. It happens on my
> system, and sysctl kern.boottime shows the boot time
> apparently-correctly. But it is correct as a local time. The boot
> time is in UTC. sysctl doesn't translate to local time, so the
> apparently-correct time is actually off by the step (10 hours).
>
> Bugs in the boot time can be fixed more easily than by micro-adjusting
> the monotonic clock. Just keep the initial boot time (except adjust it
> when it was initially local instead of UTC) and frob the real time
> using a different variable. Export both variables so that applications
> can compensate for the frobbing at the cost of some complexity. E.g.,
> in uptime(1):
>
>         clock_gettime(CLOCK_UPTIME, &ts);
>         /*
>          * Actually, do the compensations in the kernel for CLOCK_UPTIME.
>          * It doesn't need to be monotonic. But suppose it is the same
>          * as the unfixed CLOCK_MONOTONIC and compensate here.
>          *
>          * Also fix the bogus variable name 'tp'.
>          */
>         sysctl_mumble(&boottime);
>         sysctl_mumble(&frobbed_boottime);
>         uptime = ts.tv_sec +- (boottime.tv_sec - frobbed_boottime.tv_sec);
>
> Note that the compensation may go backwards, so this method doesn't work
> in general for monotonic times. However, it can be used if the compensation
> is non-negative or relatively small negative. dd could use this method.
> It already has to fix up for zero times and still has parts of the old
> method that fixes up for negative times. Note that the compensation may
> be very large across a suspension. You might start dd, SIGSTOP it, suspend
> the system and restart everything a day later. The compensation would be
> about 1 day.
> The average from this wouldn't be very useful, but it would
> be the same as if dd was stopped for a day but the system was not suspended.

Wouldn't it be simpler just for the kernel to adjust CLOCK_MONOTONIC to add
suspension time?

-Alan

>
> Bruce
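
To illustrate the point above about subtracting the timespecs before converting to double, here is a minimal sketch. It is not the code committed to dd; the helper names ts_sub, elapsed_subtract_first and elapsed_convert_first are hypothetical, and the sketch assumes normalized timespecs with end >= start.

```c
#include <time.h>

/*
 * Hypothetical helper (not from dd): return end - start as a timespec,
 * borrowing one second when the nanosecond field would go negative.
 * Assumes both inputs are normalized and that end >= start.
 */
static struct timespec
ts_sub(struct timespec end, struct timespec start)
{
	struct timespec d;

	d.tv_sec = end.tv_sec - start.tv_sec;
	d.tv_nsec = end.tv_nsec - start.tv_nsec;
	if (d.tv_nsec < 0) {
		d.tv_sec--;
		d.tv_nsec += 1000000000L;
	}
	return (d);
}

/*
 * Subtract first, then convert.  The difference is small, so the
 * conversion to double loses very little, however large tv_sec is.
 */
static double
elapsed_subtract_first(struct timespec start, struct timespec end)
{
	struct timespec d = ts_sub(end, start);

	return ((double)d.tv_sec + (double)d.tv_nsec * 1e-9);
}

/*
 * Convert first, then subtract.  Each operand is rounded to the
 * precision available at its own magnitude before the subtraction,
 * so nanoseconds can be lost when tv_sec is large.
 */
static double
elapsed_convert_first(struct timespec start, struct timespec end)
{
	double s = (double)start.tv_sec + (double)start.tv_nsec * 1e-9;
	double e = (double)end.tv_sec + (double)end.tv_nsec * 1e-9;

	return (e - s);
}
```

With IEEE 754 doubles, two readings one nanosecond apart at tv_sec around 1.4e9 (roughly the time since the Epoch when this thread was written) should both round to the same value in the convert-first version, so the nanosecond is lost, while the subtract-first version keeps it. In a program like dd the two timespecs would come from clock_gettime(2) calls at the start and end of the transfer.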
