From owner-freebsd-hackers@freebsd.org Fri Mar 16 19:41:49 2018 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E8CFBF43883 for ; Fri, 16 Mar 2018 19:41:48 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail104.syd.optusnet.com.au (mail104.syd.optusnet.com.au [211.29.132.246]) by mx1.freebsd.org (Postfix) with ESMTP id 31FFD71213; Fri, 16 Mar 2018 19:41:47 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail104.syd.optusnet.com.au (Postfix) with ESMTPS id 5F82E42D588; Sat, 17 Mar 2018 06:41:45 +1100 (AEDT) Date: Sat, 17 Mar 2018 06:41:45 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Alan Somers cc: FreeBSD Hackers , bde@freebsd.org, Poul-Henning Kamp Subject: Re: Is it time to expose timespecsub and friends to userland? In-Reply-To: Message-ID: <20180317042931.N23257@besplex.bde.org> References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=VJytp5HX c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=4Y-zwQCBEnCh_dXso9oA:9 a=Ncd7HRGRlZGPC89y:21 a=Od9-zSrT0hJy28m1:21 a=CjuIK1q_8ugA:10 X-Mailman-Approved-At: Fri, 16 Mar 2018 20:43:55 +0000 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 16 Mar 2018 19:41:49 -0000 On Fri, 16 Mar 2018, Alan Somers wrote: > There are at least 19 system calls that have timeout arguments specified as > struct timespec [1]. "struct stat" describes file timestamps as timespecs > as well [2]. setsockopt(2), sched_rr_get_interval(2), > geom_stats_snapshot_timestamp(3) and pmclog_read(3) use timespecs for other > purposes. The sysctl node kern.crypto_stats exposes timespecs. sudo > records timespecs in its timestamp files. Who know how many other ports do > something similar; I'm not going to grep them all. > > And yet, FreeBSD provides no way to manipulate this structure. Every > program that uses them has to roll its own version of pretty much the same > functions. NetBSD does [3], and has for a long time. In fact, NetBSD's > timespecsub is old enough to drink [4]. But in FreeBSD, those functions > languish inside of an "#ifdef KERNEL". phk added them in r35029 with the > comment "XXX: These may change!". But it's been nearly 20 years, so I > don't think they're going to change any more. > > Shall we finally expose them to userland and add a man page? Let the > bikeshed begin! If the flame isn't too bad, I'll do it. These functions fill a much needed gap. POSIX provides the correct number of APIs for manipulating the simple timeval and timespec structs (none). glibc also doesn't support these functions or any others for manipulating these structs, at least in old versions. The easiest way to manipulate timevals in userland was to convert them microseconds in floating point. 53-bit doubles work up to 4.5G seconds = 142 years without losing precision. Unfortunately, this doesn't work for timespecs, since 0.142 years is not enough. However integer types with at least 64 bits have been standard since C99. uint64_t or uintmax_t holds at least 184G seconds = 584 years. sys/time.h is now polluted with other conversion functions: - bintime*() (under __BSD_VISIBLE) - field names like sec and frac in the application namespace. The older timeval and timespec structs are missing the bug of not having prefixes for field names. - sbintime*() and SBT*. Also under __BSD_VISIBLE, but no user APIs use these AFAIK, so they should be under _KERNEL. - sb* and bt* for sbintimes. bt is a better prefix than bintime, but is more likely to conflict with an application name, especially since it is missing an underscore. - us*, ns*, ms* - timespec2bintime() and similar bad names with timespec and bintime spelled out but "to" punned to "2". timespec* might actually the in the namespace reserved for this file, but it is too verbose. - OTOH, there is tstosbt(). I talked someone into not punning "2" in this. But it is hard to read without some underscores, and has larger namespace problems from being shorter. The latter is not a problem if it is kernel- only. Older timespec*() macros are still under _KERNEL, as you noticed. Inlines are correctly not used for the older APIs. This reduces their pollution. Even older timer*() macros are also under _KERNEL. These are marked NetBSD/OpenBSD compatible. These have even worse names -- they are for timevals, not for generic or specific 'timer' structs. There are also problems with arg order and the number of args. timeradd() takes 3 args (tvp, uvp, vvp). timespecadd() takes 2 args in the opposite order (vvp, uvp). It is easier for me to just to the addition than remember the order. Namespace pollution continues after the timespec and timeval macros: mainly: - struct clockinfo; all of its fields are missing prefixes. This is not even under __BSD_VISIBLE. Now looking at only !__BSD_VISIBLE && !_KERNEL code: - earlier there is some nested pollution, struct timezone and its fields which at least have tz_ prefixes, and DST*. - the NetBSD/OpenBSD mistakes are actually under #ifndef _KERNEL, so they pollute userland but not the kernel. The kernel mostly uses non-macro non-inline functions for these. - lots of pollution from the nested include of . The contents of sys/time.h and time.h are still soft of backwards. - explicity pollution from the nested include of . IIRC, POSIX was broken to allow this. The earlier include of sys/types.h already included this nested in the __BSD_VISIBLE case. Example of converting unportable timer*() to floating point: kdump/kdump.c: XX if (timestamp & TIMESTAMP_ELAPSED) { XX if (prevtime_e.tv_sec == 0) XX prevtime_e = kth->ktr_time; XX timersub(&kth->ktr_time, &prevtime_e, &temp); XX printf("%jd.%06ld ", (intmax_t)temp.tv_sec, XX temp.tv_usec); Replace the last 3 lines by: printf("%.6f ", kth->ktr_time.tv_sec - prevtime_e.tv_sec + 1e-6 * (kth->ktr_time.tv_usec - prevtime_e.tv_usec); If this isn't accurate enough, then the calculation in intmax_t arithmetic is needed and messier: intmax_t nsec; nsec = (kth->ktr_time.tv_sec - prevtime_e.tv_sec) * 1000000 + (kth->ktr_time.tv_usec - prevtime_e.tv_usec); printf("%jd.%06jd ", nsec / 1000000, usec % 1000000); (Actually even messier if nsec can be < 0 -- see below). XX } XX if (timestamp & TIMESTAMP_RELATIVE) { XX if (prevtime.tv_sec == 0) XX prevtime = kth->ktr_time; XX if (timercmp(&kth->ktr_time, &prevtime, <)) { XX timersub(&prevtime, &kth->ktr_time, &temp); XX sign = "-"; XX } else { XX timersub(&kth->ktr_time, &prevtime, &temp); XX sign = ""; XX } XX prevtime = kth->ktr_time; XX printf("%s%jd.%06ld ", sign, (intmax_t)temp.tv_sec, XX temp.tv_usec); These complications are to avoid misprinting for example a timeval of { tv_sec = -2; tv_nsec = 999999; } as -2.999999. It is actually -(1.000001). timersub() is foot-shooting to give an unwanted representation. Simply calculating the difference in floating point works the same as before. Even C's broken division works right for this case (division should round the integer part down, giving the same problem as for timevals, but its brokenness is that it rounds towards 0). XX } The cases where using timer*() is not just worst are when you have to convert a value in usec back to a timeval. Then an extra conversion step is necessary. Something like: struct timeval finish, prev, resid; ... timersub(&finish, &prev, &resid); /* Should have error handling for resid here. */ select(..., &resid); which would be converted to: usec = (finish.tv_sec - prev.tv_sec) * (intmax_t)1000000 + (finish.tv_usec - prev.tv_usec); /* Error checks like usec >= 0 are easier with a single number. */ /* Extra conversion: */ resid.tv_sec = usec / 1000000; resid.tv_usec = usec % 1000000; select(..., &resid); Note that if usec < 0, C's broken division by negative numbers usually gives an invalid timeval with tv_usec < 0, so the error checking is more needed. This the reverse of the problem for printing negative timevals. Negative timeouts should be treated as 0, but invalid negative timeouts should be errors. sbintimes are essentially optimized versions of representing everything in nsec, with minimal conversions to struct types. Unfortunately all of the standard time APIs except difftime() use struct types. Bruce