Date: Sat, 11 Feb 2012 05:25:28 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Sergey Kandaurov <pluknet@gmail.com>
Cc: Ed Schouten <ed@80386.nl>, fs@FreeBSD.org
Subject: Re: Increase timestamp precision?
Message-ID: <20120211042121.B3653@besplex.bde.org>
In-Reply-To: <CAE-mSOK2fo=PsvyQWW1Nz4XPqcr7fKDNCvVjHsUvR2uYmuqFMw@mail.gmail.com>
References: <20120210135527.GR1860@hoeg.nl> <CAE-mSOK2fo=PsvyQWW1Nz4XPqcr7fKDNCvVjHsUvR2uYmuqFMw@mail.gmail.com>
On Fri, 10 Feb 2012, Sergey Kandaurov wrote:

> On 10 February 2012 17:55, Ed Schouten <ed@80386.nl> wrote:
>> Hi all,
>>
>> It seems the default timestamp precision sysctl
>> (vfs.timestamp_precision) is currently set to 0 by default, meaning we
>> don't do any sub-second timestamps on files. Looking at the code, it
>> seems that vfs.timestamp_precision=1 will let it use a cached value with
>> 1 / HZ precision and it looks like it should have little overhead.
>>
>> Would anyone object if I were to change the default from 0 to 1?

Sure. The setting of 1 is too buggy to use in its current
implementation. I don't know of any fixed version, so there is little
experience with a usable version. I also wouldn't use getnanotime()
for anything. But I like nanotime().

% void
% vfs_timestamp(struct timespec *tsp)
% {
% 	struct timeval tv;
%
% 	switch (timestamp_precision) {
% 	case TSP_SEC:
% 		tsp->tv_sec = time_second;
% 		tsp->tv_nsec = 0;

This gives seconds precision. It is correct.

% 		break;
% 	case TSP_HZ:
% 		getnanotime(tsp);
% 		break;

I must have been asleep when I reviewed this for jdp in 1999. This
doesn't give 1/HZ precision. It gives nanoseconds precision with
garbage in the low bits, and about 1/HZ accuracy. To fix it, round
down to 1/HZ precision, or at least to microseconds precision. The
garbage in the low bits matters mainly because there is no way to
preserve it: utimes(2) only supports microseconds precision.

% 	case TSP_USEC:
% 		microtime(&tv);
% 		TIMEVAL_TO_TIMESPEC(&tv, tsp);
% 		break;

This gives microseconds precision, but in a silly way. It should call
nanotime() and then round down to microseconds precision.

% 	case TSP_NSEC:
% 	default:

The default should be an error.

% 		nanotime(tsp);
% 		break;
% 	}
% }

I mostly use TSP_SEC, but there are some buggy file systems and/or
utilities (cvsup?, or perhaps just scp from a system using a different
timestamp precision) that produce sub-seconds precision. I notice
this when I verify timestamps in backups.
The backup formats support microseconds precision at best, so any extra
precision in the files in active file systems gives a verification
failure.

> [Yep, sorry I didn't read this mail before replying to your other mail.]
>
> I am for this idea. Increasing vfs.timestamp_precision will allow
> use of nanosecond precision for all those *stat() and *times()
> syscalls which operate on struct timespec.
>
> FWIW, NetBSD has used only nanotime() inside vfs_timestamp() since its
> initial appearance in 2006.

Does NetBSD's nanotime() have full nsec precision and hardware slowness?
No hardware yet can deliver anywhere near nsec accuracy, so the
precision might as well be limited to usec. i8254 timecounters take/took
5-50 usec just to read. ACPI-fast is relatively worse on today's faster
CPUs (1-2 usec). The TSC used to take only ~10 instructions to read on
Athlons, but it is non-serializing and was always non-P-state-invariant.
P-state-invariant versions take much longer (about 50 cycles in hardware
and another 50 in software on Core2), and TSC-low intentionally discards
about 7 low bits, so its precision is about 64 nsec, which is about the
same as the time nanotime() takes to read it.

I use TSP_NSEC only for POSIX conformance tests, to stop the tests from
finding the bug that even TSP_SEC is broken. It is broken because
time_second is incoherent with time(3): time_second and the times
reported by all the get*time() functions lag the times reported by the
(non-get)*time() functions, and the lag is random relative to seconds
(and other) boundaries, so rounding to a seconds (or other) boundary
gives different results. These differences are visible to applications
doing tests like:

	touch(file); stat(file); sleep(1); touch(file); stat(file);
	assert(file_mtime_increased_by_at_least_1_second);

This should also show the leap-seconds bug in POSIX times (the file time
shouldn't change across a leap second).
Some of the tests do lots of file timestamp changing (I also have to
turn off my usual optimization of mounting with noatime to get them to
pass). They run fast enough even with TSP_NSEC, at least if the
timecounter is a fast TSC. File time updates just don't happen often
enough for their speed to matter much, provided they are cached enough.
ffs uses the mark-for-update caching strategy, which works well: it
avoids not only writing to disk, but even reading the timer very often.
Some other file systems, like devfs, are not careful about this, so the
slowness of silly operations like dd with a block size of 1 from
/dev/zero to /dev/null becomes even more extreme if TSP_NSEC or TSP_USEC
is used and the timecounter is slow.

Bruce