From owner-freebsd-arch@FreeBSD.ORG Wed Dec 17 14:58:53 2014
From: John Baldwin
To: Jilles Tjoelker
Cc: arch@freebsd.org
Subject: Re: Change default VFS timestamp precision?
Date: Wed, 17 Dec 2014 09:40:01 -0500
Message-ID: <2034186.iLaW9EGnEt@ralph.baldwin.cx>
In-Reply-To: <20141216233844.GA1490@stack.nl>
References: <201412161348.41219.jhb@freebsd.org> <20141216233844.GA1490@stack.nl>
List-Id: Discussion related to FreeBSD architecture

On Wednesday, December 17, 2014 12:38:44 AM Jilles Tjoelker wrote:
> On Tue, Dec 16, 2014 at 01:48:41PM -0500, John Baldwin wrote:
> > We still ship with vfs.timestamp_precision=0 by default meaning that
> > VFS timestamps have a granularity of one second.
> > It is not unusual on modern systems for multiple updates to a file or
> > directory to occur within a single second (and thus share the same
> > effective timestamp). This can break things that depend on timestamps
> > to know when something has changed or is stale (such as make(1) or NFS
> > clients). On hardware that has a cheap timecounter, I think we should
> > use the most-precise timestamps (vfs.timestamp_precision=3). However,
> > I'm less sure of what to do for other cases such as i386/amd64 when not
> > using TSC, or on other platforms. OTOH, perhaps you aren't doing lots
> > of heavy I/O access on a system with a slow timecounter (or if you are
> > doing heavy I/O, slow timecounter access won't be your bottleneck)?
> >
> > I can think of a few options:
> > 1) Change vfs.timestamp_precision default to 3 for all systems.
> >
> > 2) Only change vfs.timestamp_precision default to 3 for amd64/i386
> >    using an #ifdef.
> >
> > 3) Something else?
>
> Although some breakage may be caused, increasing precision sounds fine
> to me, but only to the level of microseconds, since there is no way to
> set a timestamp to the nanosecond (this would be futimens/utimensat). It
> is easy to be surprised when cp -p creates a file that appears older
> than the original.

Note that vfs_timestamp() always returns a timespec, but a setting of 2
gives microsecond precision. The important difference for settings >= 2
is that vfs_timestamp() queries the timecounter on each call rather than
using a global value that is only updated either once a second or once a
millisecond or so.

> To avoid cross-arch surprises with applications that use
> second-resolution APIs, either all or no architectures should generate
> timestamps more accurate than seconds.

Actually, it will improve our interoperability with other OSes that
already use sub-second timestamps when sharing filesystems over NFS, for
example.

> There is no benefit for the particular case of make(1), since it only
> uses timestamps in seconds.

My bad for assuming make would be impacted without checking further.
The use case I _am_ familiar with is NFS servers and NFS v3 clients that
depend on the mtime of a directory to know when the lookup cache for
that directory can be invalidated. Our NFS client now defaults to only
trusting cached lookups for 60 seconds to work around races due to
seconds-granularity timestamps from some NFS servers, at the cost of
making the lookup cache a fair amount less effective. Note that Isilon
already defaults vfs.timestamp_precision to 3 on their appliances, and I
recently convinced the folks at TrueNAS to do the same. However, it
would also make stock FreeBSD NFS servers more reliable for NFS v3 if we
changed our default.

-- 
John Baldwin
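
To make the granularity levels discussed in this message concrete, here
is a minimal Python sketch. It is a model for illustration only, not the
kernel's vfs_timestamp() code. The names Timespec, stamp and tick_ns are
hypothetical, and the 1 ms tick for setting 1 is an assumption taken
from the "once a millisecond or so" remark above.

```python
from typing import NamedTuple

NSEC_PER_SEC = 1_000_000_000


class Timespec(NamedTuple):
    sec: int
    nsec: int


def stamp(now_ns: int, precision: int, tick_ns: int = 1_000_000) -> Timespec:
    """Model the granularity of a timestamp for vfs.timestamp_precision 0..3.

    now_ns is the "true" time in nanoseconds; tick_ns is an assumed
    refresh interval for the cached value used by setting 1.
    """
    sec, nsec = divmod(now_ns, NSEC_PER_SEC)
    if precision == 0:
        nsec = 0                  # whole seconds only
    elif precision == 1:
        nsec -= nsec % tick_ns    # cached value, refreshed once per tick
    elif precision == 2:
        nsec -= nsec % 1000       # fresh reading, microsecond granularity
    # precision 3: fresh reading, full nanoseconds
    return Timespec(sec, nsec)


# Two updates to the same directory, 200 microseconds apart.
t1 = 5 * NSEC_PER_SEC + 300_000_123
t2 = t1 + 200_000

for level in range(4):
    a, b = stamp(t1, level), stamp(t2, level)
    print(level, a, b, "same" if a == b else "distinct")
```

In this model, settings 0 and 1 give both updates the same timestamp, so
a client comparing mtimes cannot tell that the second update happened,
while settings 2 and 3 keep them distinct. The same function also shows
the cp -p surprise: stamp(t1, 2) sorts before stamp(t1, 3), so a
nanosecond timestamp copied through a microsecond-only interface appears
older than the original.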