Date: Tue, 23 Dec 2014 10:25:27 +1100
From: Richard Perini <rpp@ci.com.au>
To: John Baldwin <jhb@freebsd.org>
Cc: freebsd-stable@freebsd.org
Subject: Re: NFS negative name caching and amd
Message-ID: <20141222232527.GA52306@odi.ci.com.au>
In-Reply-To: <201412221004.48504.jhb@freebsd.org>
References: <20141221102746.GA11278@odi.ci.com.au> <201412221004.48504.jhb@freebsd.org>
On Mon, Dec 22, 2014 at 10:04:48AM -0500, John Baldwin wrote:
> On Sunday, December 21, 2014 5:27:46 am Richard Perini wrote:
> >
> > We're struggling with an NFS negative name caching issue that results in
> > a file created by an NFS client 'A' being invisible on client 'B' for up
> > to client A's negnametimeo value. In our scenario, a process on client
> > A creates a file, and passes a message to another process which may
> > run on client B. The second process expects the file created by A to
> > be available.
>
> Which NFS server are you using? If it is a FreeBSD NFS server, try changing
> vfs.timestamp_precision to 2 (or 3) and seeing if that reduces the amount of
> time you have to wait until the directory's ac timeout.

Yes, we are running FreeBSD on the server machines. Unfortunately, our
process really can't tolerate a delay of any length - either the file is
present or it's not.

> Another possible fix is to be careful to not open the file until you know
> it exists if you still want to keep the reduced LOOKUP RPC load from caching
> negative lookups.

We have coded around the most common failure points with retry logic, but
this is a hack, and there are some third party libraries involved that are
not practical to fix in this manner.

> > We're running a mix of 9-stable and 10-stable machines, and the problem is
> > common to both.
> >
> > The obvious fix is to set the nfs mount option 'negnametimeo' to 0, but
> > unfortunately we also have 'amd' in the picture (which we also need in our
> > environment). Amd doesn't understand negnametimeo and ignores it, leaving
> > it set to the system default of 60 seconds (as shown by nfsstat -m).
>
> Have you tried autofs for 10-stable? Is it able to pass this option to NFS
> if you use it? If that works, I would prefer that to be the long term
> solution for this. I'm not a huge fan of adding kernel options to override
> each NFS default mount option if we can help it.

I just ran up autofs and automountd on 10-stable, set the negnametimeo
option in auto_master and it works a treat. However, it will be quite some
time before we're able to shift off 9, which leaves us with the kernel
option as the easiest path.

I'd point out that the nfs client code in
/usr/src/sys/fs/nfsclient/nfsmount.h is already coded to allow override:

    #ifndef NFS_DEFAULT_NEGNAMETIMEO
    #define NFS_DEFAULT_NEGNAMETIMEO 60
    #endif

so all that is required is the entry in the "options" file. Naturally we
can add that ourselves (the beauty of open source :-) but it would be the
only change to the native FreeBSD code for us, so of course we'd prefer
to see it in the tree.

Regards, and compliments of the season.

--R
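
For illustration, the override mechanism described above can be sketched in ordinary userland C. This is not the kernel code: the guard is copied from the nfsmount.h excerpt quoted in the message, while the helper default_negnametimeo() is a hypothetical name introduced only so the effective value can be inspected.

```c
/*
 * Sketch of the "#ifndef default" pattern quoted from nfsmount.h.
 * The 60-second fallback applies only when nothing has defined the
 * macro earlier in the compilation.
 */
#ifndef NFS_DEFAULT_NEGNAMETIMEO
#define NFS_DEFAULT_NEGNAMETIMEO 60
#endif

/* Hypothetical helper (not part of FreeBSD): returns the default in effect. */
static int
default_negnametimeo(void)
{
	return (NFS_DEFAULT_NEGNAMETIMEO);
}
```

Compiled as is, the helper returns 60. Compiled with -DNFS_DEFAULT_NEGNAMETIMEO=0 on the compiler command line, the guard is skipped and it returns 0. In a kernel build the definition would presumably come from an "options" line in the kernel configuration instead, which is why the message says only the "options" file entry is missing.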