Date: Wed, 19 May 2010 20:12:10 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: John Baldwin <jhb@freebsd.org>
Cc: Rick Macklem <rmacklem@freebsd.org>, Robert Watson <rwatson@freebsd.org>, fs@freebsd.org
Subject: Re: [PATCH] Better handling of stale filehandles in open() in the NFS client
Message-ID: <Pine.GSO.4.63.1005192004370.8867@muncher.cs.uoguelph.ca>
In-Reply-To: <201005191144.00382.jhb@freebsd.org>
References: <201005191144.00382.jhb@freebsd.org>

On Wed, 19 May 2010, John Baldwin wrote:

> One of the things the NFS client does to provide close-to-open consistency is
> that the client mandates that at least one ACCESS or GETATTR RPC is sent over
> the wire as part of every open(2) system call.  However, we currently only
> enforce that during nfs_open() (VOP_OPEN()).  If nfs_open() encounters a stale
> file handle, it fails the open(2) system call with ESTALE.
>
> A much nicer user experience is for nfs_lookup() to actually send the ACCESS
> or GETATTR RPC instead.  If that RPC fails with ESTALE, then nfs_lookup() will
> send a LOOKUP RPC which will find the new file handle (assuming a rename has
> caused the file handle for a given filename to change) and the open(2) will
> succeed with the new file handle.  I believe that this in fact used to happen
> quite often until I merged a change from Yahoo! which stopped flushing cached
> attributes during nfs_close().  With that change an open() -> close() ->
> open() sequence in quick succession will now use cached attributes during the
> lookup and only notice a stale filehandle in nfs_open().
>
> This can lead to some astonishing behavior.  To reproduce, run 'cat
> /some/file' in a loop every 2 seconds or so on an NFS client.  In another
> window, log in to the NFS server and replace /some/file with /some/otherfile
> using mv(1).  The next cat in the NFS client window will usually fail with
> ESTALE.  The subsequent cat will work as it will relookup the filename and
> find the new filehandle.
>
Not astonishing at all:-) That's just NFS not having any cache coherency
protocol. (Many moons ago, I tried via nqnfs, but nobody cared.:-)

Btw, many servers don't change a file handle upon a rename and it was once
considered bad form to do so, but nowadays some don't and some do.

> The fix I came up with is to modify the NFS client lookup routine.  Before we
> trust a hit in the namecache, we check the attributes to see if we should
> trust the namecache hit.  What my patch does is to force that attribute check
> to send a GETATTR or ACCESS RPC over the wire instead of using cached
> attributes when doing a lookup on the last component of an ISOPEN lookup (so a
> lookup for open(2) or execve(2)).  This forces the ESTALE error to occur
> during the VOP_LOOKUP() stage of open(2) instead of VOP_OPEN().
>
> Thoughts?
>
It sounds fine, but it seems like it's going to increase the Getattr RPC
count, since nfs_open() invalidates the attribute cache for some cases?

Did you happen to try something like a "make buildworld" with and without
the patch and compare RPC counts?

I'd say it sounds great so long as the RPC counts don't go up much. If they
do, I suspect somebody won't be happy. (When I talked to Alfred last week,
all Juniper cares about is build performance and doesn't care diddly w.r.t.
coherence between multiple clients/client and server.)

Have fun with it, rick
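
For readers following the thread, the control flow under discussion can be sketched as a small user-space model. This is only an illustration, not the patch and not FreeBSD code: ToyServer, ToyClient, revalidate_on_open and run() are hypothetical names, the server is assumed to keep a file's handle across rename, the client is assumed to purge its namecache entry when an open fails with ESTALE, and the open step is assumed to skip its GETATTR when the lookup already sent one. The RPC counters are there only to show where an extra GETATTR can appear in this toy; they say nothing about real workloads such as buildworld.

```python
class StaleHandle(Exception):
    """Stands in for an RPC failing with ESTALE."""


class ToyServer:
    """Hypothetical NFS server model: names map to file handles."""

    def __init__(self):
        self.names = {}    # filename -> file handle
        self.data = {}     # file handle -> contents
        self.next_fh = 1
        self.rpc_counts = {"LOOKUP": 0, "GETATTR": 0}

    def create(self, name, contents):
        fh = self.next_fh
        self.next_fh += 1
        self.names[name] = fh
        self.data[fh] = contents
        return fh

    def replace(self, name, other):
        """Model 'mv other name' run on the server: name's old handle goes stale."""
        old = self.names.pop(name)
        del self.data[old]
        self.names[name] = self.names.pop(other)

    def lookup(self, name):
        self.rpc_counts["LOOKUP"] += 1
        return self.names[name]

    def getattr(self, fh):
        self.rpc_counts["GETATTR"] += 1
        if fh not in self.data:
            raise StaleHandle()
        return len(self.data[fh])


class ToyClient:
    """Hypothetical client model with a namecache and two lookup policies."""

    def __init__(self, server, revalidate_on_open):
        self.server = server
        self.revalidate_on_open = revalidate_on_open
        self.namecache = {}    # filename -> file handle

    def lookup(self, name, is_open_last):
        """Return (handle, validated); validated means a GETATTR was just sent."""
        fh = self.namecache.get(name)
        if fh is not None:
            if not (self.revalidate_on_open and is_open_last):
                return fh, False           # trust cached attributes
            try:
                self.server.getattr(fh)    # forced over-the-wire check
                return fh, True
            except StaleHandle:
                del self.namecache[name]   # fall through to a fresh LOOKUP
        fh = self.server.lookup(name)
        self.namecache[name] = fh
        return fh, False

    def open(self, name):
        fh, validated = self.lookup(name, is_open_last=True)
        if not validated:
            try:
                self.server.getattr(fh)    # the open-time consistency check
            except StaleHandle:
                self.namecache.pop(name, None)
                raise
        return fh


def run(revalidate_on_open):
    """Open a file, replace it on the server, then open it twice more."""
    server = ToyServer()
    server.create("file", "old")
    server.create("otherfile", "new")
    client = ToyClient(server, revalidate_on_open)
    client.open("file")
    server.replace("file", "otherfile")
    results = []
    for _ in range(2):
        try:
            results.append(client.open("file"))
        except StaleHandle:
            results.append("ESTALE")
    return results, dict(server.rpc_counts)


if __name__ == "__main__":
    print("cached lookup:", run(False))
    print("forced GETATTR in lookup:", run(True))
```

In this model, the cached-lookup policy fails the first open after the rename and succeeds on the next one, while the forced-GETATTR policy succeeds both times but sends one more GETATTR over the three opens. Whether a real client sees a comparable increase depends on how nfs_open() treats the attribute cache, which is the question raised above.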