Date:      Wed, 19 May 2010 20:12:10 -0400 (EDT)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        John Baldwin <jhb@freebsd.org>
Cc:        Rick Macklem <rmacklem@freebsd.org>, Robert Watson <rwatson@freebsd.org>, fs@freebsd.org
Subject:   Re: [PATCH] Better handling of stale filehandles in open() in the NFS client
Message-ID:  <Pine.GSO.4.63.1005192004370.8867@muncher.cs.uoguelph.ca>
In-Reply-To: <201005191144.00382.jhb@freebsd.org>
References:  <201005191144.00382.jhb@freebsd.org>


On Wed, 19 May 2010, John Baldwin wrote:

> One of the things the NFS client does to provide close-to-open consistency is
> that the client mandates that at least one ACCESS or GETATTR RPC is sent over
> the wire as part of every open(2) system call.  However, we currently only
> enforce that during nfs_open() (VOP_OPEN()).  If nfs_open() encounters a stale
> file handle, it fails the open(2) system call with ESTALE.
>
> A much nicer user experience is for nfs_lookup() to actually send the ACCESS
> or GETATTR RPC instead.  If that RPC fails with ESTALE, then nfs_lookup() will
> send a LOOKUP RPC which will find the new file handle (assuming a rename has
> caused the file handle for a given filename to change) and the open(2) will
> succeed with the new file handle.  I believe that this in fact used to happen
> quite often until I merged a change from Yahoo! which stopped flushing cached
> attributes during nfs_close().  With that change an open() -> close() ->
> open() sequence in quick succession will now use cached attributes during the
> lookup and only notice a stale filehandle in nfs_open().
>
> This can lead to some astonishing behavior.  To reproduce, run 'cat
> /some/file' in a loop every 2 seconds or so on an NFS client.  In another
> window, log in to the NFS server and replace /some/file with /some/otherfile
> using mv(1).  The next cat in the NFS client window will usually fail with
> ESTALE.  The subsequent cat will work as it will relookup the filename and
> find the new filehandle.
>
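
The reproduction quoted above could be scripted on the client side roughly as follows. This is only a sketch: poll_open() and its parameters are hypothetical names, not part of any patch, and the rename on the server still has to be done by hand with mv(1) while the loop runs.

```python
import errno
import time


def poll_open(path, iterations=5, interval=2.0):
    """Open and read `path` repeatedly, recording the outcome of each attempt.

    Hypothetical helper standing in for "cat /some/file in a loop".
    """
    results = []
    for i in range(iterations):
        try:
            with open(path, "rb") as f:
                f.read()
            results.append("ok")
        except OSError as e:
            # Record the symbolic errno name, e.g. "ESTALE" for a stale handle.
            results.append(errno.errorcode.get(e.errno, str(e.errno)))
        if i + 1 < iterations:
            time.sleep(interval)
    return results


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        # Usage (assumed): python poll_open.py /some/file
        print(poll_open(sys.argv[1]))
```

If the behavior is as described in the quoted text, the list should show an "ESTALE" entry for the first attempt after the rename and "ok" for the one after it.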

Not astonishing at all:-) That's just NFS not having any cache coherency
protocol. (Many moons ago, I tried via nqnfs, but nobody cared.:-)
Btw, many servers don't change a file handle upon a rename, and it was
once considered bad form to do so, but nowadays some do and some don't.

> The fix I came up with is to modify the NFS client lookup routine.  Before we
> trust a hit in the namecache, we check the attributes to see if we should
> trust the namecache hit.  What my patch does is to force that attribute check
> to send a GETATTR or ACCESS RPC over the wire instead of using cached
> attributes when doing a lookup on the last component of an ISOPEN lookup (so a
> lookup for open(2) or execve(2)).  This forces the ESTALE error to occur
> during the VOP_LOOKUP() stage of open(2) instead of VOP_OPEN().
>
> Thoughts?
>
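
The lookup change described in the quoted text can be modeled in user space with a toy simulation. This is a sketch under loud assumptions: Server, Client, run() and the revalidate_on_open flag are all invented for illustration and are not the kernel code or the patch. The model assumes cached attributes are always still "fresh" (the quick open/close/open case), that the server hands out a new file handle on rename, and that open always sends its own GETATTR, which the real client may well avoid.

```python
import errno


class Server:
    """Toy NFS server: maps names to file handles and counts RPCs."""

    def __init__(self):
        self.names = {}      # filename -> file handle
        self.live = set()    # handles that still refer to a file
        self.next_fh = 1
        self.rpcs = {"LOOKUP": 0, "GETATTR": 0}

    def create(self, name):
        fh = self.next_fh
        self.next_fh += 1
        self.names[name] = fh
        self.live.add(fh)
        return fh

    def rename(self, src, dst):
        # Replacing dst drops its old file, so the old handle goes stale.
        old = self.names.get(dst)
        if old is not None:
            self.live.discard(old)
        self.names[dst] = self.names.pop(src)

    def lookup(self, name):
        self.rpcs["LOOKUP"] += 1
        return self.names[name]

    def getattr(self, fh):
        self.rpcs["GETATTR"] += 1
        if fh not in self.live:
            raise OSError(errno.ESTALE, "stale file handle")


class Client:
    """Toy client with a name cache; the flag models the proposed change."""

    def __init__(self, server, revalidate_on_open):
        self.server = server
        self.revalidate_on_open = revalidate_on_open
        self.namecache = {}  # filename -> cached file handle

    def lookup(self, name, isopen):
        fh = self.namecache.get(name)
        if fh is not None:
            if not (isopen and self.revalidate_on_open):
                return fh    # trust the cache hit using cached attributes
            try:
                self.server.getattr(fh)  # patched path: check over the wire
                return fh
            except OSError as e:
                if e.errno != errno.ESTALE:
                    raise
                # Stale handle: fall through and send a fresh LOOKUP.
        fh = self.server.lookup(name)
        self.namecache[name] = fh
        return fh

    def open(self, name):
        fh = self.lookup(name, isopen=True)
        try:
            self.server.getattr(fh)  # the check made at open time
        except OSError:
            self.namecache.pop(name, None)  # next lookup goes to the server
            raise
        return fh


def run(revalidate_on_open):
    """Open, rename on the server, then open twice more; return outcomes and RPC counts."""
    server = Server()
    server.create("file")
    server.create("otherfile")
    client = Client(server, revalidate_on_open)
    outcomes = [client.open("file")]
    server.rename("otherfile", "file")
    for _ in range(2):
        try:
            outcomes.append(client.open("file"))
        except OSError as e:
            outcomes.append(errno.errorcode[e.errno])
    return outcomes, server.rpcs
```

Calling run(False) and run(True) gives the sequence of open results without and with the revalidation, plus the toy RPC tallies. The tallies only reflect this naive model and say nothing about counts under a real workload.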

It sounds fine, but it seems like it's going to increase the Getattr RPC
count, since nfs_open() invalidates the attribute cache in some cases?

Did you happen to try something like a "make buildworld" with and
without the patch and compare RPC counts?

I'd say it sounds great so long as the RPC counts don't go up much. If
they do, I suspect somebody won't be happy. (From talking to Alfred
last week, all Juniper cares about is build performance; it doesn't
care diddly w.r.t. coherence between multiple clients or between client
and server.)

Have fun with it, rick


