Date:      Thu, 15 Nov 2007 10:10:23 -0800 (PST)
From:      Mohan Srinivasan <mohan_srinivasan@yahoo.com>
To:        Timo Sirainen <tss@iki.fi>, Robert Watson <rwatson@FreeBSD.org>
Cc:        Adam McDougall <mcdouga9@egr.msu.edu>, freebsd-current@FreeBSD.org, mohans@FreeBSD.org
Subject:   Re: link() not increasing link count on NFS server
Message-ID:  <698405.85667.qm@web31809.mail.mud.yahoo.com>
In-Reply-To: <20071115135734.O82897@fledge.watson.org>

Robert

The code you cite, which launches a lookup on the receipt 
of an EEXIST in nfs_link() is a horrible hack that needs
to be removed. I always wanted to remove it but did not 
want to stir up controversy.

The logic predates the NFS/UDP duplicate request cache, 
which all NFS servers now support. The NFS dupreq cache 
caches the replies for non-idempotent operations and will
replay the cached response if a non-idempotent operation
is retransmitted. This works around spurious errors in the
event the NFS response was lost, of course. The dupreq cache 
appeared in most NFS server implementations in late 1989. 
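
As a rough illustration of the dupreq idea described above, here is a
minimal sketch in C. It is not taken from any real NFS server; the names
(drc_entry, serve_link, do_link, DRC_SLOTS) and the toy one-name
"filesystem" are assumptions made up for the example. The point is only
that a retransmitted non-idempotent request gets the original reply
replayed instead of being executed a second time.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical duplicate request cache; not real NFS server code. */
#define DRC_SLOTS 64

struct drc_entry {
    int      valid;   /* slot holds a cached reply */
    uint32_t client;  /* stand-in for the client address */
    uint32_t xid;     /* RPC transaction id of the request */
    int      status;  /* status sent in the original reply */
};

static struct drc_entry drc[DRC_SLOTS];

/* Toy filesystem state: does the link target name already exist? */
static int name_exists;

/* Non-idempotent operation: succeeds once, then fails with EEXIST. */
static int do_link(void)
{
    if (name_exists)
        return EEXIST;
    name_exists = 1;
    return 0;
}

/* Serve a LINK request, replaying the cached reply on a retransmission. */
int serve_link(uint32_t client, uint32_t xid)
{
    struct drc_entry *e = &drc[(client ^ xid) % DRC_SLOTS];

    if (e->valid && e->client == client && e->xid == xid)
        return e->status;       /* retransmit: replay, do not re-execute */

    e->valid = 1;
    e->client = client;
    e->xid = xid;
    e->status = do_link();      /* first time: execute and remember */
    return e->status;
}

/* Usage: the client's retry after a lost reply still sees success. */
void drc_demo(void)
{
    assert(serve_link(1, 100) == 0);  /* original request */
    assert(serve_link(1, 100) == 0);  /* same xid retransmitted */
}
```

With such a cache on the server, a client that sees EEXIST can take it
at face value, since a retry of its own successful request would have
been answered from the cache.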

There is no justification for the logic that the FreeBSD NFS
client has at the end of these ops. In fact it breaks more
things than it fixes. At Yahoo!, we had a group that was 
doing locking by creating lockfiles and checking for the 
existence of these lockfiles. As you can imagine, that application
broke over FreeBSD NFS. I worked around this in FreeBSD's Yahoo!
implementation.
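
For readers unfamiliar with this style of locking, the following is a
minimal sketch of a link()-based lockfile, including the fstat()
workaround Timo describes in the quoted text below (treating a link
count of 1 as if link() had failed with EEXIST). The function name
dotlock_acquire and its arguments are hypothetical; this is not the code
from the Yahoo! application or from Timo's test case.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Try to take a lock by hard-linking a private temp file to lock_path.
 * Returns 0 on success, -1 with errno set on failure.
 */
int dotlock_acquire(const char *tmp_path, const char *lock_path)
{
    struct stat st;
    int fd, saved;

    assert(tmp_path != NULL && lock_path != NULL);

    fd = open(tmp_path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;

    if (link(tmp_path, lock_path) < 0)
        goto fail;

    /*
     * Workaround from the thread: do not trust link()'s return value
     * alone. If the link was really made, the temp file has two names.
     */
    if (fstat(fd, &st) < 0)
        goto fail;
    if (st.st_nlink != 2) {
        errno = EEXIST;
        goto fail;
    }

    close(fd);
    unlink(tmp_path);           /* the lock name keeps the file alive */
    return 0;

fail:
    saved = errno;              /* close()/unlink() may clobber errno */
    close(fd);
    unlink(tmp_path);
    errno = saved;
    return -1;
}
```

A client that maps EEXIST to success inside link() defeats exactly this
pattern: two processes can both believe they hold the lock unless they
add the link count check.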

I have not looked at the original link bug reported, but I would
wholeheartedly endorse ripping out the "launch a lookup on
an error in these ops" logic in all of the NFS ops and just returning
the error or success returned by the original NFS op.

mohan

--- On Thu, 11/15/07, Robert Watson <rwatson@FreeBSD.org> wrote:

> From: Robert Watson <rwatson@FreeBSD.org>
> Subject: Re: link() not increasing link count on NFS server
> To: "Timo Sirainen" <tss@iki.fi>
> Cc: "Adam McDougall" <mcdouga9@egr.msu.edu>, freebsd-current@FreeBSD.org, mohans@FreeBSD.org
> Date: Thursday, November 15, 2007, 6:05 AM
> On Thu, 15 Nov 2007, Timo Sirainen wrote:
> 
> > On Thu, 2007-11-15 at 12:39 +0000, Robert Watson wrote:
> >
> >>> or Solaris NFS clients. Basically, Timo (cc'ed) came up with a small test
> >>> case that seems to indicate sometimes a link() call can succeed while the
> >>> link count of the file will not increase.  If this is ran on two FreeBSD
> >>> clients from the same NFS directory, you will occasionally see "link()
> >>> succeeded, but link count=1".  I've tried both a Netapp and a FreeBSD NFS
> > ..
> >> My guess, and this is just a hand-wave, is that the attribute cache in the
> >> NFS client isn't being forced to refresh, and hence you're getting the old
> >> stat data back (and perhaps there's no GETATTR on the wire, which might
> >> hint at this).  If you'd like, you can post a link to the pcap capture file
> >> and one of us can take a look, but I've found NFS RPCs to be surprisingly
> >> readable in Wireshark so you might find it sheds quite a bit of light.
> >
> > Actually the point was that link() returns success even though in reality it
> > fails. The fstat() was just a workaround to catch this case and treat link
> > count 1 as if link() had failed with EEXIST. After that I had no more
> > problems with locking.
> >
> > I noticed this first because my dotlocking was failing to lock files
> > properly. I also added fchown() to flush attribute cache after link() and
> > before fstat(), it gives the same link count=1 reply.
> 
> Indeed, and inspection of nfs_vnops.c:nfs_link(): finds:
> 
>         /*
>          * Kludge: Map EEXIST => 0 assuming that it is a reply to a retry.
>          */
>         if (error == EEXIST)
>                 error = 0;
>         return (error);
> 
> Neither Linux nor Solaris appears to have this logic in the client.  I assume
> this is, as suggested, to work around UDP retransmissions where the reply is
> lost rather than the request.  It appears to exist in revision 1.1 of
> nfs_vnops.c, so came in with 4.4BSD in the initial import, but doesn't appear
> in NetBSD so I'm guessing they've removed it.  It could well be we should be
> doing the same.  I've added Mohan to the CC line in case he has any input on
> this point.
> 
> Robert N M Watson
> Computer Laboratory
> University of Cambridge


