Date: Thu, 15 Nov 2007 10:10:23 -0800 (PST)
From: Mohan Srinivasan <mohan_srinivasan@yahoo.com>
To: Timo Sirainen <tss@iki.fi>, Robert Watson <rwatson@FreeBSD.org>
Cc: Adam McDougall <mcdouga9@egr.msu.edu>, freebsd-current@FreeBSD.org, mohans@FreeBSD.org
Subject: Re: link() not increasing link count on NFS server
Message-ID: <698405.85667.qm@web31809.mail.mud.yahoo.com>
In-Reply-To: <20071115135734.O82897@fledge.watson.org>

Robert

The code you cite, which launches a lookup on the receipt of an EEXIST in
nfs_link(), is a horrible hack that needs to be removed. I always wanted to
remove it but did not want to stir up controversy.

The logic predates the NFS/UDP duplicate request cache, which all NFS servers
will support. The NFS dupreq cache caches the replies for non-idempotent
operations and will replay the cached response if a non-idempotent operation
is retransmitted. This works around spurious errors in the event the NFS
response was lost, of course. The dupreq cache appeared in most NFS server
implementations in late 1989.

There is no justification for the logic that the FreeBSD NFS client has at the
end of these ops. In fact it breaks more things than it fixes. At Yahoo!, we
had a group that was doing locking by creating lockfiles and checking for the
existence of these lockfiles. As you can imagine, that application broke over
FreeBSD NFS. I worked around this in FreeBSD's Yahoo! implementation.

I have not looked at the original link bug reported, but I would
wholeheartedly endorse ripping out the "launch a lookup on an error in these
ops" logic in all of the NFS ops and just returning the error or success
returned by the original NFS op.

mohan

--- On Thu, 11/15/07, Robert Watson <rwatson@FreeBSD.org> wrote:

> From: Robert Watson <rwatson@FreeBSD.org>
> Subject: Re: link() not increasing link count on NFS server
> To: "Timo Sirainen" <tss@iki.fi>
> Cc: "Adam McDougall" <mcdouga9@egr.msu.edu>, freebsd-current@FreeBSD.org, mohans@FreeBSD.org
> Date: Thursday, November 15, 2007, 6:05 AM
>
> On Thu, 15 Nov 2007, Timo Sirainen wrote:
>
> > On Thu, 2007-11-15 at 12:39 +0000, Robert Watson wrote:
> >
> >>> or Solaris NFS clients. Basically, Timo (cc'ed) came up with a small test
> >>> case that seems to indicate sometimes a link() call can succeed while the
> >>> link count of the file will not increase. If this is ran on two FreeBSD
> >>> clients from the same NFS directory, you will occasionally see "link()
> >>> succeeded, but link count=1". I've tried both a Netapp and a FreeBSD NFS
> > ..
> >> My guess, and this is just a hand-wave, is that the attribute cache in the
> >> NFS client isn't being forced to refresh, and hence you're getting the old
> >> stat data back (and perhaps there's no GETATTR on the wire, which might
> >> hint at this). If you'd like, you can post a link to the pcap capture file
> >> and one of us can take a look, but I've found NFS RPCs to be surprisingly
> >> readable in Wireshark so you might find it sheds quite a bit of light.
> >
> > Actually the point was that link() returns success even though in reality it
> > fails. The fstat() was just a workaround to catch this case and treat link
> > count 1 as if link() had failed with EEXIST. After that I had no more
> > problems with locking.
> >
> > I noticed this first because my dotlocking was failing to lock files
> > properly. I also added fchown() to flush attribute cache after link() and
> > before fstat(), it gives the same link count=1 reply.
>
> Indeed, and inspection of nfs_vnops.c:nfs_link() finds:
>
>         /*
>          * Kludge: Map EEXIST => 0 assuming that it is a reply to a retry.
>          */
>         if (error == EEXIST)
>                 error = 0;
>         return (error);
>
> Neither Linux nor Solaris appears to have this logic in the client. I assume
> this is, as suggested, to work around UDP retransmissions where the reply is
> lost rather than the request. It appears to exist in revision 1.1 of
> nfs_vnops.c, so came in with 4.4BSD in the initial import, but doesn't appear
> in NetBSD so I'm guessing they've removed it. It could well be we should be
> doing the same. I've added Mohan to the CC line in case he has any input on
> this point.
>
> Robert N M Watson
> Computer Laboratory
> University of Cambridge
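
To illustrate the duplicate request cache behaviour described in the message,
here is a minimal sketch of a server that replays the cached reply when a
non-idempotent request is retransmitted. It is not FreeBSD code. Every name in
it (toy_server, drc_entry, toy_server_link) is hypothetical. Keying the cache
on a client id plus the RPC transaction id (xid) and using a single flag to
stand in for the directory entry are simplifying assumptions.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define DRC_SLOTS 64

/* One cached reply. Assumption: keyed by client id and RPC xid. */
struct drc_entry {
    int      in_use;
    uint32_t client;
    uint32_t xid;
    int      status;      /* status that was sent in the original reply */
};

/* Hypothetical toy server: a reply cache plus one "directory entry". */
struct toy_server {
    struct drc_entry drc[DRC_SLOTS];
    int link_exists;      /* stands in for the name that LINK creates */
};

static void toy_server_init(struct toy_server *s)
{
    memset(s, 0, sizeof(*s));
}

/* Return the cached entry for (client, xid), or NULL if none is cached. */
static struct drc_entry *drc_lookup(struct toy_server *s,
                                    uint32_t client, uint32_t xid)
{
    struct drc_entry *e = &s->drc[(client ^ xid) % DRC_SLOTS];

    if (e->in_use && e->client == client && e->xid == xid)
        return e;
    return NULL;
}

/* Remember the reply status, overwriting whatever occupied the slot. */
static void drc_store(struct toy_server *s, uint32_t client,
                      uint32_t xid, int status)
{
    struct drc_entry *e = &s->drc[(client ^ xid) % DRC_SLOTS];

    e->in_use = 1;
    e->client = client;
    e->xid = xid;
    e->status = status;
}

/* Handle a LINK request and return the status the reply would carry. */
static int toy_server_link(struct toy_server *s,
                           uint32_t client, uint32_t xid)
{
    struct drc_entry *e = drc_lookup(s, client, xid);
    int status;

    if (e != NULL)
        return e->status; /* retransmission: replay, do not re-execute */

    if (s->link_exists) {
        status = EEXIST;  /* a genuinely new request for a taken name */
    } else {
        s->link_exists = 1;
        status = 0;
    }
    drc_store(s, client, xid, status);
    return status;
}
```

With this sketch, a first call toy_server_link(&s, 1, 100) returns 0, and a
retransmission with the same client and xid also returns 0 from the cache
rather than EEXIST. A different client's request for the same name, such as
toy_server_link(&s, 2, 200), returns EEXIST. Under these assumptions a client
can pass EEXIST straight back to the caller, which is what lockfile-based
locking relies on.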