From owner-freebsd-questions  Sat Sep  5 23:39:51 1998
Return-Path:
Received: (from majordom@localhost)
	by hub.freebsd.org (8.8.8/8.8.8) id XAA07972
	for freebsd-questions-outgoing; Sat, 5 Sep 1998 23:39:51 -0700 (PDT)
	(envelope-from owner-freebsd-questions@FreeBSD.ORG)
Received: from smtp04.primenet.com (smtp04.primenet.com [206.165.6.134])
	by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id XAA07967;
	Sat, 5 Sep 1998 23:39:48 -0700 (PDT)
	(envelope-from tlambert@usr01.primenet.com)
Received: (from daemon@localhost)
	by smtp04.primenet.com (8.8.8/8.8.8) id XAA10251;
	Sat, 5 Sep 1998 23:39:47 -0700 (MST)
Received: from usr01.primenet.com(206.165.6.201) via SMTP
	by smtp04.primenet.com, id smtpd010243; Sat Sep  5 23:39:44 1998
Received: (from tlambert@localhost)
	by usr01.primenet.com (8.8.5/8.8.5) id XAA13097;
	Sat, 5 Sep 1998 23:39:35 -0700 (MST)
From: Terry Lambert
Message-Id: <199809060639.XAA13097@usr01.primenet.com>
Subject: Re: .nfs files, what causes them and why do they hang around?
To: jay@oneway.com (Jay)
Date: Sun, 6 Sep 1998 06:39:35 +0000 (GMT)
Cc: tlambert@primenet.com, mike@smith.net.au, questions@FreeBSD.ORG,
	hackers@FreeBSD.ORG
In-Reply-To: from "Jay" at Sep 5, 98 01:03:31 pm
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-questions@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> > You are supposed to never crash.
> 
> So they _shouldn't_ exist unless the server (or client) crashes?

No.  They *should* exist as a means of keeping state.

Basically, if you delete something that is in use by you ("you" being
an NFS client), you rename it instead of deleting it.  This allows you
to continue to access the object.

It helps here to know what an "open file" is.

On a local FS, an open file is an entry in a per-process open file
table that points to a vnode in a local FS.

On a remote FS, an open file is an entry in a per-process open file
table that points to an nfsnode that proxies a vnode from a remote FS.

Consider the case where you open, then unlink a file.

On a local FS, the vnode reference count is the count of directory
references plus the count of open file references.

On a remote FS, the count is *only* the number of directory references.

Now say I open a file on an NFS server.  The open references are not
incremented (that would be state).  Each time I access the file, I
pass in the NFS "cookie" that tells me the device and the inode on the
device to get a vnode, and thereby access the file.

Now say I delete the file on the client.  If I really deleted it on
the server while it was open, I would no longer be able to access the
file contents.  This is because, when the reference count goes to
zero, the vnode is freed.

So as a client with the file open, instead of really deleting it, I
mark it for deferred delete, and then rename it, such that someone
else is unlikely to open it and thus hold a local reference which
doesn't result in a remote reference really being there (being as
it's stateless).  So I trade an in-core vnode reference for a rename
plus an nfsnode reference with a deferred delete.

Now say the client dies, and the deferred delete state, which existed
as a flag on the nfsnode, was lost.  The .nfs file is now there with
no deferred delete referring to it.  As a result, it is left hanging
around.  There are other (obvious) circumstances under which it can
be left hanging around, as well.

In any case, someone needs to clean up after the lost implied state.
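To make the difference between directory references and open file
references concrete, here is a small user-level sketch (not from the
original discussion; the default path is a placeholder you would point
at your own mount).  Run it against a local FS and the name simply
vanishes while the data stays readable through the open descriptor;
run it against an NFS mount and the client's deferred delete shows up
as a ".nfsXXXX" entry in the same directory until the last close.

/*
 * Sketch only: open a file, unlink it, keep using the descriptor.
 * While it sleeps, "ls -a" the directory from another shell to see
 * whether a .nfsXXXX name appears (NFS client) or not (local FS).
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
	const char *path = (argc > 1) ? argv[1] : "/mnt/nfs/scratch/demo.tmp";
	const char msg[] = "still here\n";
	char buf[64];
	ssize_t n;
	int fd;

	if ((fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644)) < 0) {
		perror("open");
		exit(1);
	}
	if (write(fd, msg, sizeof(msg) - 1) != (ssize_t)(sizeof(msg) - 1)) {
		perror("write");
		exit(1);
	}

	/*
	 * Remove the name.  On a local FS this only drops the directory
	 * reference; the vnode survives because the open file table still
	 * holds one.  On NFS there is no server-side open count, so the
	 * client renames the file to a ".nfsXXXX" name and defers the
	 * real remove until the last close.
	 */
	if (unlink(path) < 0) {
		perror("unlink");
		exit(1);
	}

	printf("name removed; sleeping 30s -- list the directory now\n");
	fflush(stdout);
	sleep(30);

	/* The data is still reachable through the open descriptor. */
	if (lseek(fd, 0, SEEK_SET) < 0 || (n = read(fd, buf, sizeof(buf))) < 0) {
		perror("read back");
		exit(1);
	}
	printf("read back %ld bytes: %.*s", (long)n, (int)n, buf);

	close(fd);	/* last close; the deferred delete can now happen */
	return (0);
}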
This is generally a cron job that removes .nfs files after they are
"too old".  The definition of "too old" is site dependent, and based
on the maximum expected usage of deleted-but-open files, generally
multiplied by 2.  (A sketch of such a job is appended at the end of
this message.)

Does this make sense now?

> > Apply my LEASE patches, and apply David Greenman's (or was it PHK's)
> > NFS vnode locking patches, both posted to -current and -hackers,
> > and you should see a much more stable system.
> > I don't know if the not-my-patches have been applied to -current,
> > but I'd be surprised if they were in 2.2.7-stable.
> 
> I've been combing through the mailing lists and have found related
> material, but I'm not sure if what I am finding is correct.  I found
> a patch called 'DIFF.LEASE', are these the LEASE patches you refer
> to?  And if they are, I did not notice any version information.  Can
> I apply these to a 2.2.6 system, or should I upgrade the system to
> 2.2.7 first?

These are expected to be applied to a 3.0 system prior to February of
1998.  After that, the patch for the execution class loader is no
longer applicable.

Basically, the LEASE code is "opportunity locking".  For this to work,
it has to be obeyed on every read or write.  In the -current (and the
2.2.x) code, it's not obeyed.  This means that there may be stale VM
objects.

The idea is that you must obtain a read lease for every read and a
write lease for every write.  If you don't do this, NFSv3, which
depends on lease notifications, will be flakey, and you should use
NFSv2 instead.

> Also, I can't find the vnode locking patches at all, save a
> reference to something done in '95.

See David Greenman's patches for mmap'ed NFS files.  This *is* in the
-hackers archive for 2.2.6.  Basically, the vnode is not locked before
a reference that requires locking to be reliable.

> Are these problems fixed in the 3.0 branch?

I don't know.  You will have to ask David Greenman.  I'm pretty sure
my LEASE patches were considered low priority in light of other more
serious problems in the code.

> Thanks for your assistance so far, and for any additional help you
> can give.

No problem...  I'd like to see FreeBSD be as reliable as possible.
It doesn't suit my purpose for participation, otherwise.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message
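A minimal sketch of the cleanup job mentioned in the message above,
for illustration only: it scans a single directory for ".nfs*" entries
older than an assumed one-week cutoff and unlinks them.  The directory
path and the cutoff are placeholders, not recommendations; a real site
would choose its own threshold and walk the whole exported tree,
typically with a one-line find(1) entry in the crontab rather than a
dedicated program.

/*
 * Sketch only: remove stale .nfs* files from one directory.
 */
#include <sys/stat.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define STALE_SECONDS	(7 * 24 * 60 * 60)	/* assumed one-week cutoff */

int
main(int argc, char **argv)
{
	const char *dir = (argc > 1) ? argv[1] : "/export/home";  /* placeholder */
	char path[1024];
	struct dirent *de;
	struct stat sb;
	time_t now = time(NULL);
	DIR *dp;

	if ((dp = opendir(dir)) == NULL) {
		perror(dir);
		exit(1);
	}
	while ((de = readdir(dp)) != NULL) {
		/* Only look at the client-generated ".nfs..." names. */
		if (strncmp(de->d_name, ".nfs", 4) != 0)
			continue;
		snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
		if (stat(path, &sb) < 0 || !S_ISREG(sb.st_mode))
			continue;
		/* Old enough that no client should still have it open. */
		if (now - sb.st_mtime > STALE_SECONDS) {
			printf("removing stale %s\n", path);
			if (unlink(path) < 0)
				perror(path);
		}
	}
	closedir(dp);
	return (0);
}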