From owner-freebsd-questions  Sat Sep  5 23:39:51 1998
Return-Path:
Received: (from majordom@localhost)
	by hub.freebsd.org (8.8.8/8.8.8) id XAA07972
	for freebsd-questions-outgoing; Sat, 5 Sep 1998 23:39:51 -0700 (PDT)
	(envelope-from owner-freebsd-questions@FreeBSD.ORG)
Received: from smtp04.primenet.com (smtp04.primenet.com [206.165.6.134])
	by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id XAA07967;
	Sat, 5 Sep 1998 23:39:48 -0700 (PDT)
	(envelope-from tlambert@usr01.primenet.com)
Received: (from daemon@localhost)
	by smtp04.primenet.com (8.8.8/8.8.8) id XAA10251;
	Sat, 5 Sep 1998 23:39:47 -0700 (MST)
Received: from usr01.primenet.com(206.165.6.201) via SMTP
	by smtp04.primenet.com, id smtpd010243; Sat Sep  5 23:39:44 1998
Received: (from tlambert@localhost)
	by usr01.primenet.com (8.8.5/8.8.5) id XAA13097;
	Sat, 5 Sep 1998 23:39:35 -0700 (MST)
From: Terry Lambert
Message-Id: <199809060639.XAA13097@usr01.primenet.com>
Subject: Re: .nfs files, what causes them and why do they hang around?
To: jay@oneway.com (Jay)
Date: Sun, 6 Sep 1998 06:39:35 +0000 (GMT)
Cc: tlambert@primenet.com, mike@smith.net.au, questions@FreeBSD.ORG,
	hackers@FreeBSD.ORG
In-Reply-To: from "Jay" at Sep 5, 98 01:03:31 pm
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-questions@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> > You are supposed to never crash.
> 
> So they _shouldn't_ exist unless the server (or client) crashes?

No.  They *should* exist as a means of keeping state.

Basically, if you delete something that is in use by you ("you" being
an NFS client), you rename it instead of deleting it.  This allows you
to continue to access the object.

It helps here to know what an "open file" is.

On a local FS, an open file is an entry in a per-process open file
table that points to a vnode in a local FS.

On a remote FS, an open file is an entry in a per-process open file
table that points to an nfsnode that proxies a vnode from a remote FS.

Consider the case where you open, then unlink a file.

On a local FS, the vnode reference count is the count of directory
references plus the count of open file references.

On a remote FS, the count is *only* the number of directory references.

Now say I open a file on an NFS server.  The open references are not
incremented (that would be state).  Each time I access the file, I
pass in the NFS "cookie" that tells me the device and the inode on the
device to get a vnode, and thereby access the file.

Now say I delete the file on the client.  If I really deleted it on
the server while it was open, I would no longer be able to access the
file contents.  This is because, when the reference count goes to
zero, the vnode is freed.

So as a client with the file open, instead of really deleting it, I
mark it for deferred delete, and then rename it, such that someone
else is unlikely to open it and thus hold a local reference which
doesn't result in a remote reference really being there (being as
it's stateless).  So I trade an in-core vnode reference for a rename
plus an nfsnode reference with a deferred delete.

Now say the client dies, and the deferred delete state, which existed
as a flag on the nfsnode, was lost.  The .nfs file is now there with
no deferred delete referring to it.  As a result, it is left hanging
around.  There are other (obvious) circumstances under which it can
be left hanging around, as well.

In any case, someone needs to clean up after the lost implied state.
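To make the difference between directory references and open file
references concrete, here is a small user-level sketch (not from the
original discussion; the default path is a placeholder you would point
at your own mount).  Run it against a local FS and the name simply
vanishes while the data stays readable through the open descriptor;
run it against an NFS mount and the client's deferred delete shows up
as a ".nfsXXXX" entry in the same directory until the last close.

/*
 * Sketch only: open a file, unlink it, keep using the descriptor.
 * While it sleeps, "ls -a" the directory from another shell to see
 * whether a .nfsXXXX name appears (NFS client) or not (local FS).
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
	const char *path = (argc > 1) ? argv[1] : "/mnt/nfs/scratch/demo.tmp";
	const char msg[] = "still here\n";
	char buf[64];
	ssize_t n;
	int fd;

	if ((fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644)) < 0) {
		perror("open");
		exit(1);
	}
	if (write(fd, msg, sizeof(msg) - 1) != (ssize_t)(sizeof(msg) - 1)) {
		perror("write");
		exit(1);
	}

	/*
	 * Remove the name.  On a local FS this only drops the directory
	 * reference; the vnode survives because the open file table still
	 * holds one.  On NFS there is no server-side open count, so the
	 * client renames the file to a ".nfsXXXX" name and defers the
	 * real remove until the last close.
	 */
	if (unlink(path) < 0) {
		perror("unlink");
		exit(1);
	}

	printf("name removed; sleeping 30s -- list the directory now\n");
	fflush(stdout);
	sleep(30);

	/* The data is still reachable through the open descriptor. */
	if (lseek(fd, 0, SEEK_SET) < 0 || (n = read(fd, buf, sizeof(buf))) < 0) {
		perror("read back");
		exit(1);
	}
	printf("read back %ld bytes: %.*s", (long)n, (int)n, buf);

	close(fd);	/* last close; the deferred delete can now happen */
	return (0);
}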
This is generally a cron job that removes .nfs files after they are
"too old".  The definition of "too old" is site dependent, and based
on the maximum expected usage of deleted-but-open files, generally
multiplied by 2.  (A sketch of such a job is appended at the end of
this message.)

Does this make sense now?

> > Apply my LEASE patches, and apply David Greenman's (or was it PHK's)
> > NFS vnode locking patches, both posted to -current and -hackers,
> > and you should see a much more stable system.
> > I don't know if the not-my-patches have been applied to -current,
> > but I'd be surprised if they were in 2.2.7-stable.
> 
> I've been combing through the mailing lists and have found related
> material, but I'm not sure if what I am finding is correct.  I found
> a patch called 'DIFF.LEASE', are these the LEASE patches you refer
> to?  And if they are, I did not notice any version information.  Can
> I apply these to a 2.2.6 system, or should I upgrade the system to
> 2.2.7 first?

These are expected to be applied to a 3.0 system prior to February of
1998.  After that, the patch for the execution class loader is no
longer applicable.

Basically, the LEASE code is "opportunity locking".  For this to work,
it has to be obeyed on every read or write.  In the -current (and the
2.2.x) code, it's not obeyed.  This means that there may be stale VM
objects.

The idea is that you must obtain a read lease for every read and a
write lease for every write.  If you don't do this, NFSv3, which
depends on lease notifications, will be flakey, and you should use
NFSv2 instead.

> Also, I can't find the vnode locking patches at all, save a
> reference to something done in '95.

See David Greenman's patches for mmap'ed NFS files.  This *is* in the
-hackers archive for 2.2.6.  Basically, the vnode is not locked before
a reference that requires locking to be reliable.

> Are these problems fixed in the 3.0 branch?

I don't know.  You will have to ask David Greenman.  I'm pretty sure
my LEASE patches were considered low priority in light of other more
serious problems in the code.

> Thanks for your assistance so far, and for any additional help you
> can give.

No problem...  I'd like to see FreeBSD be as reliable as possible.
It doesn't suit my purpose for participation, otherwise.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message
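A minimal sketch of the cleanup job mentioned in the message above,
for illustration only: it scans a single directory for ".nfs*" entries
older than an assumed one-week cutoff and unlinks them.  The directory
path and the cutoff are placeholders, not recommendations; a real site
would choose its own threshold and walk the whole exported tree,
typically with a one-line find(1) entry in the crontab rather than a
dedicated program.

/*
 * Sketch only: remove stale .nfs* files from one directory.
 */
#include <sys/stat.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define STALE_SECONDS	(7 * 24 * 60 * 60)	/* assumed one-week cutoff */

int
main(int argc, char **argv)
{
	const char *dir = (argc > 1) ? argv[1] : "/export/home";  /* placeholder */
	char path[1024];
	struct dirent *de;
	struct stat sb;
	time_t now = time(NULL);
	DIR *dp;

	if ((dp = opendir(dir)) == NULL) {
		perror(dir);
		exit(1);
	}
	while ((de = readdir(dp)) != NULL) {
		/* Only look at the client-generated ".nfs..." names. */
		if (strncmp(de->d_name, ".nfs", 4) != 0)
			continue;
		snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
		if (stat(path, &sb) < 0 || !S_ISREG(sb.st_mode))
			continue;
		/* Old enough that no client should still have it open. */
		if (now - sb.st_mtime > STALE_SECONDS) {
			printf("removing stale %s\n", path);
			if (unlink(path) < 0)
				perror(path);
		}
	}
	closedir(dp);
	return (0);
}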