From: Terry Lambert
Date: Thu, 19 Jun 2003 22:23:46 -0700
To: Andrey Alekseyev
Cc: freebsd-hackers@freebsd.org
Subject: Re: open() and ESTALE error
Message-ID: <3EF29A62.5E91D714@mindspring.com>
References: <200306190955.NAA00538@slt.oz>
List-Id: Technical Discussions relating to FreeBSD

Andrey Alekseyev wrote:
> I've been trying lately to develop a solution for the problem with
> open() that manifests itself in ESTALE error in the following situation:
>
> 1. NFS server: echo "1111" > file01
> 2. NFS client: cat file01
> 3. NFS server: echo "2222" > file02 && mv file02 file01
> 4. NFS client: cat file01 (either old file01 contents or ESTALE)
>
> My study shows that actually the problem appears to be in VOP_ACCESS()
> which is called from vn_open(). If nfs_access() decides to "go to the wire"
> in #4, it then uses a cached file handle which is indeed stale. Thus,
> open() eventually fails with ESTALE too (ESTALE comes from underlying
> nfs_request()).

The real problem here is that you know you did an operation on the
file which would break the name/nfsnode relationship, but did not
flush the cached name and nfsnode data. A more correct solution
would resync the nfsnode.

The main problem with your solution is that it doesn't work in the
case where you don't know the name of the remote file (in which case,
all you really have is a stale file handle, with no way to unstale
it). I think this is a corner case that's probably not really very
interesting to solve.

Now if you remembered the rename, and applied your knowledge of the
rename semantics to the problem, you could replace the handle in the
local nfsnode for the file. This would not be as expensive as
traversing all of the nfsnodes, since you could use the same hash
that's used to translate a fh to a vp to get the vp. This would fix
a lot more cases than the single failure you are fixing.

In general, though, you can't fix *any* of the cases without
introducing a vnode alias for an nfsnode that may have a local alias
already: there's no way to handle the hash collision in that case,
nor would you want to, since there's no way to deal with the
different vnodes that point to the different nfsnodes, and have
their own vm_object_t's. No matter how you look at it, you can't
replace the vnode address in the open file(s) that point to it, so
you have to ESTALE.

-- Terry
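
[Illustrative note, not part of the original message.] The idea of using the fh-to-vp hash to find a node and swap in a new handle, and the alias collision that defeats it, can be modelled in a few lines of userland C. This is only a sketch under loud assumptions: every name here (toy_fh, toy_node, toy_lookup, toy_rehandle, and so on) is hypothetical, the 8-byte handle and FNV-1a hash are arbitrary choices, and none of it is the FreeBSD NFS client code or its actual hash.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_FH_SIZE  8    /* hypothetical handle size, chosen for the toy */
#define TOY_NBUCKETS 16   /* hypothetical hash table size */

/* Opaque file handle, as the client sees it. */
struct toy_fh { unsigned char data[TOY_FH_SIZE]; };

/* Stand-in for an nfsnode; 'id' stands in for the vnode pointer. */
struct toy_node {
    struct toy_fh    fh;    /* handle the node is currently hashed under */
    struct toy_node *next;  /* hash chain link */
    int              id;
};

enum toy_status { TOY_OK, TOY_MISSING, TOY_ALIAS };

static struct toy_node *toy_buckets[TOY_NBUCKETS];

/* FNV-1a over the handle bytes (an arbitrary choice for this sketch). */
static unsigned toy_hash(const struct toy_fh *fh)
{
    unsigned h = 2166136261u;
    for (size_t i = 0; i < TOY_FH_SIZE; i++) {
        h ^= fh->data[i];
        h *= 16777619u;
    }
    return h % TOY_NBUCKETS;
}

/* fh -> node: walk one hash chain, never the whole node list. */
static struct toy_node *toy_lookup(const struct toy_fh *fh)
{
    for (struct toy_node *n = toy_buckets[toy_hash(fh)]; n != NULL; n = n->next)
        if (memcmp(&n->fh, fh, sizeof(*fh)) == 0)
            return n;
    return NULL;
}

static void toy_insert(struct toy_node *n)
{
    assert(n != NULL);
    unsigned b = toy_hash(&n->fh);
    n->next = toy_buckets[b];
    toy_buckets[b] = n;
}

static void toy_remove(struct toy_node *n)
{
    struct toy_node **pp = &toy_buckets[toy_hash(&n->fh)];
    while (*pp != NULL && *pp != n)
        pp = &(*pp)->next;
    if (*pp == n)
        *pp = n->next;
    n->next = NULL;
}

/*
 * Replace a node's stale handle with a fresh one and rehash it.
 * If another node is already hashed under the new handle, the two
 * cannot be merged in this model, so the caller has to give up
 * (the case where the message says you have to ESTALE).
 */
static enum toy_status toy_rehandle(const struct toy_fh *oldfh,
                                    const struct toy_fh *newfh)
{
    struct toy_node *n = toy_lookup(oldfh);
    if (n == NULL)
        return TOY_MISSING;
    struct toy_node *other = toy_lookup(newfh);
    if (other != NULL && other != n)
        return TOY_ALIAS;
    toy_remove(n);      /* unhash under the old handle */
    n->fh = *newfh;
    toy_insert(n);      /* rehash under the new handle */
    return TOY_OK;
}
```

A caller would toy_insert() a node when a handle is first seen, and call toy_rehandle() when it learns that a rename has given the file a new handle. A TOY_ALIAS result corresponds to the collision described in the last paragraph of the message, where a second node already owns the new handle.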