Date: Mon, 18 Mar 2002 11:03:34 -0500 (EST)
From: Robert Watson
To: John De Boskey
Cc: Arch List
Subject: Re: ftpd ESTALE recovery patch
In-Reply-To: <20020317084153.A3942@FreeBSD.org>

On Sun, 17 Mar 2002, John De Boskey wrote:

> In a busy cluster, a generated file being handed out by ftp is
> failing due to an ESTALE condition. The following patch fixes the
> problem. Failure to open the file is also logged when -l is specified
> twice (see ftpd(8)).

Generally speaking, ESTALE means the file you're trying to access no
longer exists (stale file handle). This occurs on NFS but not on local
file systems: a local file system keeps a removed file alive until the
last close (last-close-removes semantics), whereas on NFS removing the
file revokes access for anyone still holding it (remove-revokes-access
semantics). This is a fatal error condition, not a transient one: by
definition, the file you get on a retry will be a different file than
the one you started accessing before. It may be that the higher level
stream abstractions experience ESTALE more often during open than the
lower level fd abstractions, as they may generate more NFS RPCs during
the open phase, increasing the window for file deletion during open.
I would have thought that for large files you'd be much more likely to
get ESTALE later in the transfer, where it would mean you couldn't
consistently return the file contents through a retry. I'm a little
surprised you catch so many in open (a very small file?).

Reasonable work-arounds include:

(1) Copy the file to the client (guarantees that ESTALE can't happen
    during the transfer itself).

(2) Preserve the file for a window on the server by renaming it with
    rename() within the same directory/file system, so that ESTALE is
    less likely if the transfer completes in an expected amount of
    time. NFS file handles persist across server renames, assuming
    they are on the same file system. This accepts the reality of
    ESTALE but reduces the cost of implementing the feature by
    constraining the change to the server.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Project
robert@fledge.watson.org      NAI Labs, Safeport Network Services
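
For illustration only, work-around (2) above might look roughly like
the following on the machine that generates the file. This is a
minimal sketch, not part of ftpd or of the patch under discussion:
publish_with_grace() and reap_retired() are hypothetical names, and it
assumes all three paths live on the same file system, since rename(2)
cannot cross file systems.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: publish a freshly generated file at `live`
 * while keeping the previous version reachable under `retired`, so
 * that NFS clients that already hold a handle to the old file can
 * finish with it. The old file is renamed, never unlinked, here.
 */
int
publish_with_grace(const char *fresh, const char *live, const char *retired)
{
	/* Park the old version; having no old version is not an error. */
	if (rename(live, retired) == -1 && errno != ENOENT)
		return (-1);
	/* Move the new version into place (brief gap with no `live`). */
	if (rename(fresh, live) == -1)
		return (-1);
	return (0);
}

/* Call once the grace window has passed to drop the parked version. */
int
reap_retired(const char *retired)
{
	if (unlink(retired) == -1 && errno != ENOENT)
		return (-1);
	return (0);
}
```

The length of the grace window is a policy choice the sketch leaves
to the caller; it only needs to exceed the expected transfer time.
A transfer that outlives the window can still see ESTALE, which is
why this reduces the likelihood rather than eliminating it.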