Date: Sun, 17 Mar 2002 15:36:56 -0600
From: Dan Nelson
To: John De Boskey
Cc: Dag-Erling Smorgrav, Arch List
Subject: Re: ftpd ESTALE recovery patch
Message-ID: <20020317213656.GC32545@dan.emsphone.com>
References: <20020317084153.A3942@FreeBSD.org> <20020317153656.A9003@bsdwins.com>
In-Reply-To: <20020317153656.A9003@bsdwins.com>

In the last episode (Mar 17), John De Boskey said:
> ----- Dag-Erling Smorgrav's Original Message -----
> > John De Boskey writes:
> > > In a busy cluster, a generated file being handed out by ftp is
> > > failing due to an ESTALE condition. The following patch fixes the
> > > problem. Failure to open the file is also logged when -l is
> > > specified twice (see ftpd(8)).
> >
> > I don't see the point of this. The problem you are experiencing is
> > probably caused by invalid assumptions in your setup, though I
> > can't comment further without more details about what, exactly, you
> > are trying to do.
>
> Here's a timeline:
>
> T(0) - On machine A - create new file in /tmp (/tmp/file)
> T(1) - On machine A - cp newfile nfsserver:/path/file.new
> T(2) - On machine A - mv nfsserver:/path/file.new nfsserver:/path/file
> T(3) - On machine B - ftp connection received
>                       get nfsserver:/path/file
>                       (get fails randomly without patch)
>
> where Time(3) is guaranteed to be greater than Time(4), though
> the delta between them can be approaching (but not equal to)
> zero.

You mean T(2) and T(3), right?

> Basically, we have work nodes 1 through 28, using 2 netapp
> fileservers for data storage. As we continue to increase the
> throughput capabilities of the system, the ESTALE return happens
> more consistently. Ftpd is not the 1st tool we've had to fix.

Sounds like we've got a bug in either the NFS client or the server,
then. When B opens the file, it is already in its final position on
the server (at least it should be - is 'mv' a synchronous operation
when using NFSv3?), so there should be no stale NFS info. What happens
if you sysctl vfs.nfs.access_cache_timeout=0 on the client?

-- 
Dan Nelson
dnelson@allantgroup.com
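
For readers following the thread: the patch itself is not quoted here, so the following is only a minimal sketch of the general "retry the open when it fails with ESTALE" idea under discussion, not the actual ftpd change. It is written in Python for brevity and assumes a POSIX system. The function name `open_with_estale_retry` and its `retries`, `delay`, and `opener` parameters are hypothetical, chosen for illustration.

```python
import errno
import os
import time


def open_with_estale_retry(path, flags=os.O_RDONLY, retries=3, delay=0.05,
                           opener=os.open):
    """Open path, retrying a bounded number of times on ESTALE.

    Hypothetical helper: `retries`, `delay`, and `opener` are illustrative
    parameters, not taken from the ftpd patch discussed in the thread.
    """
    for attempt in range(retries + 1):
        try:
            return opener(path, flags)
        except OSError as exc:
            # Only ESTALE is treated as transient; any other error, or an
            # ESTALE on the final attempt, is passed on to the caller.
            if exc.errno != errno.ESTALE or attempt == retries:
                raise
            time.sleep(delay)
```

A caller would use it as `fd = open_with_estale_retry("/path/file")` and close the descriptor with `os.close(fd)`. A retry of this kind only works around the symptom on the client; it does not answer the question raised above of whether the NFS client or server is at fault.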