Date:      Sun, 17 Mar 2002 15:36:56 -0600
From:      Dan Nelson <dnelson@allantgroup.com>
To:        John De Boskey <jwd@FreeBSD.ORG>
Cc:        Dag-Erling Smorgrav <des@ofug.org>, Arch List <freebsd-arch@FreeBSD.ORG>
Subject:   Re: ftpd ESTALE recovery patch
Message-ID:  <20020317213656.GC32545@dan.emsphone.com>
In-Reply-To: <20020317153656.A9003@bsdwins.com>
References:  <20020317084153.A3942@FreeBSD.org> <xzpadt7rpiu.fsf@flood.ping.uio.no> <20020317153656.A9003@bsdwins.com>

In the last episode (Mar 17), John De Boskey said:
> ----- Dag-Erling Smorgrav's Original Message -----
> > John De Boskey <jwd@FreeBSD.org> writes:
> > >    In a busy cluster, a generated file being handed out by ftp is
> > > failing due to an ESTALE condition. The following patch fixes the
> > > problem. Failure to open the file is also logged when -l is
> > > specified twice (see ftpd(8)).
> > 
> > I don't see the point of this.  The problem you are experiencing is
> > probably caused by invalid assumptions in your setup, though I
> > can't comment further without more details about what, exactly, you
> > are trying to do.
> 
> Here's a timeline:
> 
> T(0) - On machine A - create new file in /tmp (/tmp/file)
> T(1) - On machine A - cp newfile nfsserver:/path/file.new
> T(2) - On machine A - mv nfsserver:/path/file.new nfsserver:/path/file
> T(3) - On machine B - ftp connection received
>                       get nfsserver:/path/file
>                       (get fails randomly without patch)
> 
> where Time(3) is guaranteed to be greater than Time(4), though
> the delta between them can be approaching (but not equal to)
> zero.

You mean T(2) and T(3), right?
 
> Basically, we have work nodes 1 through 28, using 2 netapp
> fileservers for data storage. As we continue to increase the
> throughput capabilities of the system, the ESTALE return happens
> more consistently. Ftpd is not the 1st tool we've had to fix.

Sounds like we've got a bug in either the NFS client or the server,
then.  When B opens the file, it is already in its final position on
the server (at least it should be - is 'mv' a synchronous operation
when using NFSv3?), so there should be no stale NFS info.

What happens if you sysctl vfs.nfs.access_cache_timeout=0 on the client?
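
For reference, the kind of recovery under discussion can be sketched
roughly as below.  This is only a guess at the general technique
(reopen the file when open() fails with ESTALE), not John's actual
patch; the helper name open_retry_estale and the retry limit are made
up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from ftpd: call open() up to
 * max_tries times, retrying only when it fails with ESTALE.
 * Any other result (success or a different error) is returned
 * to the caller immediately.
 */
static int
open_retry_estale(const char *path, int flags, int max_tries)
{
	int fd, tries;

	assert(max_tries > 0);
	for (tries = 0; tries < max_tries; tries++) {
		fd = open(path, flags);
		if (fd >= 0 || errno != ESTALE)
			return (fd);
		/*
		 * Stale handle: loop and open again, in the hope
		 * that the client redoes the name lookup.
		 */
	}
	return (-1);	/* every attempt failed with ESTALE */
}
```

A caller would use open_retry_estale(name, O_RDONLY, 3) in place of a
plain open().  Whether a second open() really gets a fresh handle
depends on the client's caching, which is why the sysctl question
above matters.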

-- 
	Dan Nelson
	dnelson@allantgroup.com
