From owner-freebsd-questions@FreeBSD.ORG Sat Mar 20 02:41:03 2010
Date: Fri, 19 Mar 2010 22:28:24 -0400
From: Steve Polyack <korvus@comcast.net>
To: Rick Macklem
Cc: freebsd-fs@freebsd.org, bseklecki@noc.cfi.pgh.pa.us, User Questions,
 John Baldwin
Subject: Re: FreeBSD NFS client goes into infinite retry loop
Message-ID: <4BA432C8.4040707@comcast.net>

On 3/19/2010 9:32 PM, Rick Macklem wrote:
>
> On Fri, 19 Mar 2010, Steve Polyack wrote:
>
>> To anyone who is interested: I did some poking around with DTrace,
>> which led me to the nfsiod client code.
>> In src/sys/nfsclient/nfs_nfsiod.c:
>>
>>         } else {
>>                 if (bp->b_iocmd == BIO_READ)
>>                         (void) nfs_doio(bp->b_vp, bp, bp->b_rcred, NULL);
>>                 else
>>                         (void) nfs_doio(bp->b_vp, bp, bp->b_wcred, NULL);
>>         }
>>
> If you look at nfs_doio(), it decides whether or not to mark the buffer
> invalid, based on the return value it gets. Some errors (EINTR, ETIMEDOUT,
> EIO) are not considered fatal, but the others are. (When the async I/O
> daemons call nfs_doio(), they are threads that couldn't care less if
> the underlying I/O op succeeded. The outcome of the I/O operation
> determines what nfs_doio() does with the buffer cache block.)

I was looking at this and noticed the above after my last post.
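As I read it, the decision boils down to something like the following.
This is only a sketch of the logic as described above, not the actual
nfs_doio() code (the real function's buffer-cache handling is more
involved), and the helper name is made up for illustration:

    #include <errno.h>

    /* Possible dispositions for the buffer cache block after a failed write. */
    enum buf_disposition {
            BUF_REDIRTY,            /* keep the dirty data, retry the write later */
            BUF_INVALIDATE          /* throw the block away and record the error  */
    };

    /*
     * Hypothetical helper, not the FreeBSD source: 'error' is assumed to be
     * the non-zero errno that nfs_doio() got back for the write RPC.
     */
    static enum buf_disposition
    async_write_disposition(int error)
    {
            if (error == EINTR || error == ETIMEDOUT || error == EIO)
                    return (BUF_REDIRTY);   /* not considered fatal; retried */
            return (BUF_INVALIDATE);        /* fatal; the write is given up on */
    }

So an EIO coming back from the server keeps landing in the "retry" bucket,
which matches the behavior I describe below.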
>> The result is that my problematic repeatable circumstance begins
>> logging "nfssvc_iod: iod 0 nfs_doio returned errno: 5" (corresponding
>> to NFSERR_INVAL?) for each repetition of the failed write. The only
>> things triggering this are my failed writes. I can also see the
>> nfsiod0 process waking up each iteration.
>>
> Nope, errno 5 is EIO and that's where the problem is. I don't know why
> the server is returning EIO after the file has been deleted on the
> server (I assume you did that when running your little shell script?).

Yes, while running the simple shell script I simply deleted the file on
the NFS server itself.

>> Do we need some kind of "retry x times then abort" logic within
>> nfsiod_iod(), or does this belong in the subsequent functions, such
>> as nfs_doio()? I think it's best to avoid these sorts of infinite
>> loops, which have the potential to take out the system or overload the
>> network due to dumb decisions made by unprivileged users.
>>
> Nope, people don't like data not getting written back to a server when
> it is slow or temporarily network partitioned. The only thing that should
> stop a client from retrying a write back to the server is a fatal error
> from the server that says "this won't ever succeed".
>
> I think we need to figure out whether the server is actually sending the
> EIO (NFS3ERR_IO in Wireshark), or whether it is sending NFS3ERR_STALE and
> the client is somehow munging that into EIO, causing the confusion.

This makes sense. According to Wireshark, the server is indeed
transmitting "Status: NFS3ERR_IO (5)". Perhaps this should be STALE
instead; it sounds more correct than marking it a general I/O error.
Also, the NFS server is serving its share off of a ZFS filesystem, if
that makes any difference. I suppose ZFS could be talking to the NFS
server threads in some mismatched language, but I doubt it.

Thanks for the informative response,
Steve
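P.S. For reference, the two statuses in question map onto local errno
values like this. The numbers are the NFSv3 status codes from RFC 1813;
the helper is purely illustrative and is not taken from the FreeBSD client:

    #include <errno.h>

    /* NFSv3 status values from RFC 1813; the numbers match the local errnos. */
    #define NFS3ERR_IO      5       /* hard I/O error on the server -> EIO    */
    #define NFS3ERR_STALE   70      /* invalid (stale) file handle  -> ESTALE */

    /*
     * Hypothetical mapping helper: it only shows why an NFS3ERR_IO reply
     * surfaces as the retryable EIO discussed above, while NFS3ERR_STALE
     * would surface as ESTALE and stop the retries.
     */
    static int
    nfs3_status_to_errno(int status)
    {
            switch (status) {
            case NFS3ERR_IO:
                    return (EIO);
            case NFS3ERR_STALE:
                    return (ESTALE);
            default:
                    return (status);
            }
    }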