Date: Mon, 22 Mar 2010 12:44:04 -0400
From: Steve Polyack
To: John Baldwin
Cc: freebsd-fs@freebsd.org, Rick Macklem, User Questions, bseklecki@noc.cfi.pgh.pa.us
Subject: Re: FreeBSD NFS client goes into infinite retry loop
Message-ID: <4BA79E54.5030504@comcast.net>
In-Reply-To: <201003221200.41607.jhb@freebsd.org>

On 03/22/10 12:00, John Baldwin wrote:
> On Monday 22 March 2010 11:47:43 am Steve Polyack wrote:
>> On 03/22/10 10:52, Steve Polyack wrote:
>>> On 3/19/2010 11:27 PM, Rick Macklem wrote:
>>>> On Fri, 19 Mar 2010, Steve Polyack wrote:
>>>>
>>>> [good stuff snipped]
>>>>
>>>>> This makes sense.  According to wireshark, the server is indeed
>>>>> transmitting "Status: NFS3ERR_IO (5)".  Perhaps this should be STALE
>>>>> instead; it sounds more correct than marking it a general IO error.
>>>>> Also, the NFS server is serving its share off of a ZFS filesystem,
>>>>> if it makes any difference.  I suppose ZFS could be talking to the
>>>>> NFS server threads with some mismatched language, but I doubt it.
>>>>>
>>>> Ok, now I think we're making progress.  If VFS_FHTOVP() doesn't return
>>>> ESTALE when the file no longer exists, the NFS server returns whatever
>>>> error it has returned.
>>>>
>>>> So, either VFS_FHTOVP() succeeds after the file has been deleted, which
>>>> would be a problem that needs to be fixed within ZFS
>>>> OR
>>>> ZFS returns an error other than ESTALE when it doesn't exist.
>>>>
>>>> Try the following patch on the server (which just makes any error
>>>> returned by VFS_FHTOVP() into ESTALE) and see if that helps.
>>>>
>>>> --- nfsserver/nfs_srvsubs.c.sav	2010-03-19 22:06:43.000000000 -0400
>>>> +++ nfsserver/nfs_srvsubs.c	2010-03-19 22:07:22.000000000 -0400
>>>> @@ -1127,6 +1127,8 @@
>>>>  		}
>>>>  	}
>>>>  	error = VFS_FHTOVP(mp, &fhp->fh_fid, vpp);
>>>> +	if (error != 0)
>>>> +		error = ESTALE;
>>>>  	vfs_unbusy(mp);
>>>>  	if (error)
>>>>  		goto out;
>>>>
>>>> Please let me know if the patch helps, rick
>>>>
>>> The patch seems to fix the bad behavior.
>>> Running with the patch, I see the following output from my patch
>>> (return code of nfs_doio from within nfsiod):
>>> nfssvc_iod: iod 0 nfs_doio returned errno: 70
>>>
>>> Furthermore, when inspecting the transaction with Wireshark, after
>>> deleting the file on the NFS server it looks like there is only a
>>> single error.  This time it is a reply to a V3 Lookup call that
>>> contains a status of "NFS3ERR_NOENT (2)" coming from the NFS server.
>>> The client also does not repeatedly try to complete the failed request.
>>>
>>> Any suggestions on the next step here?  Based on what you said, it
>>> looks like ZFS is falsely reporting an IO error to VFS instead of
>>> ESTALE / NOENT.  I tried looking around zfs_fhtovp() and only saw
>>> returns of EINVAL, but I'm not even sure I'm looking in the right place.
>>>
>> Further on down the rabbit hole... here's the piece in zfs_fhtovp()
>> where it's kicking out EINVAL instead of ESTALE.  The following patch
>> corrects the behavior, but of course it also suggests further digging
>> within the zfs_zget() function to ensure that _it_ is returning the
>> correct thing, and whether it needs to be handled there or within
>> zfs_fhtovp().
>>
>> --- src-orig/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c	2010-03-22 11:41:21.000000000 -0400
>> +++ src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c	2010-03-22 16:25:21.000000000 -0400
>> @@ -1246,7 +1246,7 @@
>>  	dprintf("getting %llu [%u mask %llx]\n", object, fid_gen, gen_mask);
>>  	if (err = zfs_zget(zfsvfs, object, &zp)) {
>>  		ZFS_EXIT(zfsvfs);
>> -		return (err);
>> +		return (ESTALE);
>>  	}
>>  	zp_gen = zp->z_phys->zp_gen & gen_mask;
>>  	if (zp_gen == 0)
>>
> So the odd thing here is that ffs_fhtovp() doesn't return ESTALE if VFS_VGET()
> (which calls ffs_vget()) fails; it only returns ESTALE if the generation count
> doesn't match.
>
> It looks like it also returns ESTALE when the inode is invalid (< ROOTINO ||
> max inodes?)

- would an unlinked file in FFS referenced at a later time report an invalid
inode?

But back to your point: zfs_zget() seems to be failing and returning EINVAL
before zfs_fhtovp() even has a chance to set and check zp_gen.  I'm trying to
get some more details through the use of gratuitous dprintf()s, but they don't
seem to be making it to any logs or the console, even with vfs.zfs.debug=1
set.  Any pointers on how to get these dprintf() calls working?

Thanks again.
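
In case it helps with the next step, here is the narrower variant of the
zfs_fhtovp() change I would try instead of the blanket return (ESTALE): only
translate the "object is gone" style failures from zfs_zget() into ESTALE and
let real I/O errors through unchanged.  This is only a sketch against the
hunk quoted above; whether EINVAL (or ENOENT) really is the errno zfs_zget()
hands back for a file that was removed behind the client's back is exactly
the assumption that still needs to be confirmed.

	if (err = zfs_zget(zfsvfs, object, &zp)) {
		ZFS_EXIT(zfsvfs);
		/*
		 * Assumed: EINVAL/ENOENT here mean the object no longer
		 * exists, so report the file handle as stale; anything
		 * else (e.g. a real EIO) is passed through unchanged.
		 */
		if (err == EINVAL || err == ENOENT)
			err = ESTALE;
		return (err);
	}

That way a genuine disk error on the server would still show up as an I/O
error on the client instead of being silently reported as a stale handle.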
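
On the dprintf() question: since I am not sure yet what it takes to get the
ZFS dprintf() output enabled on this kernel, my fallback plan is a plain
printf() right at the failure point, which is always compiled in and shows up
on the console.  Just a temporary, hypothetical debugging hack, not something
to keep:

	if (err = zfs_zget(zfsvfs, object, &zp)) {
		/* Temporary debugging: report the raw errno from zfs_zget(). */
		printf("zfs_fhtovp: zfs_zget(obj %llu) failed, err %d\n",
		    (unsigned long long)object, err);
		ZFS_EXIT(zfsvfs);
		return (err);
	}

If that confirms the EINVAL really comes out of zfs_zget(), it at least
narrows down where the ESTALE translation needs to happen.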