From owner-freebsd-current@FreeBSD.ORG  Thu Aug 28 10:20:07 2003
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 2328616A4BF; Thu, 28 Aug 2003 10:20:07 -0700 (PDT)
Received: from h190n1fls34o809.telia.com (h190n1fls34o809.telia.com
	[213.67.96.190])	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 0A0E343FCB; Thu, 28 Aug 2003 10:20:05 -0700 (PDT)
	(envelope-from pawel.worach@telia.com)
Received: from telia.com (corona.sajd.net [192.168.1.20])
	h7SHK2f20302;	Thu, 28 Aug 2003 19:20:02 +0200 (MEST)
Message-ID: <3F4E39BF.10001@telia.com>
Date: Thu, 28 Aug 2003 19:19:59 +0200
From: Pawel Worach <pawel.worach@telia.com>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.5b) Gecko/20030825
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Robert Watson <rwatson@freebsd.org>
References: <Pine.NEB.3.96L.1030828084515.34202C-100000@fledge.watson.org>
In-Reply-To: <Pine.NEB.3.96L.1030828084515.34202C-100000@fledge.watson.org>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
cc: freebsd-current@freebsd.org
Subject: Re: nfs tranfers hang in state getblck or nfsread
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 28 Aug 2003 17:20:07 -0000

Robert Watson wrote:
> On Wed, 27 Aug 2003, Pawel Worach wrote:
> 
> Ok, so let me see if I have the sequence of events straight:
> 
> (1) Boot a 4.8-RELEASE/STABLE NFS server
> (2) Boot a 5.1-RELEASE/CURRENT NFS client
> (3) Mount a file system using TCP NFSv3
> (4) Reboot the client system, reboot, and remount
> (5) Thrash the file system a bit with large reads/writes, and it hangs

Not quite, more like this:
1) Boot the 5.1-CURRENT nfs server
2) Boot the 5.1-CURRENT diskless client (I'm using PXE/DHCP)
3) Log in and run find(1) for a while on every filesystem.
(e.g. find / ^C ; find /usr ^C ; find /export ^C and so on to
generate some getattr(), read() and c/o calls)
4) Shut down the client in a _non-clean_ way, pull the power
or enter DDB and 'reset'.
5) Boot the diskless client again.
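
Step 3 above could be scripted roughly like this. It is only a sketch:
walk_for and SECS are names made up for illustration, and the 10 second
default is a guess at what "for a while" means, not a measured value.

```sh
#!/bin/sh
# Sketch of step 3: walk each filesystem given on the command line for a
# bounded time, then stop find(1), mimicking "find / ^C".
# walk_for and SECS are hypothetical names, not part of any FreeBSD tool.

walk_for() {
    dir=$1
    secs=$2
    # Run find in the background so it can be stopped after $secs seconds.
    find "$dir" > /dev/null 2>&1 &
    pid=$!
    sleep "$secs"
    # Use the default SIGTERM: background jobs started from a
    # non-interactive sh ignore SIGINT, so a literal ^C would not stop it.
    kill "$pid" 2>/dev/null || true
    wait "$pid" 2>/dev/null || true
    return 0
}

# With no arguments this loop does nothing.
for fs in "$@"; do
    walk_for "$fs" "${SECS:-10}"
done
```

Saved under some name of your choosing, it would be run on the client as
"sh <script> / /usr /export" before pulling the power in step 4.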

Here are the messages I get while booting the client (step 5).
(darkstar is the server, corona is the client. The message about mounttab
appears at every boot and is not related to this problem.)
Mounting root from nfs:
NFS ROOT: 192.168.1.11:/export/root
start_init: trying /sbin/init
Interface fxp0 IP-Address 192.168.1.20 Broadcast 192.168.1.255
Loading configuration files.
Entropy harvesting: interrupts ethernet point_to_point
Starting file system checks:
nfs: can't update /var/db/mounttab for darkstar:/export/root
+++ mount_md of /var
nfs server darkstar:/usr: not responding
<insert about a 10 second delay here>
nfs server darkstar:/usr: is alive again
nfs server darkstar:/usr/home: not responding
<insert about a 20 second delay here>
nfs server darkstar:/usr/home: is alive again
<insert about a 20 second delay here>
[tcp] darkstar:/export: nfsd: RPCPROG_NFS: RPC: Remote system error - Operation timed out
<insert about an 80 second delay here>
nfs server darkstar:/export: not responding
<insert about a 40 second delay here>
nfs server darkstar:/export: is alive again

 From here on the boot continues normally and the system works fine.

I'm going to set different mount options for every filesystem now
and do this again, so maybe I can nail down what is causing this.
The only filesystem that doesn't have problems is /, and that is
also the only one using UDP.

Hope this is not as confusing as my previous mail :)

And whoever commented about the "magic" stuff, that was a cut-and-paste from the
'dumpfs <fs> | grep UFS' command.

	- Pawel