From: Robert Watson
Date: Thu, 28 Aug 2003 08:54:07 -0400 (EDT)
To: Pawel Worach
cc: freebsd-current@freebsd.org
Subject: Re: nfs tranfers hang in state getblck or nfsread
In-Reply-To: <3F4CD409.5080703@telia.com>

On Wed, 27 Aug 2003, Pawel Worach wrote:

> I get the errors every time the nfs mounts are not unmounted "cleanly",
> if the client (which is a laptop and i often forget to plug in the power
> so the battery dies) dies and the server is rebooted the client boots
> fine, i.e. no "nfs server not responding errors". So it looks like there
> is some kind of state mismatch in the nfs server code.

Ok, so let me see if I have the sequence of events straight:

(1) Boot a 4.8-RELEASE/STABLE NFS server
(2) Boot a 5.1-RELEASE/CURRENT NFS client
(3) Mount a file system using TCP NFSv3
(4) Reboot the client system and remount
(5) Thrash the file system a bit with large reads/writes, and it hangs

Is this correct? I'd like to work out the minimum sequence of events
necessary to cause the problem. Is (4) necessary to reproduce the hang, or
can you cause it without (4) if you wait long enough? You also mention a
server reboot here, so I want to make sure I'm not confused about the
steps to hit the problem.

Also, could you try enabling the all.log entry in syslogd, and looking for
messages that read something like "nfs send error" in it after this has
happened?

Once the hang is occurring on the client, can you drop into DDB and do a
ps, and in particular, paste into an e-mail any lines about nfsiod
threads, and any threads that are blocked in nfs? Likewise, on the server,
could you drop into DDB and do a ps, and paste in the state of any nfsd
threads?

> rc.conf parameters look like this:
>
> server:
> rpcbind_enable="YES"
> nfs_server_enable="YES"
> mountd_enable="YES"
> nfs_reserved_port_only="YES"
> rpc_lockd_enable="YES"
> rpc_statd_enable="YES"
>
> client:
> rpcbind_enable="YES"
> nfs_client_enable="YES"
> rpc_lockd_enable="YES"
> rpc_statd_enable="YES"

For kicks, try disabling rpc.lockd on all sides, as well as rpc.statd. I
don't think they're involved here, but it's worth disabling them to be
sure.

Thanks,

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Network Associates Laboratories
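
For the all.log check requested above, a minimal sketch in Python of one way
to pull out the relevant lines. It assumes the log ends up at
/var/log/all.log (the path used by the commented-out *.* entry in the stock
/etc/syslog.conf; adjust if yours differs) and that the kernel message
contains the phrase "nfs send error"; the exact wording may vary, which is
why the match is case-insensitive and the phrase is a parameter. The function
names are illustrative only.

```python
# Sketch: list syslog lines that mention an NFS send error.
# Assumptions (not from the original message): the log path and the
# helper names below are hypothetical; the search phrase is the one
# quoted in the message and may not match the kernel text exactly.

DEFAULT_LOG = "/var/log/all.log"   # assumed location of all.log
NEEDLE = "nfs send error"          # phrase quoted in the message


def find_nfs_errors(lines, needle=NEEDLE):
    """Return (line_number, text) pairs for lines containing needle.

    Matching is case-insensitive; line numbers start at 1.
    """
    needle = needle.lower()
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if needle in line.lower()
    ]


def scan_file(path=DEFAULT_LOG, needle=NEEDLE):
    """Open path and return the matching lines from it."""
    # errors="replace" keeps the scan going if the log has odd bytes.
    with open(path, errors="replace") as handle:
        return find_nfs_errors(handle, needle)
```

Running scan_file() as root after a hang and pasting the returned lines into
a reply would cover the log part of the request; the DDB ps output still has
to be collected by hand on the console.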