Date: Fri, 21 May 2010 08:21:14 -0700
From: Jeremy Chadwick
To: Mark Morley
Cc: freebsd-stable@freebsd.org
Subject: Re: NFS trouble on 7.3-STABLE i386
Message-ID: <20100521152114.GA52102@icarus.home.lan>
In-Reply-To: <20100521145554.E5A3D1065670@hub.freebsd.org>

On Fri, May 21, 2010 at 07:45:47AM -0700, Mark Morley wrote:
> Having an issue with a file server here (7.3-STABLE i386)
>
> The nfsd processes are hanging.  Client access to the nfs shares stops
> working and the nfsd processes on the server cannot be killed by any
> means.  There are no errors showing up anywhere on the server.  The
> network connection to the server seems fine (ie: anything other than
> nfs traffic seems ok).  Rebooting the server fixes the problem for a
> while, but it doesn't reboot easily.  It times out on terminating the
> nfsd processes.  When it finally does reboot the file system isn't
> marked clean, resulting in a long wait for fsck (although it doesn't
> find any problems, it's a multi-terabyte share and it takes a while).

I can't explain the dirty filesystem problem, especially if the server
does reboot/shut down properly.

> This morning it did it again.  This time I tried manually killing nfsd
> but nothing I did would make them die.  No errors.
>
> ...
>
> Any thoughts?

1) Are you forcing TCP or UDP NFS, or just using the default?

2) Is RPC still working?  Try running rpcinfo on both the client and
   server.

3) Using rpcinfo and netstat, figure out what TCP or UDP port the client
   is communicating with on the server, then use tcpdump to sniff
   traffic on both the client and server specific to those port numbers
   and see if there's any network I/O happening.

4) On the server, run ktrace -t + -p {nfsd-pid} to see if anything is
   going on.  (I'm not sure which of the two nfsd processes, master or
   server, is the one to trace, though.)

Rick Macklem probably has some better ideas than these though.

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
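
Steps 2 through 4 above could be scripted roughly as follows. This is only a sketch, not something from the thread: the function names nfs_port and print_next_steps are made up for illustration, the interface name passed in is a placeholder, and it assumes rpcinfo -p prints its usual program/vers/proto/port/service columns. The second helper only prints the tcpdump, ktrace and kdump commands, so they can be reviewed before being run as root.

```sh
#!/bin/sh
# Sketch only: helper names and arguments are hypothetical placeholders.

# Read `rpcinfo -p` output on stdin and print the port(s) on which the
# "nfs" service is registered for the given protocol (tcp or udp).
nfs_port() {
    proto="$1"
    awk -v proto="$proto" '$3 == proto && $5 == "nfs" { print $4 }' | sort -u
}

# Print (do not run) the capture and trace commands for a given
# port, nfsd PID, and network interface.
print_next_steps() {
    port="$1"
    pid="$2"
    iface="$3"
    # Watch for any NFS traffic on the port found above.
    echo "tcpdump -n -i $iface port $port"
    # Attach to the nfsd process with the default trace points.
    echo "ktrace -t + -p $pid"
    # ktrace writes to ktrace.out by default; kdump decodes it.
    echo "kdump -f ktrace.out"
}
```

For example, rpcinfo -p {server} | nfs_port tcp should print the registered TCP port, which can then be passed to print_next_steps along with an nfsd PID and the interface name.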