Date: Tue, 4 May 2010 10:34:02 -0400 (EDT)
From: Rick Macklem
To: Cheng-Lin Yang
Cc: freebsd-fs, lab
Subject: Re: Struggling on NFS problem
In-Reply-To: <1272960060.34062.yuwen@exodus.cs.ccu.edu.tw>
References: <1272960060.34062.yuwen@exodus.cs.ccu.edu.tw>

On Tue, 4 May 2010, Cheng-Lin Yang wrote:

> Dear all,
> Currently, we have a NFS server which runs FreeBSD8 with ZFS and few
> workstations as NFS client (2 * FreeBSD8 amd64 + 1 * FreeBSD7.2 i386 +
> 2 * Fedora + Debian). We spotted that NFS performs weirdly on FreeBSD
> clients, which will significantly slow down the system response. The
> only solution to it is to reboot the clients (Linux client runs
> smoothly). So we try to use "nfsstat -c" on FreeBSD client to dig into
> the problem and found strange result (http://pastebin.com/K71qpEDG) :
> csie0[~]# nfsstat -c
[stuff snipped]
>
> As you can see, the value of "BioW Hits" is a negative number.
> Shouldn't it be equal or larger than zero? We have totally no idea on
> this issue. Please kindly help us on investigating the problem. Any
> suggestion is extremely welcomed. Thank you.
>

I suspect that the negative value is just a wrap around (assuming
you're on a 32bit arch) and just means lots of hits. If that is the
case, it suggests a fairly heavy write load, which can be an issue for
servers using ZFS (as others have already posted about).

There are a number of patches for FreeBSD8.0 related to NFS (one
specifically w.r.t. the server using ZFS) at:
   http://people.freebsd.org/~rmacklem

If you are using FreeBSD8.0 for the server, it would be worth trying
these patches. (They are all independent, so any of them can be
applied, in any order.) If you are using a recent stable/8, then you
should already have the patches.

In particular, one of them fixes a case where FreeBSD clients get stuck
looping, trying to access a file after it has been deleted on the
server, because the server reported EIO instead of ESTALE for this case.
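
To illustrate the wrap around mentioned above, here is a minimal sketch
of how a large hit count turns negative when it is stored or printed as
a signed 32-bit integer. This is not the nfsstat source. The function
names and the sample count of 3,000,000,000 are assumptions made up for
the illustration, and the recovery step is only valid if the counter has
wrapped at most once.

```python
# Sketch only: models a signed 32-bit counter, not nfsstat itself.

def as_int32(count):
    """Return the value a signed 32-bit integer would show for `count`."""
    low = count & 0xFFFFFFFF          # keep the low 32 bits
    if low >= 0x80000000:             # top bit set means negative
        return low - 0x100000000
    return low


def as_uint32(displayed):
    """Reinterpret a displayed signed 32-bit value as unsigned.

    Only recovers the real count if it is below 2**32.
    """
    return displayed & 0xFFFFFFFF


if __name__ == "__main__":
    hits = 3_000_000_000              # hypothetical number of cache hits
    shown = as_int32(hits)
    print(shown)                      # negative once past 2**31 - 1
    print(as_uint32(shown))           # back to the hypothetical count
```

Anything up to 2**31 - 1 prints unchanged, so a negative figure by
itself only says the count went past that point.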

If the patches don't help, please try to collect more information from
both the slow clients and the server. "ps axl" on them all can be
useful. Also, you can use
   "tcpdump -s 0 -w <file> host <server>"
to capture traffic between the slow client and the server, which can
then be looked at via wireshark. (tcpdump doesn't decode NFS traffic
well, but a binary capture from tcpdump loads into wireshark ok, and
wireshark does understand NFS traffic.)

If you get to this point, you can email me the "<file>" as an
attachment and I can take a look at it. If you look at it yourself, one
scenario that is of interest is where the client just keeps retrying
the same NFS RPC.

Good luck with it and let us know how it goes, rick