From: Andriy Gapon
Date: Thu, 13 Dec 2012 12:36:54 +0200
To: olivier olivier
Cc: freebsd-fs@FreeBSD.org, freebsd-stable@FreeBSD.org
Subject: Re: NFS/ZFS hangs after upgrading from 9.0-RELEASE to -STABLE
Message-ID: <50C9AFC6.6080902@FreeBSD.org>
List-Id: Production branch of FreeBSD source code

I decided to share here the comment that I made in private, so that more people could potentially benefit from it.

on 03/12/2012 20:41 olivier olivier said the following:
> Hi all
> After upgrading from 9.0-RELEASE to 9.1-PRERELEASE #0 r243679 I'm having
> severe problems with NFS sharing of a ZFS volume. nfsd appears to hang at
> random times (between once every couple hours to once every two days) while
> accessing a ZFS volume, and the only way I have found of resolving the
> problem is to reboot.
> The server console is sometimes still responsive
> during the nfsd hang, and I can read and write files to the same ZFS volume
> while nfsd is hung. I am pasting below the output of procstat -kk on nfsd,
> and details of my pool (nfsstat on the server gets hung when the problem
> has started occurring, and does not produce any output). The pool is v28
> and was created from a bunch of volumes attached over Fibre Channel using
> the mpt driver. My system has a Supermicro board and 4 AMD Opteron 6274
> CPUs.
>
> I did not experience any nfsd hangs with 9.0-RELEASE (same machine,
> essentially same configuration, same usage pattern).
>
> I would greatly appreciate any help to resolve this problem!

I've looked at the provided data and I do not see anything that implicates ZFS.

My rules of thumb for ZFS hangs:
- if there are threads in zio_wait
- if you can confirm that they are indeed stuck there[*]
- if there are no threads in zio_interrupt

[*] You have to be sure that a thread just sits in zio_wait and doesn't make any forward progress, as opposed to the thread doing a lot of I/O and thus having a high probability of being seen in zio_wait.

Then it is most likely that the problem is at the storage level. Most likely it is a bug in the storage controller driver that allowed an I/O request to get lost (instead of being "errored out" or timed out). `camcontrol tags -v` can be used to query the depth of the queue for each disk and determine the bad one.

--
Andriy Gapon
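
P.S. The three rules of thumb above can be roughly sketched as a check over two `procstat -kk` snapshots taken some time apart. This is only an illustration under stated assumptions: the function names are hypothetical, the parsing assumes the usual "PID TID COMM TDNAME KSTACK" layout with frames written as name+0xoffset, and "same stack in both snapshots" is merely a heuristic for "no forward progress", not proof of it.

```python
def parse_kstacks(text):
    """Map (pid, tid) -> tuple of stack frames from `procstat -kk` output.

    Assumption: each thread is one line starting with numeric PID and TID,
    and every stack frame token has the form name+0xoffset.
    """
    stacks = {}
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 3 or not (tokens[0].isdigit() and tokens[1].isdigit()):
            continue  # header line, blank line, or something unexpected
        frames = tuple(t for t in tokens[2:] if "+0x" in t)
        stacks[(int(tokens[0]), int(tokens[1]))] = frames
    return stacks


def in_function(frames, name):
    """True if any frame is exactly in function `name` (offset ignored)."""
    return any(f.split("+", 1)[0] == name for f in frames)


def storage_suspected(before, after):
    """Apply the three rules of thumb to two snapshots.

    Returns (suspected, stuck_threads), where stuck_threads lists the
    (pid, tid) pairs seen in zio_wait with an identical stack both times.
    """
    a, b = parse_kstacks(before), parse_kstacks(after)
    # Rules 1 and 2: threads in zio_wait whose stack did not change at all.
    stuck = [key for key, frames in a.items()
             if in_function(frames, "zio_wait") and b.get(key) == frames]
    # Rule 3: no thread in zio_interrupt in either snapshot.
    interrupt_seen = any(in_function(frames, "zio_interrupt")
                         for snap in (a, b) for frames in snap.values())
    return bool(stuck) and not interrupt_seen, stuck
```

Feed it the saved text of two `procstat -kk` runs. A thread that is doing a lot of I/O can still show an identical stack by coincidence, so take more than two snapshots before drawing a conclusion.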