From owner-freebsd-stable@FreeBSD.ORG  Thu Dec 13 17:46:46 2012
In-Reply-To: <50C9AFC6.6080902@FreeBSD.org>
References: <CALC5+1Ptc=c_hxfc_On9iDN4AC_Xmrfdbc1NgyJH2ZxP6fE0Aw@mail.gmail.com>
 <50C9AFC6.6080902@FreeBSD.org>
Date: Thu, 13 Dec 2012 09:46:35 -0800
Message-ID: <CALC5+1MRurpbznOYrnE+K+=BEuj80iqJUbYkLN7SKFwtKqbE1Q@mail.gmail.com>
Subject: Re: NFS/ZFS hangs after upgrading from 9.0-RELEASE to -STABLE
From: olivier <olivier777a7@gmail.com>
To: Andriy Gapon <avg@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org

Thanks. I'll be sure to follow your suggestions next time this happens.

I have a naive question/suggestion though. I see from browsing past
discussions on ZFS problems that it has been suggested a number of times
that problems that appear to originate in ZFS in fact come from lower
layers, in particular from driver bugs or disks in the process of
failing. It seems that it can take a lot of time to troubleshoot such
problems. I accept that ZFS correctly leaves dealing with timeouts
to lower layers, but it seems to me that the ZFS layer would be a great
place to warn the user about issues and provide some information to
troubleshoot them.

For example, if some I/O requests get lost because of a buggy driver, the
driver itself might not be the best place to identify those lost requests.
But perhaps we could have a compile-time option in the ZFS code that spits
out a warning if it gets stuck waiting for a particular request to come back
for more than, say, 10 seconds, and identifies the problematic disk? I'm sure
there would be cases where these warnings would be unwarranted, and I
imagine that changes in the code to provide such warnings would impact
performance, so one certainly would not want that code active by default.
But someone in my position could certainly recompile the kernel with a ZFS
debugging option turned on to figure out the problem.
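
To make the idea concrete, here is a toy user-space sketch in Python. It is
not ZFS or kernel code, and every name in it (IoWatchdog, issued, completed,
stalled) is made up for illustration. It only shows the bookkeeping I have in
mind: remember when each request was issued and on which disk, and report the
ones that have been outstanding for longer than the threshold.

```python
import time
from dataclasses import dataclass, field

STALL_THRESHOLD_S = 10.0  # the "say, 10 seconds" from the paragraph above


@dataclass
class IoWatchdog:
    """Toy tracker for outstanding I/O requests (hypothetical, user-space only)."""

    threshold_s: float = STALL_THRESHOLD_S
    # request id -> (disk name, time the request was issued)
    _pending: dict = field(default_factory=dict)

    def issued(self, req_id, disk, now=None):
        """Record that a request was sent to a disk."""
        start = time.monotonic() if now is None else now
        self._pending[req_id] = (disk, start)

    def completed(self, req_id):
        """Forget a request once it has come back (success or error)."""
        self._pending.pop(req_id, None)

    def stalled(self, now=None):
        """Return (disk, req_id, age) for requests older than the threshold."""
        now = time.monotonic() if now is None else now
        return sorted(
            (disk, req_id, now - start)
            for req_id, (disk, start) in self._pending.items()
            if now - start > self.threshold_s
        )


if __name__ == "__main__":
    wd = IoWatchdog()
    wd.issued(1, "da3", now=0.0)  # never completes: the "lost" request
    wd.issued(2, "da5", now=0.0)
    wd.completed(2)
    for disk, req_id, age in wd.stalled(now=12.0):
        print(f"WARNING: request {req_id} on {disk} outstanding for {age:.0f}s")
```

A real implementation would presumably hook into wherever ZFS issues and
completes I/O, which I am not qualified to point at; the sketch only shows
why the extra bookkeeping would cost something on every request.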

I understand that the ZFS code comes from upstream, and that you guys probably
want to keep FreeBSD-specific changes minimal. If that's a big problem,
even just a patch provided "as is" that does not make it into the FreeBSD
code base might be extremely useful. I wish I could help write something
like that, but I know very little about the kernel or ZFS. I would
certainly be willing to help with testing.

Just my 2 cents worth.
Thanks for the help
Olivier

On Thu, Dec 13, 2012 at 2:36 AM, Andriy Gapon <avg@freebsd.org> wrote:

>
> I decided to share here the comment that I made in private, so that more
> people could potentially benefit from it.
>
> on 03/12/2012 20:41 olivier olivier said the following:
> > Hi all
> > After upgrading from 9.0-RELEASE to 9.1-PRERELEASE #0 r243679 I'm having
> > severe problems with NFS sharing of a ZFS volume. nfsd appears to hang at
> > random times (anywhere from once every couple of hours to once every two
> > days) while accessing a ZFS volume, and the only way I have found of
> > resolving the problem is to reboot. The server console is sometimes still
> > responsive during the nfsd hang, and I can read and write files to the
> > same ZFS volume while nfsd is hung. I am pasting below the output of
> > procstat -kk on nfsd, and details of my pool (nfsstat on the server hangs
> > once the problem has started occurring, and does not produce any
> > output). The pool is v28 and was created from a bunch of volumes attached
> > over Fibre Channel using the mpt driver. My system has a Supermicro board
> > and 4 AMD Opteron 6274 CPUs.
> >
> > I did not experience any nfsd hangs with 9.0-RELEASE (same machine,
> > essentially same configuration, same usage pattern).
> >
> > I would greatly appreciate any help to resolve this problem!
>
>
> I've looked at the provided data and I do not see anything that implicates
> ZFS.
> My rules of thumb for ZFS hangs:
> - if there are threads in zio_wait
> - if you can confirm that they are indeed stuck there[*]
> - if there are no threads in zio_interrupt
>
> [*] you have to be sure that a thread just sits in zio_wait and doesn't
> make any forward progress, as opposed to the thread doing a lot of I/O and
> thus having a high probability of being seen in zio_wait.
>
> Then it is most likely that the problem is at the storage level.
> Most likely it is a bug in the storage controller driver which allowed an
> I/O request to get lost (instead of being "errored out" or timed out).
>
> `camcontrol tags <disk> -v` can be used to query the depth of the queue for
> each disk and determine the bad one.
>
> --
> Andriy Gapon
>
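
For next time, the rules of thumb quoted above could be checked roughly like
this. It is only a sketch: it assumes two captures of `procstat -kk` output
taken some time apart and saved as text, and it assumes each thread line
starts with a numeric PID and TID followed by stack frames of the form
name+0xoffset. The function names (parse_kstacks, suspect_threads) are mine.
An identical stack in both captures is only a hint that a thread is stuck in
zio_wait, not proof.

```python
import re

# A stack frame looks like "zio_wait+0x61"; capture just the function name.
FRAME_RE = re.compile(r"([A-Za-z_][\w.]*)\+0x[0-9a-fA-F]+")
# Thread lines are assumed to start with numeric PID and TID columns.
LINE_RE = re.compile(r"^\s*(\d+)\s+(\d+)\s")


def parse_kstacks(text):
    """Map thread id -> tuple of function names from procstat -kk style text."""
    stacks = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # header line or anything else we do not recognize
        stacks[int(m.group(2))] = tuple(FRAME_RE.findall(line))
    return stacks


def suspect_threads(sample_a, sample_b):
    """Return TIDs that sit in zio_wait with the same stack in both samples.

    Returns an empty list if any thread in either sample is in
    zio_interrupt, since the rule of thumb then does not apply.
    """
    a, b = parse_kstacks(sample_a), parse_kstacks(sample_b)
    seen = {f for stacks in (a, b) for stack in stacks.values() for f in stack}
    if "zio_interrupt" in seen:
        return []
    return sorted(
        tid
        for tid, stack in a.items()
        if "zio_wait" in stack and b.get(tid) == stack
    )
```

If this returns any thread ids, the next step would be the camcontrol check
suggested above to find which disk has requests that never come back.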