From: Garrett Cooper <yanegomi@gmail.com>
To: Raymond Jimenez
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS kernel panics due to corrupt DVAs (despite RAIDZ)
Date: Mon, 26 Nov 2012 14:21:51 -0800
In-Reply-To: <50B3E680.8060606@caltech.edu>

On Mon, Nov 26, 2012 at 2:00 PM, Raymond Jimenez wrote:
> Hello,
>
> We recently sent our drives out for data recovery (blown drive
> electronics), and when we got the new drives/data back, ZFS
> started to kernel panic whenever listing certain items in a
> directory, or whenever a scrub is close to finishing (~99.97%).
>
> The zpool worked fine before data recovery, and most of the
> files are accessible (only a couple hundred unavailable out of
> several million).
>
> Here's the kernel panic output if I scrub the disk:
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address  = 0x38
> fault code             = supervisor read data, page not present
> instruction pointer    = 0x20:0xffffffff810792d1
> stack pointer          = 0x28:0xffffff8235122720
> frame pointer          = 0x28:0xffffff8235122750
> code segment           = base 0x0, limit 0xffff, type 0x1b
>                        = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags       = interrupt enabled, resume, IOPL = 0
> current process        = 52 (txg_thread_enter)
> [thread pid 52 tid 101230 ]
> Stopped at      vdev_is_dead+0x1:       cmpq    $0x5, 0x38(%rdi)
>
> %rdi is zero, so this seems to be just a null pointer dereference.
>
> The vdev setup looks like:
>
>   pool: mfs-zpool004
>  state: ONLINE
>   scan: scrub canceled on Mon Nov 26 05:40:49 2012
> config:
>
>         NAME                        STATE     READ WRITE CKSUM
>         mfs-zpool004                ONLINE       0     0     0
>           raidz1-0                  ONLINE       0     0     0
>             gpt/lenin3-drive8       ONLINE       0     0     0
>             gpt/lenin3-drive9.eli   ONLINE       0     0     0
>             gpt/lenin3-drive10      ONLINE       0     0     0
>             gpt/lenin3-drive11.eli  ONLINE       0     0     0
>           raidz1-1                  ONLINE       0     0     0
>             gpt/lenin3-drive12      ONLINE       0     0     0
>             gpt/lenin3-drive13.eli  ONLINE       0     0     0
>             gpt/lenin3-drive14      ONLINE       0     0     0
>             gpt/lenin3-drive15.eli  ONLINE       0     0     0
>
> errors: No known data errors
>
> The initial scrub fixed some data (~24k) in the early stages, but
> also crashed at 99.97%.
>
> Right now, I'm using an interim work-around patch[1] so that our
> users can get files without worrying about crashing the server.
> It's a small check in dbuf_findbp() that checks whether the DVA
> about to be returned has a small (<= 16) vdev number and, if not,
> returns EIO. This just results in ZFS returning I/O errors for any
> of the corrupt files I try to access, which at least lets us get at
> our data for now.
>
> My suspicion is that somehow, bad data is getting interpreted as
> a block pointer/shift constant, and this sends ZFS into the woods.
> I haven't been able to track down how this data could get past
> checksum verification, especially with RAIDZ.
>
> Backtraces (both crashes are due to vdev_is_dead() dereferencing a
> null pointer):
>
> Scrub crash:
> http://wsyntax.com/~raymond/zfs/zfs-scrub-bt.txt
>
> Prefetch off, ls -al of "/06/chunk_0000000001417E06_00000001.mfs":
> http://wsyntax.com/~raymond/zfs/zfs-ls-bt.txt

This is missing key details such as the output of uname and the zpool
version.

Thanks,
-Garrett
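
For reference, below is a minimal sketch of the kind of guard Raymond
describes adding in dbuf_findbp(). The helper name, the threshold
constant, and the exact placement are illustrative assumptions; the
actual interim patch[1] may differ.

    #include <sys/zfs_context.h>
    #include <sys/spa.h>

    /*
     * Illustrative threshold (assumption): the pool above has only two
     * top-level vdevs, so any DVA claiming a vdev index above 16 is
     * almost certainly garbage.
     */
    #define SANE_VDEV_MAX   16

    /*
     * Return B_FALSE if any DVA in the block pointer names an
     * implausibly large top-level vdev index.
     */
    static boolean_t
    dva_vdevs_look_sane(const blkptr_t *bp)
    {
            int d;

            for (d = 0; d < BP_GET_NDVAS(bp); d++) {
                    if (DVA_GET_VDEV(&bp->blk_dva[d]) > SANE_VDEV_MAX)
                            return (B_FALSE);
            }
            return (B_TRUE);
    }

    /*
     * In dbuf_findbp(), just before the block pointer is handed back
     * to the caller (placement is an assumption, not the actual patch):
     *
     *      if (*bpp != NULL && !dva_vdevs_look_sane(*bpp))
     *              return (EIO);
     */

The check is deliberately crude: it only keeps an obviously corrupt
vdev index from propagating further down the read path, where a failed
vdev lookup presumably leads to the vdev_is_dead() null dereference
shown in the backtraces. Returning EIO for the affected blocks matches
the behaviour Raymond reports (I/O errors instead of panics).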