From: Raymond Jimenez <raymondj@caltech.edu>
To: freebsd-fs@freebsd.org
Date: Mon, 26 Nov 2012 14:00:32 -0800
Subject: ZFS kernel panics due to corrupt DVAs (despite RAIDZ)

Hello,

We recently sent our drives out for data recovery (blown drive
electronics), and when we got the new drives/data back, ZFS started to
kernel panic whenever certain items in a directory are listed, or
whenever a scrub gets close to finishing (~99.97%).

The zpool worked fine before data recovery, and most of the files are
accessible (only a couple hundred unavailable out of several million).

Here's the kernel panic output if I scrub the pool:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x38
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff810792d1
stack pointer           = 0x28:0xffffff8235122720
frame pointer           = 0x28:0xffffff8235122750
code segment            = base 0x0, limit 0xffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 52 (txg_thread_enter)
[ thread pid 52 tid 101230 ]
Stopped at      vdev_is_dead+0x1:       cmpq    $0x5, 0x38(%rdi)

%rdi is zero, so this seems to be just a null pointer dereference:
the fault address 0x38 is exactly the offset that cmpq reads through
the null vdev pointer.

The vdev setup looks like:

  pool: mfs-zpool004
 state: ONLINE
  scan: scrub canceled on Mon Nov 26 05:40:49 2012
config:

        NAME                        STATE     READ WRITE CKSUM
        mfs-zpool004                ONLINE       0     0     0
          raidz1-0                  ONLINE       0     0     0
            gpt/lenin3-drive8       ONLINE       0     0     0
            gpt/lenin3-drive9.eli   ONLINE       0     0     0
            gpt/lenin3-drive10      ONLINE       0     0     0
            gpt/lenin3-drive11.eli  ONLINE       0     0     0
          raidz1-1                  ONLINE       0     0     0
            gpt/lenin3-drive12      ONLINE       0     0     0
            gpt/lenin3-drive13.eli  ONLINE       0     0     0
            gpt/lenin3-drive14      ONLINE       0     0     0
            gpt/lenin3-drive15.eli  ONLINE       0     0     0

errors: No known data errors

The initial scrub repaired some data (~24k) in its early stages, but
also crashed at 99.97%. Right now, I'm using an interim workaround
patch [1] so that our users can get at files without worrying about
crashing the server. It's a small check in dbuf_findbp() that checks
whether the DVA about to be returned has a small (<= 16) vdev number,
and if not, returns EIO.
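The gist of the check is something like the sketch below (paraphrased
from the patch; the helper name is made up for illustration, and in the
real patch the test sits inline in dbuf_findbp()):

/*
 * Sanity-check the DVAs of a block pointer before handing it back.
 * This pool has only two top-level vdevs, so a vdev number above 16
 * can't be real; treat it as corruption and fail with EIO rather than
 * let the bogus pointer reach vdev_lookup_top()/vdev_is_dead().
 */
static int
dbuf_dva_sanity(const blkptr_t *bp)
{
        int d;

        for (d = 0; d < BP_GET_NDVAS(bp); d++) {
                if (DVA_GET_VDEV(&bp->blk_dva[d]) > 16)
                        return (EIO);   /* implausible vdev number */
        }
        return (0);
}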
This just results in ZFS returning I/O errors for any of the corrupt
files I try to access, which at least lets us get at our data for now.

My suspicion is that somehow, bad data is getting interpreted as a
block pointer/shift constant, and this sends ZFS into the woods. I
haven't been able to track down how this data could get past checksum
verification, especially with RAIDZ.

Backtraces (both crashes due to vdev_is_dead() dereferencing a null
pointer):

Scrub crash:
http://wsyntax.com/~raymond/zfs/zfs-scrub-bt.txt

Prefetch off, ls -al of "/06/chunk_0000000001417E06_00000001.mfs":
http://wsyntax.com/~raymond/zfs/zfs-ls-bt.txt

Regards,
Raymond Jimenez

[1] http://wsyntax.com/~raymond/zfs/zfs-dva-corrupt-workaround.patch