From: Andriy Gapon
Date: Thu, 29 Nov 2012 11:47:26 +0200
To: Raymond Jimenez
Cc: freebsd-fs@FreeBSD.org
Subject: Re: ZFS kernel panics due to corrupt DVAs (despite RAIDZ)
Message-ID: <50B72F2E.6080808@FreeBSD.org>
In-Reply-To: <50B72AFD.3040902@caltech.edu>

on 29/11/2012 11:29 Raymond Jimenez said the following:
> Hi Andriy,
>
> On 11/27/2012 3:09 AM, Andriy Gapon wrote:
>>
>> Perhaps this thread could be of some interest to you:
>> http://thread.gmane.org/gmane.os.freebsd.devel.file-systems/15611/focus=15616
>>
>
> Thank you for the pointer.
> Unfortunately, a scrub segfaults in the same
> place with the same output.

My intention was to show the debugging techniques for examining that troublesome
data.

>> For one reason or the other wrong data (but correct looking - proper checksums,
>> etc) got written to the disk. I'd say use the patch, lift the data and
>> re-create the pool.
>
> Since it's corrupt data coming from higher levels, is there any
> possibility of getting this data back? Is it worth debugging more to
> add checks to catch this, or are these scenarios vanishingly
> rare?

I do not have very many scenarios in mind which could lead to problems like that.
And those that I have look very improbable and "uncatchable". E.g., buggy kernel
code overwriting some portion of a data buffer. Or some "rogue" DMA transaction
doing the same.

-- 
Andriy Gapon
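
To illustrate why such corruption is "uncatchable", here is a minimal sketch in
Python, not ZFS code. It assumes a toy write path in which the checksum is
computed over whatever is in the buffer at write time; write_block and
verify_block are hypothetical names, and SHA-256 merely stands in for whichever
checksum the pool uses. A stray write that lands in the buffer before the
checksum is taken yields wrong data with a perfectly valid checksum, whereas
damage after the write is detected as usual.

```python
import hashlib


def write_block(buf: bytes) -> tuple[bytes, bytes]:
    """Hypothetical write path: checksum whatever the buffer holds right now."""
    return buf, hashlib.sha256(buf).digest()


def verify_block(data: bytes, checksum: bytes) -> bool:
    """Hypothetical read/scrub path: recompute the checksum and compare."""
    return hashlib.sha256(data).digest() == checksum


# What the filesystem meant to write (placeholder contents).
intended = b"block pointer with valid DVAs"

# A stray write (buggy code, rogue DMA) hits the buffer BEFORE checksumming.
buf = bytearray(intended)
buf[0:4] = b"\xff\xff\xff\xff"

stored, cksum = write_block(bytes(buf))

checksum_ok = verify_block(stored, cksum)  # verification passes
data_ok = stored == intended               # yet the data is not what was meant

# Damage that happens AFTER the write is caught by the same check.
flipped = bytearray(stored)
flipped[5] ^= 0x01
caught_on_disk_damage = not verify_block(bytes(flipped), cksum)

print(checksum_ok, data_ok, caught_on_disk_damage)
```

Under the same assumption, redundancy computed from the already-damaged buffer
would faithfully protect the wrong bytes, so it would not help either.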