From owner-freebsd-fs@FreeBSD.ORG  Mon Feb  9 13:19:29 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id B15A5703;
 Mon,  9 Feb 2015 13:19:29 +0000 (UTC)
Received: from hades.sorbs.net (hades.sorbs.net [67.231.146.201])
 by mx1.freebsd.org (Postfix) with ESMTP id 9A9ADFA1;
 Mon,  9 Feb 2015 13:19:29 +0000 (UTC)
MIME-version: 1.0
Content-transfer-encoding: 7BIT
Content-type: text/plain; CHARSET=US-ASCII
Received: from isux.com (firewall.isux.com [213.165.190.213])
 by hades.sorbs.net
 (Oracle Communications Messaging Server 7.0.5.29.0 64bit (built Jul 9 2013))
 with ESMTPSA id <0NJI00B9ZAKGAS00@hades.sorbs.net>; Mon,
 09 Feb 2015 05:24:18 -0800 (PST)
Message-id: <54D8B3D8.6000804@sorbs.net>
Date: Mon, 09 Feb 2015 14:19:20 +0100
From: Michelle Sullivan <michelle@sorbs.net>
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.24)
 Gecko/20100301 SeaMonkey/1.1.19
To: Stefan Esser <se@freebsd.org>,
 "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: ZFS pool faulted (corrupt metadata) but the disk data appears
 ok...
References: <54D3E9F6.20702@sorbs.net> <54D41608.50306@delphij.net>
 <54D41AAA.6070303@sorbs.net> <54D41C52.1020003@delphij.net>
 <54D424F0.9080301@sorbs.net> <54D47F94.9020404@freebsd.org>
 <54D4A552.7050502@sorbs.net> <54D4BB5A.30409@freebsd.org>
In-reply-to: <54D4BB5A.30409@freebsd.org>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 09 Feb 2015 13:19:29 -0000

Stefan Esser wrote:
>
> The point where zdb seg faults hints at the data structure that is
> corrupt. You may get some output before the seg fault, if you add
> a number of -v options (they add up to higher verbosity).
>
> Else, you may be able to look at the core and identify the function
> that fails. You'll most probably need zdb and libzfs compiled with
> "-g" to get any useful information from the core, though.
>
> For my failed pool, I noticed that internal assumptions were
> violated, due to some free space occurring in more than one entry.
> I had to special case the test in some function to ignore this
> situation (I knew that I'd only ever wanted to mount that pool
> R/O to rescue my data). But skipping the test did not suffice,
> since another assert triggered (after skipping the NULL dereference,
> the calculated sum of free space did not match the recorded sum, I
> had to disable that assert, too). With these two patches I was able
> to recover the pool starting at a TXG less than 100 transactions back,
> which was sufficient for my purpose ...
>   

Will zdb actually 'fix' anything, or is it just a debug utility (for
display only)?

If it is just a debug tool and won't fix anything, I'm quite happy to
roll back transactions. The question is how (presumably after finding
the corrupt point). I'm happy to do it by hand until I succeed, since
it will save 2+ months of work. I did get some output with a date/time
that indicates where the rollback would go to...

In the meantime, this appears to be working without crashing; it has
been running for days now...

  PID USERNAME       THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 4332 root           209  22    0 23770M 23277M uwait   1 549:07 11.04% zdb -AAA -L -uhdi -FX -e storage

Michelle

-- 
Michelle Sullivan
http://www.mhix.org/