Date: Mon, 09 Feb 2015 14:19:20 +0100
From: Michelle Sullivan <michelle@sorbs.net>
To: Stefan Esser <se@freebsd.org>, "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: ZFS pool faulted (corrupt metadata) but the disk data appears ok...
Message-ID: <54D8B3D8.6000804@sorbs.net>
In-Reply-To: <54D4BB5A.30409@freebsd.org>
References: <54D3E9F6.20702@sorbs.net> <54D41608.50306@delphij.net> <54D41AAA.6070303@sorbs.net> <54D41C52.1020003@delphij.net> <54D424F0.9080301@sorbs.net> <54D47F94.9020404@freebsd.org> <54D4A552.7050502@sorbs.net> <54D4BB5A.30409@freebsd.org>

Stefan Esser wrote:
>
> The point where zdb seg faults hints at the data structure that is
> corrupt. You may get some output before the seg fault, if you add
> a number of -v options (they add up to higher verbosity).
>
> Else, you may be able to look at the core and identify the function
> that fails. You'll most probably need zdb and libzfs compiled with
> "-g" to get any useful information from the core, though.
>
> For my failed pool, I noticed that internal assumptions were
> violated, due to some free space occurring in more than one entry.
> I had to special case the test in some function to ignore this
> situation (I knew that I'd only ever wanted to mount that pool
> R/O to rescue my data). But skipping the test did not suffice,
> since another assert triggered (after skipping the NULL dereference,
> the calculated sum of free space did not match the recorded sum, I
> had to disable that assert, too). With these two patches I was able
> to recover the pool starting at a TXG less than 100 transactions back,
> which was sufficient for my purpose ...
>

The question is: will zdb 'fix' things, or is it just a debug utility
(for displaying)?

If it is just a debug utility and won't fix anything, I'm quite happy to
roll back transactions. The question is how (presumably after one finds
the corrupt point). I'm quite happy to just do it by hand until I get
success, since it will save 2+ months of work. I did get an output with
a date/time that indicates where the rollback would go to...

In the meantime this appears to be working without crashing. It's been
running for days now...

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
 4332 root      209  22    0 23770M 23277M uwait   1 549:07  11.04% zdb -AAA -L -uhdi -FX -e storage

Michelle

-- 
Michelle Sullivan
http://www.mhix.org/