From: Graham Todd
Date: Tue, 10 Nov 2009 10:01:06 -0500
To: Gerrit Kühn
Cc: freebsd-fs@freebsd.org
Subject: Re: trace for zfs panic mounting fs after crash with RC2
Message-ID: <4AF98032.9050808@bellanet.org>
In-Reply-To: <20091109101255.e81774e4.gerrit@pmp.uni-hannover.de>

Gerrit Kühn
wrote:

> On Fri, 06 Nov 2009 17:02:23 -0600 "James R. Van Artsdalen"
> wrote about Re: trace for zfs panic mounting
> fs after crash with RC2:
>
> JRVA> How the ZIL got corrupted - if it did - is a harder question.
>
> I think it is. Otherwise zfs would not crash while trying to replay the
> ZIL, wouldn't it?
> It seems that this happens rather easily with the system I have at hand
> (it happened twice to me so far - and I crashed the system only twice,
> that makes 100%, although I doubt that it is that reproducible). Searching
> around I found some reports of the same or similar issues (but no
> solution). So apart from recovering my fs (I did not try your suggested
> patch yet), there are two things I regard as very important:
>
> 1. Find out why the ZIL gets corrupted under some circumstances.
> 2. Find a safe way to recover a fs with a corrupted ZIL.
>
> I guess I could live with a corrupted ZIL after a crash, if there was some
> kind of --ignore-zil switch to get my data back online. In any case, zfs
> should not panic on corrupted ZIL data, should it?

Is there a way to "manually" use zdb to mimic the recovery mode of
"zpool clear" introduced in OpenSolaris's ZFS with PSARC 2009/479?

http://www.c0t0d0s0.org/archives/6067-PSARC-2009479-zpool-recovery-a.html

I have no idea if this would help: in fact it might very well be
dangerous for the pool that Gerrit is trying to recover. Are you able to
copy the pool somehow before trying experiments?

I think the current state of "disaster recovery" tools and methods for
ZFS makes some folks nervous. With so much error checking "built in",
there are fewer tried-and-true "old school" sysadmin approaches to
recovering lost data after the fact.

So thanks for debugging your problem in public. I hope you can resolve
things and document how you did it for everyone.

Good luck.