Date: Wed, 14 Oct 2009 08:21:07 +0200
From: Pawel Jakub Dawidek
To: Alex Trull
Cc: freebsd-fs@freebsd.org
Subject: Re: zraid2 loses a single disk and becomes difficult to recover
Message-ID: <20091014062107.GB1696@garage.freebsd.pl>
In-Reply-To: <4d98b5320910121249q36c68b8vf63ec27cf4bb94c9@mail.gmail.com>
References: <4d98b5320910110741w794c154cs22b527485c1938da@mail.gmail.com> <4d98b5320910110927o62f8f588r9acdeb40a19587ea@mail.gmail.com> <4d98b5320910121249q36c68b8vf63ec27cf4bb94c9@mail.gmail.com>

On Mon, Oct 12, 2009 at 08:49:37PM +0100, Alex Trull wrote:
> I managed to cleanly recover all critical data by cloning the most recent
> snapshots of all my filesystems (which worked even for those filesystems
> that had disappeared from 'zfs list') - and moving back to ufs2
>
> The 'live' filesystems since the snapshots had pretty much gone corrupt.
>
> Interesting note is that even if I promoted those clones - if the system
> was rebooted the contents of the snapshots became gobbledygooked (invalid
> byte sequence errors on numerous files).
>
> As it stands I managed to recover 100% of the data, so I'm out the woods.

I'm glad to hear that.

> How does a dual-parity array lose its mind when only one disk is lost ?
> Might it have been related to the old TXGid I found on ad16 and ad17 ?

Yes, definitely. For some reason ZFS didn't update the txg on those two
disks, so at that point you were running without parity. The problem is
that ZFS didn't start a resilver automatically and also didn't report the
situation properly. I think I have seen this in the past. Running
'zpool scrub' on this pool will trigger a resilver.

There must be a bug. I tried to reproduce it by modifying the code not to
update the txg on one of the components. There are three places where this
can happen on a system crash or power failure, and I tried all of them.
No luck: ZFS was able to recover properly.

It would be a good idea to run 'zpool scrub' on a regular basis, even if
only to see whether it triggers a resilver (it can be stopped after a few
minutes with 'zpool scrub -s'). Of course it is advised to run a full
scrub from time to time.

Do you have this pool around still?

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
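
The short probe scrub suggested in the message could be scripted roughly as
follows. This is only a sketch and not part of the original message: the
function names scan_state and probe_pool, the 300-second default wait, and
the status strings being matched are assumptions. The wording of
'zpool status' differs between ZFS versions, so check the patterns against
your own system before relying on them.

```sh
#!/bin/sh

# Classify 'zpool status' text read from stdin as resilver, scrub, or idle.
# Assumption: the relevant line starts with "scrub:" (older releases) or
# "scan:" (newer releases) and contains "... in progress".
scan_state() {
    awk '
        /^[ \t]*(scrub|scan):/ {
            if ($0 ~ /resilver in progress/)   { state = "resilver" }
            else if ($0 ~ /scrub in progress/) { state = "scrub" }
        }
        END { print (state == "" ? "idle" : state) }
    '
}

# Start a scrub, wait a little, and report what the pool is doing.
# An ordinary scrub is cancelled again; a resilver is left running.
probe_pool() {
    pool=$1
    wait_secs=${2:-300}   # hypothetical default: five minutes

    zpool scrub "$pool" || return 1
    sleep "$wait_secs"

    state=$(zpool status "$pool" | scan_state)
    if [ "$state" = "scrub" ]; then
        # No resilver was triggered, so stop the probe scrub early.
        zpool scrub -s "$pool"
    fi
    echo "$state"
}
```

With these functions loaded, 'probe_pool tank 300' (where tank is a
placeholder pool name) prints resilver, scrub, or idle. A result of resilver
means the scrub found stale components, which is the situation discussed
above. This probe does not replace a periodic full scrub.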