Date: Sun, 6 Feb 2022 13:23:40 -0500
From: Rich <rincebrain@gmail.com>
To: John F Carr <jfc@mit.edu>
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: Repairing a bad ZFS free list
Message-ID: <CAOeNLup_tf_BFYV6Nrrbr2WyTaC9n0dAekOt8PCSUjW6xS0oaA@mail.gmail.com>
In-Reply-To: <84C3247E-B5F0-4572-AE38-3B530D61CB1C@exchange.mit.edu>
References: <84C3247E-B5F0-4572-AE38-3B530D61CB1C@exchange.mit.edu>
https://github.com/openzfs/zfs/issues/11480 seems germane.

I'm not 100% certain from reading the fix, but it seems like applying the patch should result in no longer panicking.

- Rich

On Sun, Feb 6, 2022 at 1:10 PM John F Carr <jfc@mit.edu> wrote:
> I have a corrupt root ZFS pool on my ARM server (Ampere eMAG) running
> a recent version of stable/13. Is there any way to repair my system
> short of wiping the disk and reinstalling?
>
> All filesystems mount and there are no errors reported by zpool, but
> there is bad metadata, apparently a block having been allocated twice.
> Running "zfs destroy" tends to cause crashes like
>
> panic: VERIFY3(l->blk_birth == r->blk_birth) failed (9269896 == 9269889)
>
> The assertion is in dsl_deadlist.c:livelist_compare(). There are two
> livelist_entry_t objects containing blkptr_t objects with the same
> DVA_GET_VDEV and DVA_GET_OFFSET but distinct blk_birth. Apparently
> this is a bad thing.
>
> spa_livelist_delete_cb appears in the stack trace. I think the kernel
> is telling me the same block has been allocated twice and it doesn't
> want to free it twice.
>
> This problem persists across reboot. Since I want to use poudriere
> "stop running zfs destroy" is not a good workaround.
>
> Is it safe to disable the assertion, or will that spread the
> corruption even further?
>
> In the old days I would use clri or fsdb to make the problematic part
> of a UFS filesystem go away. How do I repair ZFS?
>
> This crash has been reported as bug
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261538
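For readers following the thread, the condition described in the quoted message (two entries with the same vdev and offset but different birth values) can be illustrated with a small standalone sketch. This is not the OpenZFS source: `struct entry`, `cmp_u64`, `entry_compare`, and `entry_conflict` are hypothetical names invented here, the real code works on `blkptr_t` through the `DVA_GET_VDEV` and `DVA_GET_OFFSET` macros, and the sketch returns a flag where the kernel would panic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the block pointer in a livelist entry. */
struct entry {
	uint64_t vdev;   /* device the block lives on */
	uint64_t offset; /* offset of the block on that device */
	uint64_t birth;  /* birth value recorded for the block */
};

/* Three-way compare of two unsigned values: returns -1, 0, or 1. */
static int
cmp_u64(uint64_t a, uint64_t b)
{
	return ((a > b) - (a < b));
}

/* Order entries by (vdev, offset); birth does not take part in the ordering. */
static int
entry_compare(const struct entry *l, const struct entry *r)
{
	int c = cmp_u64(l->vdev, r->vdev);

	if (c != 0)
		return (c);
	return (cmp_u64(l->offset, r->offset));
}

/*
 * True when two entries name the same location but disagree on birth,
 * which is the situation the VERIFY3 in the panic message rejects.
 */
static bool
entry_conflict(const struct entry *l, const struct entry *r)
{
	return (entry_compare(l, r) == 0 && l->birth != r->birth);
}
```

With the two birth values from the panic message (9269896 and 9269889) and any shared vdev and offset, `entry_conflict` returns true. Entries at different offsets never conflict, whatever their birth values are.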