Date:      Tue, 11 Apr 2023 07:47:13 -0700
From:      Cy Schubert <Cy.Schubert@cschubert.com>
To:        Paweł Jakub Dawidek <pawel@dawidek.net>
Cc:        FreeBSD User <freebsd@walstatt-de.de>, Mateusz Guzik <mjguzik@gmail.com>, Pawel Jakub Dawidek <pjd@freebsd.org>, FreeBSD CURRENT <freebsd-current@freebsd.org>
Subject:   Re: CURRENT: Panic VERIFY(!zil_replaying(zilog, tx)) failed (and  crashing)
Message-ID:  <20230411144713.A94EA5FE@slippy.cwsent.com>
In-Reply-To: <20230411142831.DB8245FA@slippy.cwsent.com>
References:  <20230411021919.0718F306@slippy.cwsent.com>  <434B83DB-F6BB-436F-8AA5-385730D20BB1@dawidek.net>  <20230411142831.DB8245FA@slippy.cwsent.com>

In message <20230411142831.DB8245FA@slippy.cwsent.com>, Cy Schubert writes:
> In message <434B83DB-F6BB-436F-8AA5-385730D20BB1@dawidek.net>,
> Paweł Jakub Dawidek writes:
> > 
> >
> > > On Apr 11, 2023, at 11:31, Cy Schubert <Cy.Schubert@cschubert.com> wrote:
> > >
> > > In message <20230409161436.5412fa6e@thor.intern.walstatt.dynvpn.de>,
> > > FreeBSD User writes:
> > >> On Sun, 9 Apr 2023 14:37:03 +0200
> > >> Mateusz Guzik <mjguzik@gmail.com> wrote:
> > >>
> > >>>> On 4/9/23, FreeBSD User <freebsd@walstatt-de.de> wrote:
> > >>>>> Today, after upgrading to FreeBSD 14.0-CURRENT #8
> > >>>>> main-n262052-0d4038e3012b: Sun Apr  9 12:01:02 CEST 2023  amd64,
> > >>>>> AND upgrading ZPOOLs via
> > >>>>>
> > >>>>> zpool upgrade POOLNAME
> > >>>>>
> > >>>>> some boxes keep crashing when starting compiler runs (the trigger is
> > >>>>> different on boxes).
> > >>>>>
> > >>>>> ZFS module is statically compiled into the kernel (if this is of
> > >>>>> importance)
> > >>>>>
> > >>>>> Last known good was:
> > >>>>>
> > >>>>> [...]
> > >>>>> Apr  9 07:10:04 <0.2> thor kernel: FreeBSD 14.0-CURRENT #7 main-n262051-75379ea2e461: Sun Apr 9 00:12:57 CEST 2023
> > >>>>> Apr  9 07:10:04 <0.2> thor kernel: root@thor:/usr/obj/usr/src/amd64.amd64/sys/THOR amd64
> > >>>>> Apr  9 07:10:04 <0.2> thor kernel: FreeBSD clang version 15.0.7 (https://github.com/llvm/llvm-project.git llvmorg-15.0.7-0-g8dfdcc7b7bf6)
> > >>>>> Apr  9 07:10:04 <0.2> thor kernel: VT(efifb): resolution 2560x1440
> > >>>>> Apr  9 07:10:04 <0.2> thor kernel: module zfsctrl already present!
> > >>>>> [...]
> > >>>>>
> > >>>>> The file /var/crash/info.X contains:
> > >>>>>
> > >>>>> [...]
> > >>>>>
> > >>>>> root@thor:/var/crash # more info.2
> > >>>>> Dump header from device: /dev/gpt/swap
> > >>>>>  Architecture: amd64
> > >>>>>  Architecture Version: 2
> > >>>>>  Dump Length: 1095192576
> > >>>>>  Blocksize: 512
> > >>>>>  Compression: none
> > >>>>>  Dumptime: 2023-04-09 11:43:41 +0000
> > >>>>>  Hostname: thor.local
> > >>>>>  Magic: FreeBSD Kernel Dump
> > >>>>>  Version String: FreeBSD 14.0-CURRENT #8 main-n262052-0d4038e3012b: Sun Apr  9 12:01:02 CEST 2023
> > >>>>>    root@thor:/usr/obj/usr/src/amd64.amd64/sys/THOR
> > >>>>>  Panic String: VERIFY(!zil_replaying(zilog, tx)) failed
> > >>>>>
> > >>>>>  Dump Parity: 2961465682
> > >>>>>  Bounds: 2
> > >>>>>  Dump Status: good
> > >>>>>
> > >>>>> Until reconfigured for more debug stuff I do not have more to present.
> > >>>>>
> > >>>>> I remember now, really scared, that there was a HEADSUP in the list
> > >>>>> regarding some serious ZFS
> > >>>>> problems - I didn't find it right now.
> > >>>>>
> > >>>>> Thanks in advance,
> > >>>>>
> > >>>
> > >>> That's fallout from the new block cloning feature, adding the author
> > >>>
> > >>
> > >> Thanks.
> > >>
> > >> As of this moment, all systems with the newest kernel and the new ZFS
> > >> option enabled, crash -
> > >> the reason is mostly in different ZFS datasets. I guess there is no way
> > >> back once this faulty
> > >> option is enabled?
> > >
> > > I've run a test on a scratch pool here, first without block_cloning
> > > enabled, then with. There was no corruption when block_cloning was
> > > disabled. There was corruption when block_cloning was enabled.
> > >
> > > I don't know of any way to revert back nor is there any way to fix or
> > > recover the corrupted blocks.
> >
> > Is the corruption still present after EXDEV fixes?
>
> Yes and no.
>
> Yes, there is corruption when block_cloning is enabled.
>
> There is no corruption when block_cloning is disabled.

I should add some detail to this.

The corruption experienced when block_cloning is disabled was fixed by:

- eb1feadc201a
- e2d997d1cbb9
- d012836fb616 (specifically this commit)
- 20be1b4fc4b7

When block_cloning is enabled, the pool is corrupted. This has not been 
fixed.


-- 
Cheers,
Cy Schubert <Cy.Schubert@cschubert.com>
FreeBSD UNIX:  <cy@FreeBSD.org>   Web:  https://FreeBSD.org
NTP:           <cy@nwtime.org>    Web:  https://nwtime.org

			e^(i*pi)+1=0
