Date:      Sat, 7 Jan 2023 17:41:59 +0100
From:      Fabian Keil <freebsd-listen@fabiankeil.de>
To:        freebsd-hackers@freebsd.org
Subject:   ZFS-related panic(s) with zfs-2.1.7-FreeBSD_g21bd76613?
Message-ID:  <20230107174159.1b7e61e9@fabiankeil.de>

Yesterday I rebased ElectroBSD [0] on stable/13 77c0992af4e3b;
it was previously based on stable/13 d3b97a1ea0123.

I didn't notice any issues in a test VM and therefore decided
to update my laptop as well.

So far I've experienced three panics/reboots/freezes that I suspect
might be caused by the upgrade from zfs-2.1.6-FreeBSD_g6a6bd4939
to zfs-2.1.7-FreeBSD_g21bd76613.

They all occurred while I was syncing ZFS datasets with zogftw [1].

Unfortunately I only have one backtrace, so I can't say for
sure that the other incidents were ZFS-related as well:

Unread portion of the kernel message buffer:
panic: VERIFY3(0 == zap_remove(mos, dsobj, spa_feature_table[f].fi_guid, tx)) failed (0 == 2)

cpuid = 3
time = 1673033419
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00dc6868a0
vpanic() at vpanic+0x151/frame 0xfffffe00dc6868f0
spl_panic() at spl_panic+0x3a/frame 0xfffffe00dc686950
dsl_dataset_deactivate_feature_impl() at dsl_dataset_deactivate_feature_impl+0xe6/frame 0xfffffe00dc6869a0
dsl_dataset_clone_swap_sync_impl() at dsl_dataset_clone_swap_sync_impl+0x135/frame 0xfffffe00dc686ad0
dmu_recv_end_sync() at dmu_recv_end_sync+0x2a2/frame 0xfffffe00dc686b30
dsl_sync_task_sync() at dsl_sync_task_sync+0xb4/frame 0xfffffe00dc686b60
dsl_pool_sync() at dsl_pool_sync+0x42b/frame 0xfffffe00dc686be0
spa_sync() at spa_sync+0xb00/frame 0xfffffe00dc686e10
txg_sync_thread() at txg_sync_thread+0x281/frame 0xfffffe00dc686ef0
fork_exit() at fork_exit+0x7e/frame 0xfffffe00dc686f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00dc686f30
--- trap 0x3, rip = 0xffffffff80659b3f, rsp = 0, rbp = 0xffffffff818f4fa0 ---
mi_startup() at mi_startup+0xdf/frame 0xffffffff818f4fa0
swapper() at swapper+0x69/frame 0xffffffff818f4ff0
btext() at btext+0x22
Uptime: 6m35s
Dumping 1098 out of 8050 MB:..2%..11%..21%..31%..41%..51%..62%..72%..81%..91%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55		__asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) where
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  dump_savectx () at /usr/src/sys/kern/kern_shutdown.c:394
#2  0xffffffff806cda18 in dumpsys (di=0x0) at /usr/src/sys/x86/include/dump.h:87
#3  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:423
#4  kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:497
#5  0xffffffff806cde9e in vpanic (fmt=<optimized out>, ap=ap@entry=0xfffffe00dc686930) at /usr/src/sys/kern/kern_shutdown.c:930
#6  0xffffffff81278e3a in spl_panic (file=<optimized out>, func=<optimized out>, line=<unavailable>, fmt=<unavailable>) at /usr/src/sys/contrib/openzfs/module/os/freebsd/spl/spl_misc.c:107
#7  0xffffffff813001e6 in dsl_dataset_deactivate_feature_impl (ds=ds@entry=0xfffff80019a60000, f=f@entry=SPA_FEATURE_USEROBJ_ACCOUNTING, tx=tx@entry=0xfffff80191ace200)
    at /usr/src/sys/contrib/openzfs/module/zfs/dsl_dataset.c:1116
#8  0xffffffff81304cb5 in dsl_dataset_clone_swap_sync_impl (clone=0xfffff8018fc79000, origin_head=<unavailable>, tx=<unavailable>, tx@entry=0xfffff80191ace200) at /usr/src/sys/contrib/openzfs/module/zfs/dsl_dataset.c:4083
#9  0xffffffff812eaff2 in dmu_recv_end_sync (arg=0xfffffe00d91366b8, tx=0xfffff80191ace200) at /usr/src/sys/contrib/openzfs/module/zfs/dmu_recv.c:3233
#10 0xffffffff8132c254 in dsl_sync_task_sync (dst=0xfffffe00d91364a8, tx=tx@entry=0xfffff80191ace200) at /usr/src/sys/contrib/openzfs/module/zfs/dsl_synctask.c:248
#11 0xffffffff8131ea6b in dsl_pool_sync (dp=dp@entry=0xfffff801eabad800, txg=txg@entry=3576757) at /usr/src/sys/contrib/openzfs/module/zfs/dsl_pool.c:847
#12 0xffffffff81353930 in spa_sync_iterate_to_convergence (spa=0xfffffe00da149000, tx=0xfffff80191e73400) at /usr/src/sys/contrib/openzfs/module/zfs/spa.c:9069
#13 spa_sync (spa=spa@entry=0xfffffe00da149000, txg=txg@entry=3576757) at /usr/src/sys/contrib/openzfs/module/zfs/spa.c:9287
#14 0xffffffff81368281 in txg_sync_thread (arg=arg@entry=0xfffff801eabad800) at /usr/src/sys/contrib/openzfs/module/zfs/txg.c:591
#15 0xffffffff80689fde in fork_exit (callout=0xffffffff81368000 <txg_sync_thread>, arg=0xfffff801eabad800, frame=0xfffffe00dc686f40) at /usr/src/sys/kern/kern_fork.c:1093
#16 <signal handler called>
#17 mi_startup () at /usr/src/sys/kern/init_main.c:322
#18 0xffffffff80a1e439 in swapper () at /usr/src/sys/vm/vm_swapout.c:755
#19 0xffffffff802f8722 in btext () at /usr/src/sys/amd64/amd64/locore.S:80

Has anyone else seen this?

I've seen it with three different ZFS pools and I think the pools
are fine. The laptop only supports USB2, so scrubbing the pools takes
days, which is why I haven't done it yet.

I have no reliable way to reproduce the issue yet.
Running zogftw sync again after rebooting worked in all three cases.

Fabian

[0] <https://www.fabiankeil.de/gehacktes/electrobsd/>
[1] <https://www.fabiankeil.de/gehacktes/zogftw/>