Date: Tue, 7 Feb 2023 07:02:23 +0100
From: Fabian Keil <freebsd-listen@fabiankeil.de>
To: freebsd-hackers@freebsd.org
Subject: Re: ZFS-related panic(s) with zfs-2.1.7-FreeBSD_g21bd76613?
Message-ID: <20230207070223.1195e73b@fabiankeil.de>
In-Reply-To: <20230125111011.455923bf@fabiankeil.de>
References: <20230107174159.1b7e61e9@fabiankeil.de> <20230125111011.455923bf@fabiankeil.de>

Fabian Keil <freebsd-listen@fabiankeil.de> wrote on 2023-01-25 at 11:10:11:

> Fabian Keil <freebsd-listen@fabiankeil.de> wrote on 2023-01-07 at 17:41:59:
>
> > Yesterday I rebased ElectroBSD [0] on stable/13 77c0992af4e3b
> > while it was previously based on stable/13 d3b97a1ea0123.
> >
> > I didn't notice any issues in a test VM and therefore decided
> > to update my laptop as well.
> >
> > So far I've experienced three panics/reboots/freezes that I suspect
> > might be caused by the upgrade from zfs-2.1.6-FreeBSD_g6a6bd4939
> > to zfs-2.1.7-FreeBSD_g21bd76613.
> >
> > They all occurred while I was syncing ZFS datasets with zogftw [1].
> >
> > Unfortunately I only have one backtrace so I can't say for
> > sure that the other times were ZFS related as well:
> >
> > Unread portion of the kernel message buffer:
> > panic: VERIFY3(0 == zap_remove(mos, dsobj, spa_feature_table[f].fi_guid, tx)) failed (0 == 2)
> >
> > cpuid = 3
> > time = 1673033419
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00dc6868a0
> > vpanic() at vpanic+0x151/frame 0xfffffe00dc6868f0
> > spl_panic() at spl_panic+0x3a/frame 0xfffffe00dc686950
> > dsl_dataset_deactivate_feature_impl() at dsl_dataset_deactivate_feature_impl+0xe6/frame 0xfffffe00dc6869a0
> > dsl_dataset_clone_swap_sync_impl() at dsl_dataset_clone_swap_sync_impl+0x135/frame 0xfffffe00dc686ad0
> > dmu_recv_end_sync() at dmu_recv_end_sync+0x2a2/frame 0xfffffe00dc686b30
> > dsl_sync_task_sync() at dsl_sync_task_sync+0xb4/frame 0xfffffe00dc686b60
> > dsl_pool_sync() at dsl_pool_sync+0x42b/frame 0xfffffe00dc686be0
> > spa_sync() at spa_sync+0xb00/frame 0xfffffe00dc686e10
> > txg_sync_thread() at txg_sync_thread+0x281/frame 0xfffffe00dc686ef0
> > fork_exit() at fork_exit+0x7e/frame 0xfffffe00dc686f30
> > fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00dc686f30
> > --- trap 0x3, rip = 0xffffffff80659b3f, rsp = 0, rbp = 0xffffffff818f4fa0 ---
> > mi_startup() at mi_startup+0xdf/frame 0xffffffff818f4fa0
> > swapper() at swapper+0x69/frame 0xffffffff818f4ff0
> > btext() at btext+0x22
> > Uptime: 6m35s
> > Dumping 1098 out of 8050 MB:..2%..11%..21%..31%..41%..51%..62%..72%..81%..91%
> >
> > __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
> > 55 __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
> > (kgdb) where
> > #0 __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
> > #1 dump_savectx () at /usr/src/sys/kern/kern_shutdown.c:394
> > #2 0xffffffff806cda18 in dumpsys (di=0x0) at /usr/src/sys/x86/include/dump.h:87
> > #3 doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:423
> > #4 kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:497
> > #5 0xffffffff806cde9e in vpanic (fmt=<optimized out>, ap=ap@entry=0xfffffe00dc686930) at /usr/src/sys/kern/kern_shutdown.c:930
> > #6 0xffffffff81278e3a in spl_panic (file=<optimized out>, func=<optimized out>, line=<unavailable>, fmt=<unavailable>) at /usr/src/sys/contrib/openzfs/module/os/freebsd/spl/spl_misc.c:107
> > #7 0xffffffff813001e6 in dsl_dataset_deactivate_feature_impl (ds=ds@entry=0xfffff80019a60000, f=f@entry=SPA_FEATURE_USEROBJ_ACCOUNTING, tx=tx@entry=0xfffff80191ace200)
> > at /usr/src/sys/contrib/openzfs/module/zfs/dsl_dataset.c:1116
> > #8 0xffffffff81304cb5 in dsl_dataset_clone_swap_sync_impl (clone=0xfffff8018fc79000, origin_head=<unavailable>, tx=<unavailable>, tx@entry=0xfffff80191ace200) at /usr/src/sys/contrib/openzfs/module/zfs/dsl_dataset.c:4083
> > #9 0xffffffff812eaff2 in dmu_recv_end_sync (arg=0xfffffe00d91366b8, tx=0xfffff80191ace200) at /usr/src/sys/contrib/openzfs/module/zfs/dmu_recv.c:3233
> > #10 0xffffffff8132c254 in dsl_sync_task_sync (dst=0xfffffe00d91364a8, tx=tx@entry=0xfffff80191ace200) at /usr/src/sys/contrib/openzfs/module/zfs/dsl_synctask.c:248
> > #11 0xffffffff8131ea6b in dsl_pool_sync (dp=dp@entry=0xfffff801eabad800, txg=txg@entry=3576757) at /usr/src/sys/contrib/openzfs/module/zfs/dsl_pool.c:847
> > #12 0xffffffff81353930 in spa_sync_iterate_to_convergence (spa=0xfffffe00da149000, tx=0xfffff80191e73400) at /usr/src/sys/contrib/openzfs/module/zfs/spa.c:9069
> > #13 spa_sync (spa=spa@entry=0xfffffe00da149000, txg=txg@entry=3576757) at /usr/src/sys/contrib/openzfs/module/zfs/spa.c:9287
> > #14 0xffffffff81368281 in txg_sync_thread (arg=arg@entry=0xfffff801eabad800) at /usr/src/sys/contrib/openzfs/module/zfs/txg.c:591
> > #15 0xffffffff80689fde in fork_exit (callout=0xffffffff81368000 <txg_sync_thread>, arg=0xfffff801eabad800, frame=0xfffffe00dc686f40) at /usr/src/sys/kern/kern_fork.c:1093
> > #16 <signal handler called>
> > #17 mi_startup () at /usr/src/sys/kern/init_main.c:322
> > #18 0xffffffff80a1e439 in swapper () at /usr/src/sys/vm/vm_swapout.c:755
> > #19 0xffffffff802f8722 in btext () at /usr/src/sys/amd64/amd64/locore.S:80
> >
> > Has anyone else seen this?
> >
> > I've seen it with three different ZFS pools and I think the pools
> > are fine. The laptop only supports USB2 so scrubbing the pools takes
> > days which is why I didn't do it yet.
> >
> > I have no reliable way to reproduce the issue yet.
> > Running zogftw sync again after rebooting worked in all three cases.
> [...]
> > [0] <https://www.fabiankeil.de/gehacktes/electrobsd/>
> > [1] <https://www.fabiankeil.de/gehacktes/zogftw/>
>
> I'm still getting panics with the stack trace above with
> various pools about once a day on my work laptop and am
> considering reverting the ZFS-related commits.

Just for the archive:

A couple of days ago I rebased ElectroBSD on my laptop on
stable/13 20cfc902d911a, and as a result I'm now using
zfs-2.1.9-FreeBSD_g92e0d9d18 and zfs-kmod-2.1.9-FreeBSD_g92e0d9d18
there.

With this combination I haven't seen any ZFS-related panics yet,
even though I have synchronized a bunch of different ZFS pools with
zogftw.

Fabian