Date: Thu, 7 Sep 2023 11:02:16 -0700
From: Mark Millard <marklmi@yahoo.com>
To: Current FreeBSD <freebsd-current@freebsd.org>, FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>
Cc: Glen Barber <gjb@FreeBSD.org>
Subject: main [and, likely, stable/14]: do not set vfs.zfs.bclone_enabled=1 with that zpool feature enabled because it still leads to panics
Message-ID: <7CE2CAAF-8BB0-4422-B194-4A6B0A4BC12C@yahoo.com>
References: <7CE2CAAF-8BB0-4422-B194-4A6B0A4BC12C.ref@yahoo.com>
I was requested to do a test with vfs.zfs.bclone_enabled=1, and the
bulk -a build panicked (having stored 128 *.pkg files in .building/
first):

# more /var/crash/core.txt.3
. . .
Unread portion of the kernel message buffer:
panic: Solaris(panic): zfs: accessing past end of object 422/1108c16 (size=2560 access=2560+2560)
cpuid = 15
time = 1694103674
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0352758590
vpanic() at vpanic+0x132/frame 0xfffffe03527586c0
panic() at panic+0x43/frame 0xfffffe0352758720
vcmn_err() at vcmn_err+0xeb/frame 0xfffffe0352758850
zfs_panic_recover() at zfs_panic_recover+0x59/frame 0xfffffe03527588b0
dmu_buf_hold_array_by_dnode() at dmu_buf_hold_array_by_dnode+0x97/frame 0xfffffe0352758960
dmu_brt_clone() at dmu_brt_clone+0x61/frame 0xfffffe03527589f0
zfs_clone_range() at zfs_clone_range+0xa6a/frame 0xfffffe0352758bc0
zfs_freebsd_copy_file_range() at zfs_freebsd_copy_file_range+0x1ae/frame 0xfffffe0352758c40
vn_copy_file_range() at vn_copy_file_range+0x11e/frame 0xfffffe0352758ce0
kern_copy_file_range() at kern_copy_file_range+0x338/frame 0xfffffe0352758db0
sys_copy_file_range() at sys_copy_file_range+0x78/frame 0xfffffe0352758e00
amd64_syscall() at amd64_syscall+0x109/frame 0xfffffe0352758f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0352758f30
--- syscall (569, FreeBSD ELF64, copy_file_range), rip = 0x1ce4506d155a, rsp = 0x1ce44ec71e88, rbp = 0x1ce44ec72320 ---
KDB: enter: panic

__curthread () at /usr/main-src/sys/amd64/include/pcpu_aux.h:57
57              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/main-src/sys/amd64/include/pcpu_aux.h:57
#1  doadump (textdump=textdump@entry=0)
    at /usr/main-src/sys/kern/kern_shutdown.c:405
#2  0xffffffff804a442a in db_dump (dummy=<optimized out>, dummy2=<optimized out>, dummy3=<optimized out>, dummy4=<optimized out>)
    at /usr/main-src/sys/ddb/db_command.c:591
#3  0xffffffff804a422d in db_command (last_cmdp=<optimized out>, cmd_table=<optimized out>, dopager=true)
    at /usr/main-src/sys/ddb/db_command.c:504
#4  0xffffffff804a3eed in db_command_loop ()
    at /usr/main-src/sys/ddb/db_command.c:551
#5  0xffffffff804a7876 in db_trap (type=<optimized out>, code=<optimized out>)
    at /usr/main-src/sys/ddb/db_main.c:268
#6  0xffffffff80bb9e57 in kdb_trap (type=type@entry=3, code=code@entry=0, tf=tf@entry=0xfffffe03527584d0)
    at /usr/main-src/sys/kern/subr_kdb.c:790
#7  0xffffffff8104ad3d in trap (frame=0xfffffe03527584d0)
    at /usr/main-src/sys/amd64/amd64/trap.c:608
#8  <signal handler called>
#9  kdb_enter (why=<optimized out>, msg=<optimized out>)
    at /usr/main-src/sys/kern/subr_kdb.c:556
#10 0xffffffff80b6aab3 in vpanic (fmt=0xffffffff82be52d6 "%s%s", ap=ap@entry=0xfffffe0352758700)
    at /usr/main-src/sys/kern/kern_shutdown.c:958
#11 0xffffffff80b6a943 in panic (fmt=0xffffffff820aa2e8 <vt_conswindow+16> "\312C$\201\377\377\377\377")
    at /usr/main-src/sys/kern/kern_shutdown.c:894
#12 0xffffffff82993c5b in vcmn_err (ce=<optimized out>, fmt=0xffffffff82bfdd1f "zfs: accessing past end of object %llx/%llx (size=%u access=%llu+%llu)", adx=0xfffffe0352758890)
    at /usr/main-src/sys/contrib/openzfs/module/os/freebsd/spl/spl_cmn_err.c:60
#13 0xffffffff82a84d69 in zfs_panic_recover (fmt=0x12 <error: Cannot access memory at address 0x12>)
    at /usr/main-src/sys/contrib/openzfs/module/zfs/spa_misc.c:1594
#14 0xffffffff829f8e27 in dmu_buf_hold_array_by_dnode (dn=0xfffff813dfc48978, offset=offset@entry=2560, length=length@entry=2560, read=read@entry=0, tag=0xffffffff82bd8175, numbufsp=numbufsp@entry=0xfffffe03527589bc, dbpp=0xfffffe03527589c0, flags=0)
    at /usr/main-src/sys/contrib/openzfs/module/zfs/dmu.c:543
#15 0xffffffff829fc6a1 in dmu_buf_hold_array (os=<optimized out>, object=<optimized out>, read=0, numbufsp=0xfffffe03527589bc, dbpp=0xfffffe03527589c0, offset=<optimized out>, length=<optimized out>, tag=<optimized out>)
    at /usr/main-src/sys/contrib/openzfs/module/zfs/dmu.c:654
#16 dmu_brt_clone (os=os@entry=0xfffff8010ae0e000, object=<optimized out>, offset=offset@entry=2560, length=length@entry=2560, tx=tx@entry=0xfffff81aaeb6e100, bps=bps@entry=0xfffffe0595931000, nbps=1, replay=0)
    at /usr/main-src/sys/contrib/openzfs/module/zfs/dmu.c:2301
#17 0xffffffff82b4440a in zfs_clone_range (inzp=0xfffff8100054c910, inoffp=0xfffff81910c3c7c8, outzp=0xfffff80fb3233000, outoffp=0xfffff819860a2c78, lenp=lenp@entry=0xfffffe0352758c00, cr=0xfffff80e32335200)
    at /usr/main-src/sys/contrib/openzfs/module/zfs/zfs_vnops.c:1302
#18 0xffffffff829b4ece in zfs_freebsd_copy_file_range (ap=0xfffffe0352758c58)
    at /usr/main-src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:6294
#19 0xffffffff80c7160e in VOP_COPY_FILE_RANGE (invp=<optimized out>, inoffp=0x40, outvp=0xfffffe03527581d0, outoffp=0xffffffff811e6eb7, lenp=0x0, flags=0, incred=0xfffff80e32335200, outcred=0x0, fsizetd=0xfffffe03586c0720)
    at ./vnode_if.h:2381
#20 vn_copy_file_range (invp=invp@entry=0xfffff8095e1e8000, inoffp=0x40, inoffp@entry=0xfffff81910c3c7c8, outvp=0xfffffe03527581d0, outvp@entry=0xfffff805d6107380, outoffp=0xffffffff811e6eb7, outoffp@entry=0xfffff819860a2c78, lenp=0x0, lenp@entry=0xfffffe0352758d50, flags=flags@entry=0, incred=0xfffff80e32335200, outcred=0xfffff80e32335200, fsize_td=0xfffffe03586c0720)
    at /usr/main-src/sys/kern/vfs_vnops.c:3085
#21 0xffffffff80c6b998 in kern_copy_file_range (td=td@entry=0xfffffe03586c0720, infd=<optimized out>, inoffp=0xfffff81910c3c7c8, inoffp@entry=0x0, outfd=<optimized out>, outoffp=0xfffff819860a2c78, outoffp@entry=0x0, len=9223372036854775807, flags=0)
    at /usr/main-src/sys/kern/vfs_syscalls.c:4971
#22 0xffffffff80c6bab8 in sys_copy_file_range (td=0xfffffe03586c0720, uap=0xfffffe03586c0b20)
    at /usr/main-src/sys/kern/vfs_syscalls.c:5009
#23 0xffffffff8104bab9 in syscallenter (td=0xfffffe03586c0720)
    at /usr/main-src/sys/amd64/amd64/../../kern/subr_syscall.c:187
#24 amd64_syscall (td=0xfffffe03586c0720, traced=0)
    at /usr/main-src/sys/amd64/amd64/trap.c:1197
#25 <signal handler called>
#26 0x00001ce4506d155a in ?? ()
Backtrace stopped: Cannot access memory at address 0x1ce44ec71e88
(kgdb)

Context details follow.

Absent an openzfs-2.2 in:

# ls -C1 /usr/share/zfs/compatibility.d/openzfs-2.*
/usr/share/zfs/compatibility.d/openzfs-2.0-freebsd
/usr/share/zfs/compatibility.d/openzfs-2.0-linux
/usr/share/zfs/compatibility.d/openzfs-2.1-freebsd
/usr/share/zfs/compatibility.d/openzfs-2.1-linux

I have copied:

/usr/main-src/sys/contrib/openzfs/cmd/zpool/compatibility.d/openzfs-2.2

over to:

# ls -C1 /etc/zfs/compatibility.d/*
/etc/zfs/compatibility.d/openzfs-2.2

and used it:

# zpool get compatibility zamd64
NAME    PROPERTY       VALUE        SOURCE
zamd64  compatibility  openzfs-2.2  local

For reference:

# zpool upgrade
This system supports ZFS pool feature flags.

All pools are formatted using feature flags.

Some supported features are not enabled on the following pools. Once a
feature is enabled the pool may become incompatible with software
that does not support the feature. See zpool-features(7) for details.

Note that the pool 'compatibility' feature can be used to inhibit
feature upgrades.

POOL  FEATURE
---------------
zamd64
      redaction_list_spill

which agrees with openzfs-2.2.

I did:

# sysctl vfs.zfs.bclone_enabled=1
vfs.zfs.bclone_enabled: 0 -> 1

I also made a snapshot, zamd64@before-bclone-test, and then made a
checkpoint. These were established just after the above enable.
I then did a:

zpool trim -w zamd64

The poudriere bulk command was:

poudriere bulk -jmain-amd64-bulk_a -a

where main-amd64-bulk_a has nothing prebuilt.
USE_TMPFS=no is in use. No form of ALLOW_MAKE_JOBS is in use.
It is a 32-builder context (32 hardware threads).

For reference:

# uname -apKU
FreeBSD amd64-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT amd64 1500000 #118 main-n265152-f49d6f583e9d-dirty: Mon Sep  4 14:26:56 PDT 2023     root@amd64_ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG amd64 amd64 1500000 1500000

I'll note that with openzfs-2.1-freebsd compatibility I'd previously
let such a bulk -a run go for about 10 hr, and it had reached 6366
port->package builds. Prior to that I'd done shorter experiments with
default zpool features (no explicit compatibility constraint) but with
vfs.zfs.bclone_enabled=0, and I'd had no problems. (I have a separate
M.2 boot medium just for such experiments and can reconstruct its
content at will.)

All these have been based on the same personal
main-n265152-f49d6f583e9d-dirty system build. Unfortunately, no
appropriate snapshot of main was available to avoid my personal
context being involved for the system build used. Similarly, the
snapshot(s) of stable/14 predate:

Sun, 03 Sep 2023
. . .
git: f789381671a3 - stable/14 - zfs: merge openzfs/zfs@32949f256 (zfs-2.2-release) into stable/14

which has required fixes for other issues.

===
Mark Millard
marklmi at yahoo.com