Date:      Thu, 7 Sep 2023 11:17:22 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Martin Matuska <mm@FreeBSD.org>, Current FreeBSD <freebsd-current@freebsd.org>, FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>
Cc:        Glen Barber <gjb@FreeBSD.org>, Pawel Jakub Dawidek <pjd@FreeBSD.org>, Alexander Motin <mav@FreeBSD.org>
Subject:   Re: main [and, likely, stable/14]: do not set vfs.zfs.bclone_enabled=1 with that zpool feature enabled because it still leads to panics
Message-ID:  <08B7E72B-78F1-4ACA-B09D-E8C34BCE2335@yahoo.com>
In-Reply-To: <7CE2CAAF-8BB0-4422-B194-4A6B0A4BC12C@yahoo.com>
References:  <7CE2CAAF-8BB0-4422-B194-4A6B0A4BC12C@yahoo.com>

[Drat, the request to rerun my tests did not mention
the more recent change:

vfs: copy_file_range() between multiple mountpoints of the same fs type

and I'd not noticed on my own and ran the test without updating.]


On Sep 7, 2023, at 11:02, Mark Millard <marklmi@yahoo.com> wrote:

> I was requested to do a test with vfs.zfs.bclone_enabled=1 and
> the bulk -a build panicked (having stored 128 *.pkg files in
> .building/ first):

Unfortunately, rerunning my tests with this set was testing a
context predating:

Wed, 06 Sep 2023
. . .
    • git: 969071be938c - main - vfs: copy_file_range() between multiple mountpoints of the same fs type Martin Matuska

So the information might be out of date for main and for
stable/14 : I've no clue how good of a test it was.

Maybe some of those I've cc'd would know.

When I next have time, should I retry based on a more recent
vintage of main that includes 969071be938c ?
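
One way to confirm that a source tree already contains that commit before
retesting is to ask git whether the commit is an ancestor of the checked-out
HEAD. The following is only a minimal sketch, not part of the original
report: the helper name tree_has_commit is hypothetical, and /usr/main-src
is simply the source-tree path that shows up in the quoted kgdb output.

```shell
# Hypothetical helper (not from the original report).
# $1 = path to a git checkout, $2 = commit hash (full or abbreviated).
# Exit status 0 means the commit is an ancestor of (or equal to) HEAD.
tree_has_commit() {
    git -C "$1" merge-base --is-ancestor "$2" HEAD
}
```

A possible use would be:
tree_has_commit /usr/main-src 969071be938c && echo "tree includes the vfs change"
A non-zero status covers both "not an ancestor" and "commit unknown to this
checkout", so it does not by itself say which of the two applies.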

> # more /var/crash/core.txt.3
> . . .
> Unread portion of the kernel message buffer:
> panic: Solaris(panic): zfs: accessing past end of object 422/1108c16 (size=2560 access=2560+2560)
> cpuid = 15
> time = 1694103674
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0352758590
> vpanic() at vpanic+0x132/frame 0xfffffe03527586c0
> panic() at panic+0x43/frame 0xfffffe0352758720
> vcmn_err() at vcmn_err+0xeb/frame 0xfffffe0352758850
> zfs_panic_recover() at zfs_panic_recover+0x59/frame 0xfffffe03527588b0
> dmu_buf_hold_array_by_dnode() at dmu_buf_hold_array_by_dnode+0x97/frame 0xfffffe0352758960
> dmu_brt_clone() at dmu_brt_clone+0x61/frame 0xfffffe03527589f0
> zfs_clone_range() at zfs_clone_range+0xa6a/frame 0xfffffe0352758bc0
> zfs_freebsd_copy_file_range() at zfs_freebsd_copy_file_range+0x1ae/frame 0xfffffe0352758c40
> vn_copy_file_range() at vn_copy_file_range+0x11e/frame 0xfffffe0352758ce0
> kern_copy_file_range() at kern_copy_file_range+0x338/frame 0xfffffe0352758db0
> sys_copy_file_range() at sys_copy_file_range+0x78/frame 0xfffffe0352758e00
> amd64_syscall() at amd64_syscall+0x109/frame 0xfffffe0352758f30
> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0352758f30
> --- syscall (569, FreeBSD ELF64, copy_file_range), rip = 0x1ce4506d155a, rsp = 0x1ce44ec71e88, rbp = 0x1ce44ec72320 ---
> KDB: enter: panic
>
> __curthread () at /usr/main-src/sys/amd64/include/pcpu_aux.h:57
> 57              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
> (kgdb) #0  __curthread () at /usr/main-src/sys/amd64/include/pcpu_aux.h:57
> #1  doadump (textdump=textdump@entry=0)
>   at /usr/main-src/sys/kern/kern_shutdown.c:405
> #2  0xffffffff804a442a in db_dump (dummy=<optimized out>, dummy2=<optimized out>, dummy3=<optimized out>, dummy4=<optimized out>)
>   at /usr/main-src/sys/ddb/db_command.c:591
> #3  0xffffffff804a422d in db_command (last_cmdp=<optimized out>, cmd_table=<optimized out>, dopager=true)
>   at /usr/main-src/sys/ddb/db_command.c:504
> #4  0xffffffff804a3eed in db_command_loop ()
>   at /usr/main-src/sys/ddb/db_command.c:551
> #5  0xffffffff804a7876 in db_trap (type=<optimized out>, code=<optimized out>)
>   at /usr/main-src/sys/ddb/db_main.c:268
> #6  0xffffffff80bb9e57 in kdb_trap (type=type@entry=3, code=code@entry=0, tf=tf@entry=0xfffffe03527584d0)
>   at /usr/main-src/sys/kern/subr_kdb.c:790
> #7  0xffffffff8104ad3d in trap (frame=0xfffffe03527584d0)
>   at /usr/main-src/sys/amd64/amd64/trap.c:608
> #8  <signal handler called>
> #9  kdb_enter (why=<optimized out>, msg=<optimized out>)
>   at /usr/main-src/sys/kern/subr_kdb.c:556
> #10 0xffffffff80b6aab3 in vpanic (fmt=0xffffffff82be52d6 "%s%s", ap=ap@entry=0xfffffe0352758700)
>   at /usr/main-src/sys/kern/kern_shutdown.c:958
> #11 0xffffffff80b6a943 in panic (
>   fmt=0xffffffff820aa2e8 <vt_conswindow+16> "\312C$\201\377\377\377\377")
>   at /usr/main-src/sys/kern/kern_shutdown.c:894
> #12 0xffffffff82993c5b in vcmn_err (ce=<optimized out>, fmt=0xffffffff82bfdd1f "zfs: accessing past end of object %llx/%llx (size=%u access=%llu+%llu)", adx=0xfffffe0352758890)
>   at /usr/main-src/sys/contrib/openzfs/module/os/freebsd/spl/spl_cmn_err.c:60
> #13 0xffffffff82a84d69 in zfs_panic_recover (
>   fmt=0x12 <error: Cannot access memory at address 0x12>)
>   at /usr/main-src/sys/contrib/openzfs/module/zfs/spa_misc.c:1594
> #14 0xffffffff829f8e27 in dmu_buf_hold_array_by_dnode (dn=0xfffff813dfc48978, offset=offset@entry=2560, length=length@entry=2560, read=read@entry=0, tag=0xffffffff82bd8175, numbufsp=numbufsp@entry=0xfffffe03527589bc, dbpp=0xfffffe03527589c0, flags=0)
>   at /usr/main-src/sys/contrib/openzfs/module/zfs/dmu.c:543
> #15 0xffffffff829fc6a1 in dmu_buf_hold_array (os=<optimized out>, object=<optimized out>, read=0, numbufsp=0xfffffe03527589bc, dbpp=0xfffffe03527589c0, offset=<optimized out>, length=<optimized out>, tag=<optimized out>)
>   at /usr/main-src/sys/contrib/openzfs/module/zfs/dmu.c:654
> #16 dmu_brt_clone (os=os@entry=0xfffff8010ae0e000, object=<optimized out>, offset=offset@entry=2560, length=length@entry=2560, tx=tx@entry=0xfffff81aaeb6e100, bps=bps@entry=0xfffffe0595931000, nbps=1, replay=0)
>   at /usr/main-src/sys/contrib/openzfs/module/zfs/dmu.c:2301
> #17 0xffffffff82b4440a in zfs_clone_range (inzp=0xfffff8100054c910, inoffp=0xfffff81910c3c7c8, outzp=0xfffff80fb3233000, outoffp=0xfffff819860a2c78, lenp=lenp@entry=0xfffffe0352758c00, cr=0xfffff80e32335200)
>   at /usr/main-src/sys/contrib/openzfs/module/zfs/zfs_vnops.c:1302
> #18 0xffffffff829b4ece in zfs_freebsd_copy_file_range (ap=0xfffffe0352758c58)
>   at /usr/main-src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:6294
> #19 0xffffffff80c7160e in VOP_COPY_FILE_RANGE (invp=<optimized out>, inoffp=0x40, outvp=0xfffffe03527581d0, outoffp=0xffffffff811e6eb7, lenp=0x0, flags=0, incred=0xfffff80e32335200, outcred=0x0, fsizetd=0xfffffe03586c0720)
>   at ./vnode_if.h:2381
> #20 vn_copy_file_range (invp=invp@entry=0xfffff8095e1e8000, inoffp=0x40, inoffp@entry=0xfffff81910c3c7c8, outvp=0xfffffe03527581d0, outvp@entry=0xfffff805d6107380, outoffp=0xffffffff811e6eb7, outoffp@entry=0xfffff819860a2c78, lenp=0x0, lenp@entry=0xfffffe0352758d50, flags=flags@entry=0, incred=0xfffff80e32335200, outcred=0xfffff80e32335200, fsize_td=0xfffffe03586c0720)
>   at /usr/main-src/sys/kern/vfs_vnops.c:3085
> #21 0xffffffff80c6b998 in kern_copy_file_range (
>   td=td@entry=0xfffffe03586c0720, infd=<optimized out>, inoffp=0xfffff81910c3c7c8, inoffp@entry=0x0, outfd=<optimized out>, outoffp=0xfffff819860a2c78, outoffp@entry=0x0, len=9223372036854775807, flags=0)
>   at /usr/main-src/sys/kern/vfs_syscalls.c:4971
> #22 0xffffffff80c6bab8 in sys_copy_file_range (td=0xfffffe03586c0720, uap=0xfffffe03586c0b20)
>   at /usr/main-src/sys/kern/vfs_syscalls.c:5009
> #23 0xffffffff8104bab9 in syscallenter (td=0xfffffe03586c0720)
>   at /usr/main-src/sys/amd64/amd64/../../kern/subr_syscall.c:187
> #24 amd64_syscall (td=0xfffffe03586c0720, traced=0)
>   at /usr/main-src/sys/amd64/amd64/trap.c:1197
> #25 <signal handler called>
> #26 0x00001ce4506d155a in ?? ()
> Backtrace stopped: Cannot access memory at address 0x1ce44ec71e88
> (kgdb)
>
>
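
The numbers in the panic text can be checked by hand: the access starts at
offset 2560 and runs for 2560 bytes, so it would end at 5120, while the
reported size is 2560. The sketch below does that arithmetic on the message
string. It assumes, from the wording of the message and from the
offset=2560, length=2560 arguments shown for dmu_buf_hold_array_by_dnode,
that the condition being reported is offset + length > size. The helper
name past_end is hypothetical.

```shell
# Hypothetical helper (not from the original report).
# $1 = panic text containing "size=S access=O+L".
# Prints "size=S end=E" where E = O + L.
# Exit status: 0 if E > S, 1 if not, 2 if the fields were not found.
past_end() {
    fields=$(printf '%s\n' "$1" |
        sed -n 's/.*size=\([0-9][0-9]*\) access=\([0-9][0-9]*\)+\([0-9][0-9]*\).*/\1 \2 \3/p')
    [ -n "$fields" ] || return 2
    # Split the three numbers into $1 (size), $2 (offset), $3 (length).
    set -- $fields
    end=$(( $2 + $3 ))
    echo "size=$1 end=$end"
    [ "$end" -gt "$1" ]
}
```

Called with the message from the dump above, it prints "size=2560 end=5120"
and returns success, meaning the requested range ends past the reported size.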
> Context details follow.
>
> Absent an openzfs-2.2 file in:
>
> ls -C1 /usr/share/zfs/compatibility.d/openzfs-2.*
> /usr/share/zfs/compatibility.d/openzfs-2.0-freebsd
> /usr/share/zfs/compatibility.d/openzfs-2.0-linux
> /usr/share/zfs/compatibility.d/openzfs-2.1-freebsd
> /usr/share/zfs/compatibility.d/openzfs-2.1-linux
>
> I have copied:
>
> /usr/main-src/sys/contrib/openzfs/cmd/zpool/compatibility.d/openzfs-2.2
>
> over to:
>
> # ls -C1 /etc/zfs/compatibility.d/*
> /etc/zfs/compatibility.d/openzfs-2.2
>
> and used it:
>
> # zpool get compatibility zamd64
> NAME    PROPERTY       VALUE          SOURCE
> zamd64  compatibility  openzfs-2.2    local
>
> For reference:
>
> # zpool upgrade
> This system supports ZFS pool feature flags.
>
> All pools are formatted using feature flags.
>
>
> Some supported features are not enabled on the following pools. Once a
> feature is enabled the pool may become incompatible with software
> that does not support the feature. See zpool-features(7) for details.
>
> Note that the pool 'compatibility' feature can be used to inhibit
> feature upgrades.
>
> POOL  FEATURE
> ---------------
> zamd64
>     redaction_list_spill
>
> which agrees with openzfs-2.2 .
>
> I did:
>
> # sysctl vfs.zfs.bclone_enabled=1
> vfs.zfs.bclone_enabled: 0 -> 1
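
For anyone reproducing this, a quick guard against the combination named in
the subject line could look like the sketch below. It is only a sketch under
assumptions: it takes "that zpool feature" to be the pool feature reported
as feature@block_cloning, it treats any state other than "disabled" as the
feature being in play, and the helper name bclone_in_play is hypothetical.

```shell
# Hypothetical helper (not from the original report).
# $1 = value of the vfs.zfs.bclone_enabled sysctl (0 or 1)
# $2 = state of the pool's feature@block_cloning property
#      (disabled, enabled, or active)
# Exit status 0 means the sysctl is on while the pool feature is not disabled.
bclone_in_play() {
    [ "$1" = "1" ] && [ "$2" != "disabled" ]
}
```

On the test system the two inputs could be gathered with:
bclone_in_play "$(sysctl -n vfs.zfs.bclone_enabled)" "$(zpool get -H -o value feature@block_cloning zamd64)" && echo "bclone active on zamd64"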
>
> I also made a snapshot: zamd64@before-bclone-test and
> I then made a checkpoint. These were established just
> after the above enable.
>
> I then did a: zpool trim -w zamd64
>
> The poudriere bulk command was: poudriere bulk -jmain-amd64-bulk_a -a
> where main-amd64-bulk_a has nothing prebuilt. USE_TMPFS=no
> is in use. No form of ALLOW_MAKE_JOBS is in use. It is a
> 32 builder context (32 hardware threads).
>
> For reference:
>
> # uname -apKU
> FreeBSD amd64-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT amd64 1500000 #118 main-n265152-f49d6f583e9d-dirty: Mon Sep  4 14:26:56 PDT 2023     root@amd64_ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG amd64 amd64 1500000 1500000
>
> I'll note that with openzfs-2.1-freebsd compatibility I'd
> previously let such a bulk -a run for about 10 hr and it
> had reached 6366 port->package builds.
>
> Prior to that I'd done shorter experiments with default
> zpool features (no explicit compatibility constraint)
> but vfs.zfs.bclone_enabled=0 and I'd had no problems.
>
> (I have a separate M.2 boot media just for such experiments
> and can reconstruct its content at will.)
>
> All these have been based on the same personal
> main-n265152-f49d6f583e9d-dirty system build. Unfortunately,
> no appropriate snapshot of main was available to avoid my
> personal context being involved for the system build used.
> Similarly, the snapshot(s) of stable/14 predate:
>
> Sun, 03 Sep 2023
> . . .
> git: f789381671a3 - stable/14 - zfs: merge openzfs/zfs@32949f256 (zfs-2.2-release) into stable/14
>
> that has required fixes for other issues.




===
Mark Millard
marklmi at yahoo.com



