Date:      Fri, 8 Sep 2023 15:52:02 +0200
From:      Martin Matuska <mm@FreeBSD.org>
To:        Mark Millard <marklmi@yahoo.com>, Alexander Motin <mav@FreeBSD.org>, Glen Barber <gjb@FreeBSD.org>
Cc:        Current FreeBSD <freebsd-current@freebsd.org>, FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>, Pawel Jakub Dawidek <pjd@freebsd.org>
Subject:   Re: main [and, likely, stable/14]: do not set vfs.zfs.bclone_enabled=1 with that zpool feature enabled because it still leads to panics
Message-ID:  <15df58d3-4603-132f-112e-d10a6d4419bf@FreeBSD.org>
In-Reply-To: <8B8B3707-4B37-4621-8124-D6A77CAF6879@yahoo.com>
References:  <7CE2CAAF-8BB0-4422-B194-4A6B0A4BC12C@yahoo.com> <08B7E72B-78F1-4ACA-B09D-E8C34BCE2335@yahoo.com> <20230907184823.GC4090@FreeBSD.org> <F4ED7034-6776-402C-8706-DED08F41455E@yahoo.com> <4f4e2b68-57e0-a475-e2bd-1f2b8844ebfe@FreeBSD.org> <354C5B8C-4216-4171-B8C2-8E827817F8E5@yahoo.com> <8B8B3707-4B37-4621-8124-D6A77CAF6879@yahoo.com>

I dug around a little and was able to reproduce the panic without poudriere, 
using just a shell script.

You may want to increase "repeats".
The script triggers the panic in dmu_buf_hold_array_by_dnode() on my 
VirtualBox VM on the 9th iteration, during the cat command.

Here is the script:

#!/bin/sh
nl='
'
sed_script=s/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb/
for ac_i in 1 2 3 4 5 6 7; do
     sed_script="$sed_script$nl$sed_script"
done
echo "$sed_script" 2>/dev/null | sed 99q >conftest.sed

repeats=8
count=0
echo -n 0123456789 >"conftest.in"
while :
do
     cat "conftest.in" "conftest.in" >"conftest.tmp"
     mv "conftest.tmp" "conftest.in"
     cp "conftest.in" "conftest.nl"
     echo '' >> "conftest.nl"
     sed -f conftest.sed < "conftest.nl" >"conftest.out" 2>/dev/null || break
     diff "conftest.out" "conftest.nl" >/dev/null 2>&1 || break
     count=$(($count + 1))
     echo "count: $count"
     # 10*(2^10) chars as input seems more than enough
     test $count -gt $repeats && break
done
rm -f conftest.in conftest.tmp conftest.nl conftest.out
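
For what it is worth, the cat and cp calls in the script end up in
copy_file_range(2), i.e. the same zfs_clone_range() path as in the
backtrace below, so the reproducer is only meaningful while block
cloning is actually in effect. A minimal pre-flight sketch (the pool
name zamd64 is taken from Mark's report and the script file name is
made up, adjust both):

#!/bin/sh
# Sketch: confirm block cloning can kick in before running the reproducer.
# Run it from a directory that lives on the ZFS pool being tested.
sysctl vfs.zfs.bclone_enabled             # needs to be 1, otherwise ZFS falls back to ordinary copies
zpool get feature@block_cloning zamd64    # pool feature should show enabled/active
sh ./bclone-repro.sh                      # the reproducer script above, saved under a made-up name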


On 8. 9. 2023 6:32, Mark Millard wrote:
> [Today's main-snapshot kernel panics as well.]
>
> On Sep 7, 2023, at 16:32, Mark Millard <marklmi@yahoo.com> wrote:
>
>> On Sep 7, 2023, at 13:07, Alexander Motin <mav@FreeBSD.org> wrote:
>>
>>> Thanks, Mark.
>>>
>>> On 07.09.2023 15:40, Mark Millard wrote:
>>>> On Sep 7, 2023, at 11:48, Glen Barber <gjb@FreeBSD.org> wrote:
>>>>> On Thu, Sep 07, 2023 at 11:17:22AM -0700, Mark Millard wrote:
>>>>>> When I next have time, should I retry based on a more recent
>>>>>> vintage of main that includes 969071be938c ?
>>>>> Yes, please, if you can.
>>>> As it stands, I rebooted that machine into my normal
>>>> environment, so the after-crash-with-dump-info
>>>> context is preserved. I'll presume lack of a need
>>>> to preserve that context unless I hear otherwise.
>>>> (But I'll work on this until later today.)
>>>> Even my normal environment predates the commit in
>>>> question by a few commits. So I'll end up doing a
>>>> more general round of updates overall.
>>>> Someone can let me know if there is a preference
>>>> for debug over non-debug for the next test run.
>>> It is not unknown for some bugs to disappear once debugging is enabled, due to different execution timings, but generally a debug kernel may detect the problem closer to its origin instead of leaving us looking at random consequences. I am only starting to look at this report (unless Pawel or somebody beats me to it), and don't have additional requests yet, but if you can repeat the same with a debug kernel (the in-base ZFS's ZFS_DEBUG setting follows the kernel's INVARIANTS), it may give us some additional information.
>> So I did a zpool import, rewinding to the checkpoint.
>> (This depends on the questionable zfs doing fully as
>> desired for this. Notably the normal environment has
>> vfs.zfs.bclone_enabled=0 , including when it was
>> doing this activity.) My normal environment reported
>> no problems.
>>
>> Note: the earlier snapshot from my first setup was
>> still in place since it was made just before the
>> original checkpoint used above.
>>
>> However, the rewind did remove the /var/crash/
>> material that had been added.
>>
>> I did the appropriate zfs mount.
>>
>> I installed a debug kernel and world to the import. Again,
>> no problems reported.
>>
>> I did the appropriate zfs umount.
>>
>> I did the appropriate zpool export.
>>
>> I rebooted with the test media.
>>
>> # sysctl vfs.zfs.bclone_enabled
>> vfs.zfs.bclone_enabled: 1
>>
>> # zpool trim -w zamd64
>>
>> # zpool checkpoint zamd64
>>
>> # uname -apKU
>> FreeBSD amd64-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT amd64 1500000 #74 main-n265188-117c54a78ccd-dirty: Tue Sep  5 21:29:53 PDT 2023     root@amd64-ZFS:/usr/obj/BUILDs/main-amd64-dbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-DBG amd64 amd64 1500000 1500000
>>
>> (So, before the 969071be938c vintage, same sources as for
>> my last run but a debug build.)
>>
>> # poudriere bulk -jmain-amd64-bulk_a -a
>> . . .
>> [00:03:23] Building 34214 packages using up to 32 builders
>> [00:03:23] Hit CTRL+t at any time to see build progress and stats
>> [00:03:23] [01] [00:00:00] Builder starting
>> [00:04:19] [01] [00:00:56] Builder started
>> [00:04:20] [01] [00:00:01] Building ports-mgmt/pkg | pkg-1.20.6
>> [00:05:33] [01] [00:01:14] Finished ports-mgmt/pkg | pkg-1.20.6: Success
>> [00:05:53] [01] [00:00:00] Building print/indexinfo | indexinfo-0.3.1
>> [00:05:53] [02] [00:00:00] Builder starting
>> . . .
>> [00:05:54] [32] [00:00:00] Builder starting
>> [00:06:11] [01] [00:00:18] Finished print/indexinfo | indexinfo-0.3.1: Success
>> [00:06:12] [01] [00:00:00] Building devel/gettext-runtime | gettext-runtime-0.22_1
>> [00:08:24] [01] [00:02:12] Finished devel/gettext-runtime | gettext-runtime-0.22_1: Success
>> [00:08:31] [01] [00:00:00] Building devel/libtextstyle | libtextstyle-0.22
>> [00:10:06] [05] [00:04:13] Builder started
>> [00:10:06] [05] [00:00:00] Building devel/autoconf-switch | autoconf-switch-20220527
>> [00:10:06] [31] [00:04:12] Builder started
>> [00:10:06] [31] [00:00:00] Building devel/libatomic_ops | libatomic_ops-7.8.0
>> . . .
>>
>> Crashed again, with 158 *.pkg files in .building/All/ after
>> rebooting.
>>
>> The crash is similar to the non-debug one. No extra output
>> from the debug build.
>>
>> For reference:
>>
>> Unread portion of the kernel message buffer:
>> panic: Solaris(panic): zfs: accessing past end of object 422/10b1c02 (size=2560 access=2560+2560)
>> . . .
> Same world with newer snapshot main kernel that should
> be compatible with the world:
>
> # uname -apKU
> FreeBSD amd64-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT amd64 1500000 #0 main-n265205-03a7c36ddbc0: Thu Sep  7 03:10:34 UTC 2023     root@releng3.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64 amd64 1500000 1500000
>
>
> Steps:
>
> #NOTE: (re)boot to normal environment
> #NOTE: login
>
> cd ~/artifacts/
>
> #NOTE: as needed . . .
> fetch http://ftp3.freebsd.org/pub/FreeBSD/snapshots/amd64/15.0-CURRENT/kernel.txz
> fetch http://ftp3.freebsd.org/pub/FreeBSD/snapshots/amd64/15.0-CURRENT/kernel-dbg.txz
> fetch http://ftp3.freebsd.org/pub/FreeBSD/snapshots/amd64/15.0-CURRENT/src.txz
>
> #NOTE: /zamd64-mnt is a pre-existing empty directory for such use:
> zpool import --rewind-to-checkpoint -f -N -R /zamd64-mnt -t zamd64 zpamd64
> zpool checkpoint zpamd64
>
> #NOTE: My BE's for bectl use have mountpoint=none normally.
> zfs set mountpoint=/ zpamd64/ROOT/main-amd64
> zfs mount zpamd64/ROOT/main-amd64
>
> mv /zamd64-mnt/boot/kernel /zamd64-mnt/boot/kernel.mm
> tar -xpf kernel.txz -C /zamd64-mnt/
>
> mv /zamd64-mnt/usr/lib/debug/boot/kernel /zamd64-mnt/usr/lib/debug/boot/kernel.mm
> tar -xpf kernel-dbg.txz -C /zamd64-mnt/
>
> zfs umount zpamd64/ROOT/main-amd64
> zfs set mountpoint=none zpamd64/ROOT/main-amd64
>
> zfs mount zpamd64/usr/src
> #NOTE: /zamd64-mnt/usr/src/ being empty initially
> #NOTE: My normal environment's source is in a worktree /usr/main-src/
> tar -xpf src.txz -C /zamd64-mnt/
> zfs umount zpamd64/usr/src
>
> zpool export zpamd64
>
> #NOTE: reboot to zamd64 #(my test environment)
> #NOTE: login
>
> uname -apKU
>
> sysctl vfs.zfs.bclone_enabled
>
> zpool trim -w zamd64
> poudriere bulk -jmain-amd64-bulk_a -a
>
> Panicked.
>
> #NOTE: dump
> #NOTE: reboot to zamd64 #(my test environment)
> #NOTE: login
>
> ls -Tlod /usr/local/poudriere/data/packages/main-amd64-bulk_a-default/.building/All/*.pkg | wc -l
>       122
>
> less /var/crash/core.txt.3
> . . .
> Unread portion of the kernel message buffer:
> panic: Solaris(panic): zfs: accessing past end of object 422/10d670b (size=2560 access=2560+2560)
> cpuid = 18
> time = 1694146387
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe033f721590
> vpanic() at vpanic+0x132/frame 0xfffffe033f7216c0
> panic() at panic+0x43/frame 0xfffffe033f721720
> vcmn_err() at vcmn_err+0xeb/frame 0xfffffe033f721850
> zfs_panic_recover() at zfs_panic_recover+0x59/frame 0xfffffe033f7218b0
> dmu_buf_hold_array_by_dnode() at dmu_buf_hold_array_by_dnode+0xb8/frame 0xfffffe033f721960
> dmu_brt_clone() at dmu_brt_clone+0x61/frame 0xfffffe033f7219e0
> zfs_clone_range() at zfs_clone_range+0xaa3/frame 0xfffffe033f721bb0
> zfs_freebsd_copy_file_range() at zfs_freebsd_copy_file_range+0x18a/frame 0xfffffe033f721c30
> vn_copy_file_range() at vn_copy_file_range+0x163/frame 0xfffffe033f721ce0
> kern_copy_file_range() at kern_copy_file_range+0x36c/frame 0xfffffe033f721db0
> sys_copy_file_range() at sys_copy_file_range+0x78/frame 0xfffffe033f721e00
> amd64_syscall() at amd64_syscall+0x138/frame 0xfffffe033f721f30
> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe033f721f30
> --- syscall (569, FreeBSD ELF64, copy_file_range), rip = 0x2c05a58a355a, rsp = 0x2c05a4312138, rbp = 0x2c05a43125d0 ---
> KDB: enter: panic
>
> __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
> 57              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
> (kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
> #1  doadump (textdump=textdump@entry=0)
>      at /usr/src/sys/kern/kern_shutdown.c:405
> #2  0xffffffff804a401a in db_dump (dummy=<optimized out>,      dummy2=<optimized out>, dummy3=<optimized out>, dummy4=<optimized out>)
>      at /usr/src/sys/ddb/db_command.c:591
> #3  0xffffffff804a3e1d in db_command (last_cmdp=<optimized out>,      cmd_table=<optimized out>, dopager=true)
>      at /usr/src/sys/ddb/db_command.c:504
> #4  0xffffffff804a3add in db_command_loop ()
>      at /usr/src/sys/ddb/db_command.c:551
> #5  0xffffffff804a71b6 in db_trap (type=<optimized out>, code=<optimized out>)
>      at /usr/src/sys/ddb/db_main.c:268
> #6  0xffffffff80b9dfb3 in kdb_trap (type=type@entry=3, code=code@entry=0,      tf=tf@entry=0xfffffe033f7214d0) at /usr/src/sys/kern/subr_kdb.c:790
> #7  0xffffffff8104d579 in trap (frame=0xfffffe033f7214d0)
>      at /usr/src/sys/amd64/amd64/trap.c:608
> #8  <signal handler called>
> #9  kdb_enter (why=<optimized out>, msg=<optimized out>)
>      at /usr/src/sys/kern/subr_kdb.c:556
> #10 0xffffffff80b4f683 in vpanic (fmt=0xffffffff8223d54e "%s%s",      ap=ap@entry=0xfffffe033f721700) at /usr/src/sys/kern/kern_shutdown.c:958
> #11 0xffffffff80b4f463 in panic (
>      fmt=0xffffffff8196c800 <cnputs_mtx> "$\222\024\201\377\377\377\377")
>      at /usr/src/sys/kern/kern_shutdown.c:894
> #12 0xffffffff81f9964b in vcmn_err (ce=<optimized out>,      fmt=0xffffffff82274462 "zfs: accessing past end of object %llx/%llx (size=%u access=%llu+%llu)", adx=0xfffffe033f721890)
>      at /usr/src/sys/contrib/openzfs/module/os/freebsd/spl/spl_cmn_err.c:60
> #13 0xffffffff820b6a59 in zfs_panic_recover (
>      fmt=0x12 <error: Cannot access memory at address 0x12>)
>      at /usr/src/sys/contrib/openzfs/module/zfs/spa_misc.c:1594
> #14 0xffffffff820150d8 in dmu_buf_hold_array_by_dnode (dn=0xfffff802c3c41be8,      offset=offset@entry=2560, length=length@entry=2560, read=read@entry=0,      tag=0xffffffff8222139b, numbufsp=numbufsp@entry=0xfffffe033f7219ac,      dbpp=0xfffffe033f7219b0, flags=0)
>      at /usr/src/sys/contrib/openzfs/module/zfs/dmu.c:543
> #15 0xffffffff82018c31 in dmu_buf_hold_array (os=<optimized out>,      object=<optimized out>, read=0, numbufsp=0xfffffe033f7219ac,      dbpp=0xfffffe033f7219b0, offset=<optimized out>, length=<optimized out>,      tag=<optimized out>) at /usr/src/sys/contrib/openzfs/module/zfs/dmu.c:654
> #16 dmu_brt_clone (os=os@entry=0xfffff8019696b800, object=<optimized out>,      offset=offset@entry=2560, length=length@entry=2560,      tx=tx@entry=0xfffff81433e03d00, bps=bps@entry=0xfffffe0442ce5000, nbps=1,      replay=0) at /usr/src/sys/contrib/openzfs/module/zfs/dmu.c:2301
> #17 0xffffffff82183333 in zfs_clone_range (inzp=0xfffff80e37f64000,      inoffp=0xfffff8019ab32b38, outzp=0xfffff80d931a5910,      outoffp=0xfffff80662d0a548, lenp=lenp@entry=0xfffffe033f721bf0,      cr=0xfffff8051d1c4800)
>      at /usr/src/sys/contrib/openzfs/module/zfs/zfs_vnops.c:1302
> #18 0xffffffff81fbc76a in zfs_freebsd_copy_file_range (ap=0xfffffe033f721c48)
>      at /usr/src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:6294
> #19 0xffffffff80c612a3 in VOP_COPY_FILE_RANGE (invp=0xfffff80eb09c08c0,      inoffp=0xfffff8019ab32b38, outvp=0xfffff80e7f998000,      outoffp=0xfffff80662d0a548, lenp=0xfffffe033f721d40,      incred=0xfffff8051d1c4800, flags=<optimized out>,      outcred=<optimized out>, fsizetd=<optimized out>) at ./vnode_if.h:2385
> #20 vn_copy_file_range (invp=invp@entry=0xfffff80eb09c08c0,      inoffp=inoffp@entry=0xfffff8019ab32b38,      outvp=outvp@entry=0xfffff80e7f998000, outoffp=0xfffff80662d0a548,      lenp=lenp@entry=0xfffffe033f721d40, flags=flags@entry=0,      incred=0xfffff8051d1c4800, outcred=0xfffff8051d1c4800,      fsize_td=0xfffffe033fdf1c80) at /usr/src/sys/kern/vfs_vnops.c:3087
> #21 0xffffffff80c5be9c in kern_copy_file_range (
>      td=td@entry=0xfffffe033fdf1c80, infd=<optimized out>,      inoffp=0xfffff8019ab32b38, inoffp@entry=0x0, outfd=<optimized out>,      outoffp=0xffffffff8120f47f, outoffp@entry=0x0, len=9223372036854775807,      flags=0) at /usr/src/sys/kern/vfs_syscalls.c:4971
> #22 0xffffffff80c5bfa8 in sys_copy_file_range (td=0xfffffe033fdf1c80,      uap=0xfffffe033fdf2080) at /usr/src/sys/kern/vfs_syscalls.c:5009
> #23 0xffffffff8104e3d8 in syscallenter (td=<optimized out>)
>      at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:187
> #24 amd64_syscall (td=0xfffffe033fdf1c80, traced=0)
>      at /usr/src/sys/amd64/amd64/trap.c:1197
> #25 <signal handler called>
> #26 0x00002c05a58a355a in ?? ()
> Backtrace stopped: Cannot access memory at address 0x2c05a4312138
> (kgdb)
> . . .
>
>
> So the problem is not special to my personal kernel builds.
>
>
>>>> Looking at "git: 969071be938c - main", the relevant
>>>> part seems to be just (white space possibly not
>>>> preserved accurately):
>>>> diff --git a/sys/kern/vfs_vnops.c b/sys/kern/vfs_vnops.c
>>>> index 9fb5aee6a023..4e4161ef1a7f 100644
>>>> --- a/sys/kern/vfs_vnops.c
>>>> +++ b/sys/kern/vfs_vnops.c
>>>> @@ -3076,12 +3076,14 @@ vn_copy_file_range(struct vnode *invp, off_t *inoffp, struct vnode *outvp,
>>>>   goto out;
>>>>     /*
>>>> - * If the two vnode are for the same file system, call
>>>> + * If the two vnodes are for the same file system type, call
>>>>   * VOP_COPY_FILE_RANGE(), otherwise call vn_generic_copy_file_range()
>>>> - * which can handle copies across multiple file systems.
>>>> + * which can handle copies across multiple file system types.
>>>>   */
>>>>   *lenp = len;
>>>> - if (invp->v_mount == outvp->v_mount)
>>>> + if (invp->v_mount == outvp->v_mount ||
>>>> +    strcmp(invp->v_mount->mnt_vfc->vfc_name,
>>>> +    outvp->v_mount->mnt_vfc->vfc_name) == 0)
>>>>   error = VOP_COPY_FILE_RANGE(invp, inoffp, outvp, outoffp,
>>>>      lenp, flags, incred, outcred, fsize_td);
>>>>   else
>>>> That looks to call VOP_COPY_FILE_RANGE in more contexts and
>>>> vn_generic_copy_file_range in fewer.
>>>> The backtrace I reported involves: VOP_COPY_FILE_RANGE
>>>> So it appears this change is unlikely to invalidate my
>>>> test result,  although failure might happen sooner if
>>>> more VOP_COPY_FILE_RANGE calls happen with the newer code.
>>> Your logic is likely right, but if you have block cloning requests both within and between datasets, this patch may change the pattern.  Though it is obviously not a fix for the issue.  I responded to the commit email only because it makes no difference while vfs.zfs.bclone_enabled is 0.
>>>
>>>> That in turn means that someone may come up with some
>>>> other change for me to test by the time I get around to
>>>> setting up another test. Let me know if so.
>>> . . .
>
> ===
> Mark Millard
> marklmi at yahoo.com
>
>


