From nobody Sat Aug 16 01:43:17 2025 X-Original-To: freebsd-current@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4c3hZq0JqXz64csW for ; Sat, 16 Aug 2025 01:43:31 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4c3hZn4mLfz3R0x for ; Sat, 16 Aug 2025 01:43:29 +0000 (UTC) (envelope-from kostikbel@gmail.com) Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=fail reason="No valid SPF, No valid DKIM" header.from=gmail.com (policy=none); spf=softfail (mx1.freebsd.org: 2001:470:d5e7:1::1 is neither permitted nor denied by domain of kostikbel@gmail.com) smtp.mailfrom=kostikbel@gmail.com Received: from tom.home (kib@localhost [127.0.0.1] (may be forged)) by kib.kiev.ua (8.18.1/8.18.1) with ESMTP id 57G1hHcV044787; Sat, 16 Aug 2025 04:43:20 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua 57G1hHcV044787 Received: (from kostik@localhost) by tom.home (8.18.1/8.18.1/Submit) id 57G1hHdj044786; Sat, 16 Aug 2025 04:43:17 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 16 Aug 2025 04:43:17 +0300 From: Konstantin Belousov To: Rick Macklem Cc: Bakul Shah , FreeBSD Current Subject: Re: zfs related panic Message-ID: References: <14C5523D-1F66-434F-A4D9-E14DA4BBF1E9@iitbombay.org> List-Id: Discussions about the use of FreeBSD-current List-Archive: https://lists.freebsd.org/archives/freebsd-current List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-current@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FORGED_GMAIL_RCVD,FREEMAIL_FROM, NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=4.0.1 X-Spam-Checker-Version: SpamAssassin 4.0.1 (2024-03-26) on tom.home X-Spamd-Bar: -- X-Spamd-Result: default: False [-2.99 / 15.00]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-1.00)[-0.999]; NEURAL_HAM_MEDIUM(-0.99)[-0.988]; DMARC_POLICY_SOFTFAIL(0.10)[gmail.com : No valid SPF, No valid DKIM,none]; MIME_GOOD(-0.10)[text/plain]; TAGGED_RCPT(0.00)[]; ARC_NA(0.00)[]; RCVD_TLS_LAST(0.00)[]; MISSING_XM_UA(0.00)[]; ASN(0.00)[asn:6939, ipnet:2001:470::/32, country:US]; FREEMAIL_FROM(0.00)[gmail.com]; MIME_TRACE(0.00)[0:+]; FREEMAIL_TO(0.00)[gmail.com]; R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[gmail.com]; FROM_HAS_DN(0.00)[]; FROM_EQ_ENVFROM(0.00)[]; HAS_XAW(0.00)[]; R_SPF_SOFTFAIL(0.00)[~all]; TO_MATCH_ENVRCPT_SOME(0.00)[]; TO_DN_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; MLMMJ_DEST(0.00)[freebsd-current@freebsd.org]; RCPT_COUNT_THREE(0.00)[3] X-Rspamd-Queue-Id: 4c3hZn4mLfz3R0x On Fri, Aug 15, 2025 at 06:14:39PM -0700, Rick Macklem wrote: > On Fri, Aug 15, 2025 at 5:58 PM Konstantin Belousov wrote: > > > > On Fri, Aug 15, 2025 at 05:41:26PM -0700, Rick Macklem wrote: > > > On Fri, Aug 15, 2025 at 5:35 PM Konstantin Belousov wrote: > > > > > > > > On Fri, Aug 15, 2025 at 05:26:21PM -0700, Rick Macklem wrote: > > > > > On Fri, Aug 15, 2025 at 5:07 PM Konstantin Belousov wrote: > > > > > > > > > > > > On Fri, Aug 15, 2025 at 04:51:00PM -0700, Bakul Shah wrote: > > > > > > > On Aug 15, 2025, at 3:51 PM, Konstantin Belousov wrote: > > > > > > > > > > > > > > > > On Fri, Aug 15, 2025 at 11:19:55AM -0700, Bakul Shah wrote: > > > > > > > >> Is this a known bug or may be something specific on my machine? > > > > > > > >> If the latter, any way to "fsck" it? FYI, the zpool is a mirror > > > > > > > >> (two files on the host via nvme). built from c992ac621327 commit hash > > > > > > > >> (which has other issues but they seem to be separate from this). > > > > > > > >> I saw the same panic when I booted from a day old snapshot. > > > > > > > >> > > > > > > > >> Note that "ls /.zfs" panics but "ls /.zfs/snapshot" doesn't! > > > > > > > >> > > > > > > > >> This is on a -current VM: > > > > > > > >> > > > > > > > >> root@:/ # ls .zfs > > > > > > > >> VNASSERT failed: oresid == 0 || nresid != oresid || *(a)->a_eofflag == 1 not true at vnode_if.c:1824 (VOP_READDIR_APV) > > > > > > > > > > > > > > > > Try this, untested. > > > > > > > > > > > > > > Thanks for the quick patch! But I am afraid it didn't help. Let me know if you > > > > > > > want me to check things via gdb. [I have filed > > > > > > > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=288889 > > > > > > > so we can continue debugging there] > > > > > > > > > > > > > > On the console (single user, RO root): > > > > > > > # ls /.zfs > > > > > > > VNASSERT failed: oresid == 0 || nresid != oresid || *(a)->a_eofflag == 1 not true at vnode_if.c:1824 (VOP_READDIR_APV) > > > > > > > 0xfffff800059546e0: type VDIR state VSTATE_CONSTRUCTED op 0xffffffff8272cfd0 > > > > > > > usecount 1, writecount 0, refcount 1 seqc users 0 mountedhere 0 > > > > > > > hold count flags () > > > > > > > flags () > > > > > > > lock type zfs: SHARED (count 1) > > > > > > > name = .zfs > > > > > > > parent_id = 0 > > > > > > > id = 1 > > > > > > > panic: VOP_READDIR: eofflag not set > > > > > > > cpuid = 0 > > > > > > > time = 1755276357 > > > > > > > KDB: stack backtrace: > > > > > > > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0053f83af0 > > > > > > > vpanic() at vpanic+0x136/frame 0xfffffe0053f83c20 > > > > > > > panic() at panic+0x43/frame 0xfffffe0053f83c80 > > > > > > > VOP_READDIR_APV() at VOP_READDIR_APV+0x205/frame 0xfffffe0053f83cd0 > > > > > > > kern_getdirentries() at kern_getdirentries+0x228/frame 0xfffffe0053f83dd0 > > > > > > > sys_getdirentries() at sys_getdirentries+0x29/frame 0xfffffe0053f83e00 > > > > > > > amd64_syscall() at amd64_syscall+0x169/frame 0xfffffe0053f83f30 > > > > > > > fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0053f83f30 > > > > > > > --- syscall (554, FreeBSD ELF64, getdirentries), rip = 0x331339f976aa, rsp = 0x33133631ade8, rbp = 0x33133631ae20 --- > > > > > > > KDB: enter: panic > > > > > > > [ thread pid 23 tid 100211 ] > > > > > > > Stopped at kdb_enter+0x33: movq $0,0x12313e2(%rip) > > > > > > > db> > > > > > > > > > > > > > > Running gdb on the host (attached to tcp port): > > > > > > > #16 0xffffffff80b7992b in vpanic ( > > > > > > > fmt=0xffffffff812ddf30 "VOP_READDIR: eofflag not set", > > > > > > > ap=ap@entry=0xfffffe0053f83c60) > > > > > > > at /home/FreeBSD/current/sys/kern/kern_shutdown.c:962 > > > > > > > #17 0xffffffff80b79793 in panic ( > > > > > > > fmt=0xffffffff81d9eab0 "\304\372\032\201\377\377\377\377") > > > > > > > at /home/FreeBSD/current/sys/kern/kern_shutdown.c:887 > > > > > > > #18 0xffffffff81195fd5 in VOP_READDIR_APV (vop=, > > > > > > > a=a@entry=0xfffffe0053f83d30) at vnode_if.c:1824 > > > > > > > #19 0xffffffff80c95e58 in VOP_READDIR (vp=0xfffff800059546e0, > > > > > > > uio=0xfffffe0053f83d00, cred=, eofflag=0xfffffe0053f83d6c, > > > > > > > ncookies=0x0, cookies=0x0) at ./vnode_if.h:972 > > > > > > From this frame, do > > > > > > p *vp > > > > > > and > > > > > > p *(vp->v_op) > > > > > > I am mostly interested what is the .vop_readdir fp points to. > > > > > I think the problem is that, for this case, ZFS replies with eofflag > > > > > == -1 instead > > > > > of 1. (I don't know if you want to change the ASSERT or try to fix ZFS > > > > > to not do this?) > > > > > > > > Where do you see it? I mean the '-1' set to *eofp. > > > I saw it in a printf() after VOP_READDIR(). However, a subsequent > > > test showed 0. > > > --> The first time I was printing out for non-ZFS, so there was a fair amount > > > of other printf()s being logged. Then I limited it to ZFS. > > > > > > Anyhow, ZFS seems to get eof wrong when it is already at eof. > > > > > > I'll take a look at the ZFS code, rick > > > > > > > Below is what I gathered so far. > > > > diff --git a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_ctldir.c b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_ctldir.c > > index 61d0bb26d1e5..c191b2abf6cc 100644 > > --- a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_ctldir.c > > +++ b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_ctldir.c > > @@ -674,6 +674,7 @@ zfsctl_root_readdir(struct vop_readdir_args *ap) > > zfs_uio_t uio; > > int *eofp = ap->a_eofflag; > > off_t dots_offset; > > + ssize_t orig_resid; > > int error; > > > > zfs_uio_init(&uio, ap->a_uio); > > @@ -688,14 +689,20 @@ zfsctl_root_readdir(struct vop_readdir_args *ap) > > * count to return is 0. > > */ > > if (zfs_uio_offset(&uio) == 3 * sizeof (entry)) { > > + if (eofp != NULL) > > + *eofp = 1; > Yea, I was just going to try the same thing. I think it probably fixes > the assert. > (Of course, it might take a while to get this in ZFS, so you might have to drop > the assert until it downstreams from OpenZFS?) > > > return (0); > > } > > > > + orig_resid = zfs_uio_resid(&uio); > > error = sfs_readdir_common(zfsvfs->z_root, ZFSCTL_INO_ROOT, ap, &uio, > > &dots_offset); > > if (error != 0) { > > - if (error == ENAMETOOLONG) /* ran out of destination space */ > > + if (error == ENAMETOOLONG) { /* ran out of destination space */ > > error = 0; > > + if (orig_resid == zfs_uio_resid(&uio) && eofp != NULL) > > + *eofp = 1; > If it hasn't filled in an entry, shouldn't it be an error return and not > setting "error = 0;"? This is how this code is structured: if there is no space, consider it the end of the run. UFS does the same, but it is careful to only reset error to zero if there were some entries already copied out. > > > + } > > return (error); > > } > > if (zfs_uio_offset(&uio) != dots_offset) > > @@ -710,8 +717,11 @@ zfsctl_root_readdir(struct vop_readdir_args *ap) > > entry.d_reclen = sizeof (entry); > > error = vfs_read_dirent(ap, &entry, zfs_uio_offset(&uio)); > > if (error != 0) { > > - if (error == ENAMETOOLONG) > > + if (error == ENAMETOOLONG) { > > error = 0; > > + if (orig_resid == zfs_uio_resid(&uio) && eofp != NULL) > > + *eofp = 1; > Ditto above. > > + } > > return (SET_ERROR(error)); > > } > > if (eofp != NULL) > > @@ -1056,17 +1066,22 @@ zfsctl_snapdir_readdir(struct vop_readdir_args *ap) > > zfs_uio_t uio; > > int *eofp = ap->a_eofflag; > > off_t dots_offset; > > + ssize_t orig_resid; > > int error; > > > > zfs_uio_init(&uio, ap->a_uio); > > + orig_resid = zfs_uio_resid(&uio); > > > > ASSERT3S(vp->v_type, ==, VDIR); > > > > error = sfs_readdir_common(ZFSCTL_INO_ROOT, ZFSCTL_INO_SNAPDIR, ap, > > &uio, &dots_offset); > > if (error != 0) { > > - if (error == ENAMETOOLONG) /* ran out of destination space */ > > + if (error == ENAMETOOLONG) { /* ran out of destination space */ > > error = 0; > > + if (orig_resid == zfs_uio_resid(&uio) && eofp != NULL) > > + *eofp = 1; > > + } > Ditto again. > > return (error); > > } > > > > @@ -1084,7 +1099,8 @@ zfsctl_snapdir_readdir(struct vop_readdir_args *ap) > > dsl_pool_config_exit(dmu_objset_pool(zfsvfs->z_os), FTAG); > > if (error != 0) { > > if (error == ENOENT) { > > - if (eofp != NULL) > > + if (orig_resid == zfs_uio_resid(&uio) && > > + eofp != NULL) > > *eofp = 1; > > error = 0; > > } > > @@ -1099,8 +1115,12 @@ zfsctl_snapdir_readdir(struct vop_readdir_args *ap) > > entry.d_reclen = sizeof (entry); > > error = vfs_read_dirent(ap, &entry, zfs_uio_offset(&uio)); > > if (error != 0) { > > - if (error == ENAMETOOLONG) > > + if (error == ENAMETOOLONG) { > > error = 0; > > + if (orig_resid == zfs_uio_resid(&uio) && > > + eofp != NULL) > > + *eofp = 1; > Ditto again. > > + } > > zfs_exit(zfsvfs, FTAG); > > return (SET_ERROR(error)); > > } > > diff --git a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c > > index 1813c411b013..79e808a3ab3d 100644 > > --- a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c > > +++ b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c > > @@ -1695,6 +1695,7 @@ zfs_readdir(vnode_t *vp, zfs_uio_t *uio, cred_t *cr, int *eofp, > > objset_t *os; > > caddr_t outbuf; > > size_t bufsize; > > + ssize_t orig_resid; > > zap_cursor_t zc; > > zap_attribute_t *zap; > > uint_t bytes_wanted; > > @@ -1735,7 +1736,7 @@ zfs_readdir(vnode_t *vp, zfs_uio_t *uio, cred_t *cr, int *eofp, > > /* > > * Quit if directory has been removed (posix) > > */ > > - if ((*eofp = zp->z_unlinked) != 0) { > > + if ((*eofp = (zp->z_unlinked != 0)) != 0) { > > zfs_exit(zfsvfs, FTAG); > > return (0); > > } > > @@ -1743,6 +1744,7 @@ zfs_readdir(vnode_t *vp, zfs_uio_t *uio, cred_t *cr, int *eofp, > > error = 0; > > os = zfsvfs->z_os; > > offset = zfs_uio_offset(uio); > > + orig_resid = zfs_uio_resid(uio); > > prefetch = zp->z_zn_prefetch; > > zap = zap_attribute_long_alloc(); > > > > @@ -1933,6 +1935,8 @@ zfs_readdir(vnode_t *vp, zfs_uio_t *uio, cred_t *cr, int *eofp, > > *cookies = NULL; > > *ncookies = 0; > > } > > + if (error == 0 && zfs_uio_resid(uio) != orig_resid) > > + *eofp = 1; > I looked at this earlier and I don't this is needed. When it breaks out > of the loop, it has set eof if it got ENOENT from zap_cursor_retrieve(), > which I think is the only case that will return 0 and not an error? > (But it looks harmless to me.) Ok, I reverted this chunk, but also changed the case of no dirents to be same as UFS. The patches are moved to https://reviews.freebsd.org/D51930 > > > return (error); > > } > > > >