From: Jan Martin Mikkelsen <janm@transactionware.com>
To: Konstantin Belousov <kib@freebsd.org>
Cc: current@freebsd.org
Subject: Re: Panic: cache_vop_rename: lingering negative entry
Date: Tue, 14 Apr 2026 21:42:05 +0200
Message-Id: <019D4A17-B8AF-4768-ADC4-9BDF80BF566B@transactionware.com>
In-Reply-To: <8E5A88B4-1EDC-415B-BF35-45AED0B18042@transactionware.com>
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

> On 14. Apr 2026, at 17:57, Jan Martin Mikkelsen <janm@transactionware.com> wrote:
>
>
>> On 14 Apr 2026, at 11:52, Konstantin Belousov <kib@freebsd.org> wrote:
>>
>> On Tue, Apr 14, 2026 at 11:45:08AM +0200, Jan Martin Mikkelsen wrote:
>>>
>>>> On 13 Apr 2026, at 22:13, Konstantin Belousov <kib@freebsd.org> wrote:
>>>>
>>>> On Mon, Apr 13, 2026 at 07:12:32PM +0200, Jan Martin Mikkelsen wrote:
>>>>>
>>>>>> On 7 Apr 2026, at 20:20, Jan Martin Mikkelsen <janm@transactionware.com> wrote:
>>>>>>
>>>>>> On 7 Apr 2026, at 18:53, Konstantin Belousov <kib@freebsd.org> wrote:
>>>>>>>
>>>>>>> On Tue, Apr 07, 2026 at 05:02:05PM +0200, Jan Martin Mikkelsen wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am consistently getting the panic below while building lang/perl5.42. This is the command from the perl build that triggers the panic:
>>>>>>>>
>>>>>>>> /usr/bin/strip /ports-work/usr/ports/lang/perl5.42/work/stage/usr/local/bin/perl5.42.0
>>>>>>>>
>>>>>>>> CURRENT on aarch64, with a kernel from last week, also with a later one from the weekend. A kernel from mid-January worked fine.
>>>>>>>>
>>>>>>>> I can reproduce on demand, no parallelism in the build required.
>>>>>>>>
>>>>>>>> Does this look familiar to anyone?
>>>>>>>>
>>>>>>>> panic: cache_vop_rename: lingering negative entry
>>>>>>>> cpuid = 4
>>>>>>>> time = 1775410763
>>>>>>>> KDB: stack backtrace:
>>>>>>>> db_trace_self() at db_trace_self
>>>>>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x38
>>>>>>>> vpanic() at vpanic+0x1a0
>>>>>>>> panic() at panic+0x48
>>>>>>>> cache_vop_rename() at cache_vop_rename+0xb0
>>>>>>>> zfs_do_rename() at zfs_do_rename+0xafc
>>>>>>>> zfs_freebsd_rename() at zfs_freebsd_rename+0x5c
>>>>>>>> VOP_RENAME_APV() at VOP_RENAME_APV+0x44
>>>>>>>> kern_renameat() at kern_renameat+0x574
>>>>>>>> do_el0_sync() at do_el0_sync+0x5f8
>>>>>>>> handle_el0_sync() at handle_el0_sync+0x4c
>>>>>>>> --- exception, esr 0x56000000
>>>>>>>> KDB: enter: panic
>>>>>>>> [ thread pid 81230 tid 101738 ]
>>>>>>>> Stopped at kdb_enter+0x48: str xzr, [x19, #3072]
>>>>>>>
>>>>>>> Is it reproducible on UFS and/or tmpfs?
>>>>>>
>>>>>> Successful completion (no panic) when the work directory is on UFS, and when the work directory is on tmpfs. I didn’t try multiple times, but it never works on ZFS.
>>>>>
>>>>> The panic consistently reproduces on a ZFS filesystem with the properties "utf8only=on" and "normalization=formD".
>>>>>
>>>>> A ZFS file system with "utf8only=off" and "normalization=none" works fine.
>>>>>
>>>>> As far as I can see, strip makes a simple rename(2) call, and testing rename(2) works fine (as expected). Running the same strip command on the same files on a fresh system works fine.
>>>>>
>>>>> The smallest reproducer I have at the moment is building lang/perl5.42.0 with a workdir on a ZFS filesystem enforcing UTF8.
>>>>
>>>> I am now sure that the reason is that the options you used cause the same
>>>> inode to have more than one name (but not hardlinks). I remember that
>>>> zfs had an option to be case-insensitive, but I may mis-remember.
>>>>
>>>> The solution, in any case, is to either stop using namecache when these
>>>> options are activated, or at least purge all cached entries that have the
>>>> given dst when the dst vnode is renamed or deleted.
>>>>
>>>> Somebody who knows zfs would be needed to make the change.
>>>
>>> I had a look at the ZFS source, and found this:
>>>
>>> 	/*
>>> 	 * Only use the name cache if we are looking for a
>>> 	 * name on a file system that does not require normalization
>>> 	 * or case folding. We can also look there if we happen to be
>>> 	 * on a non-normalizing, mixed sensitivity file system IF we
>>> 	 * are looking for the exact name (which is always the case on
>>> 	 * FreeBSD).
>>> 	 */
>>> 	zfsvfs->z_use_namecache = !zfsvfs->z_norm ||
>>> 	    ((zfsvfs->z_case == ZFS_CASE_MIXED) &&
>>> 	    !(zfsvfs->z_norm & ~U8_TEXTPREP_TOUPPER));
>>>
>>>
>>> The call to cache_vop_rename() which causes the panic is not protected by an "if (zfsvfs->z_use_namecache)", unlike the rest of the code, which uses that flag to decide whether or not to use the namecache.
>>>
>>> Elsewhere in zfs_vnops_os.c, there is another call to a cache_vop* function, which is protected by a test:
>>>
>>> 	if (zfsvfs->z_use_namecache)
>>> 		cache_vop_rmdir(dvp, vp);
>>>
>>> It seems to me that this patch could resolve the problem. Does this seem reasonable?
>>>
>>> --- a/src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c	2026-03-28 20:55:06.000000000 +1100
>>> +++ b/src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c	2026-03-28 20:55:06.000000000 +1100
>>> @@ -3524,7 +3524,7 @@
>>>  				    ZRENAMING, NULL));
>>>  			}
>>>  		}
>>> -		if (error == 0) {
>>> +		if (error == 0 && zfsvfs->z_use_namecache) {
>>>  			cache_vop_rename(sdvp, *svpp, tdvp, *tvpp, scnp, tcnp);
>>>  		}
>>>  	}
>>>
>>
>> Yes, but please test.
>> If it works for you, please either create a GitHub PR or a review on the
>> FreeBSD phab.
>
> That does seem to fix the panic. I’ll do a GitHub PR. Thanks for your help.

https://github.com/openzfs/zfs/pull/18430
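
For illustration, the namecache decision quoted in the thread and the guard added by the proposed patch can be modelled in user space. This is a minimal sketch, not the OpenZFS code: every name prefixed with model_ or MODEL_ is hypothetical, the bit values are placeholders rather than the real U8_TEXTPREP_* definitions, and it assumes that normalization=formD results in a nonzero normalization mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for illustration only. The real constants and
 * zfsvfs_t live in the OpenZFS headers and are not used here.
 */
enum model_case {
	MODEL_CASE_SENSITIVE,
	MODEL_CASE_INSENSITIVE,
	MODEL_CASE_MIXED
};
#define	MODEL_TEXTPREP_TOUPPER	0x01u	/* assumed case-folding bit */
#define	MODEL_TEXTPREP_NFD	0x10u	/* assumed bit for normalization=formD */

struct model_fs {
	uint32_t norm;		/* models zfsvfs->z_norm */
	enum model_case kase;	/* models zfsvfs->z_case */
};

/* Mirrors the z_use_namecache expression quoted in the thread. */
static bool
model_use_namecache(const struct model_fs *fs)
{
	return (fs->norm == 0 ||
	    (fs->kase == MODEL_CASE_MIXED &&
	    (fs->norm & ~MODEL_TEXTPREP_TOUPPER) == 0));
}

/* Mirrors the patched condition that guards cache_vop_rename(). */
static bool
model_rename_touches_namecache(const struct model_fs *fs, int error)
{
	return (error == 0 && model_use_namecache(fs));
}

/* A few cases corresponding to the configurations discussed above. */
static void
model_check_examples(void)
{
	struct model_fs plain = { 0, MODEL_CASE_SENSITIVE };
	struct model_fs formd = { MODEL_TEXTPREP_NFD, MODEL_CASE_SENSITIVE };
	struct model_fs mixed = { MODEL_TEXTPREP_TOUPPER, MODEL_CASE_MIXED };

	assert(model_rename_touches_namecache(&plain, 0));
	assert(!model_rename_touches_namecache(&formd, 0));
	assert(model_rename_touches_namecache(&mixed, 0));
	assert(!model_rename_touches_namecache(&plain, 1));
}
```

Calling model_check_examples() from a main() exercises the cases. Under the stated assumption, the normalizing configuration never reaches the namecache update on rename, which is the behaviour the patch aims for.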