Date: Wed, 26 Mar 2025 09:46:21 +0200
From: Andriy Gapon <avg@FreeBSD.org>
To: emulation@FreeBSD.org, stable@FreeBSD.org
Subject: Re: panic: vrefact: wrong use count 0, linux emulation related
Message-ID: <d3b0a784-dc4b-4b02-a158-ca70d7b3ce96@FreeBSD.org>
In-Reply-To: <41288c50-3213-4d81-913c-d8897214a9e7@FreeBSD.org>
References: <41288c50-3213-4d81-913c-d8897214a9e7@FreeBSD.org>

Turns out that it's a known and already fixed issue.
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=274538

Not sure why I didn't see it before and only started seeing it now.
I'll update my stable/14 to get the fix.

On 24/03/2025 5:35 pm, Andriy Gapon wrote:
>
> Introduction.
>
> The affected system is stable/14, amd64.
> The kernel is custom, it's configured with INVARIANTS.
>
> The problem started to happen rather reliably after a recent upgrade of
> packages. I suspect that the trigger is in linux-nvidia-libs-570.124.04, but
> the bug is in FreeBSD Linux emulation.
>
> The reason for my suspicion is that the crash happens when starting a graphical
> Linux application in a Linux jail. And the crash involves a graphics-related
> character device.
>
> Just in case, the jail itself, including the application, hasn't been changed.
> Also, I haven't touched the base system recently.
>
> Details.
>
> VNASSERT failed: old > 0 not true at sys/kern/vfs_subr.c:3361 (vrefact)
> 0xfffff802945df380: type VCHR state VSTATE_CONSTRUCTED op 0xffffffff8127b648
> usecount 1, writecount 0, refcount 39 seqc users 0 rdev 0xfffff8004565f400
> hold count flags ()
> flags ()
> lock type devfs: UNLOCKED
> dev drm/128
> panic: vrefact: wrong use count 0
> cpuid = 1
> time = 1742796535
> KDB: stack backtrace:
> db_trace_self_wrapper() at 0xffffffff8061eadb = db_trace_self_wrapper+0x2b/frame 0xfffffe02476a0780
> kdb_backtrace() at 0xffffffff80956a57 = kdb_backtrace+0x37/frame 0xfffffe02476a0830
> vpanic() at 0xffffffff80907629 = vpanic+0x169/frame 0xfffffe02476a0970
> panic() at 0xffffffff80907403 = panic+0x43/frame 0xfffffe02476a09d0
> vrefact() at 0xffffffff809f08e4 = vrefact+0xb4/frame 0xfffffe02476a09f0
> fgetvp_lookup() at 0xffffffff808ac718 = fgetvp_lookup+0x88/frame 0xfffffe02476a0a30
> namei_setup() at 0xffffffff809e07ba = namei_setup+0x15a/frame 0xfffffe02476a0a80
> namei_emptypath() at 0xffffffff809e0499 = namei_emptypath+0x49/frame 0xfffffe02476a0ae0
> namei() at 0xffffffff809e029f = namei+0x66f/frame 0xfffffe02476a0b40
> linux_kern_statat() at 0xffffffff8a09d24c = linux_kern_statat+0xfc/frame 0xfffffe02476a0c70
> linux_newfstatat() at 0xffffffff8a09cfed = linux_newfstatat+0x6d/frame 0xfffffe02476a0e00
> amd64_syscall() at 0xffffffff80c79f79 = amd64_syscall+0x189/frame 0xfffffe02476a0f30
> fast_syscall_common() at 0xffffffff80c4fb9b = fast_syscall_common+0xf8/frame 0xfffffe02476a0f30
> --- syscall (262, Linux ELF64, linux_newfstatat), rip = 0x813f13eee, rsp = 0x7fffffffbd28, rbp = 0 ---
>
> As far as I understand, there is a Linux fstatat system call with AT_EMPTY_PATH
> flag and the file descriptor of opened /dev/drm/128 device.
>
> Given that the crash happens in fgetvp_lookup -> vrefact, I think that it's
> unlikely that there is a problem in that call path.
> I believe that the problem is elsewhere in the Linux emulation code for working
> with character devices.
>
> I think that the panic means that the corresponding file descriptor was open but
> the associated vnode had usecount of zero.
>
> It looks like DTYPE_DEV (11) is used only in the linuxkpi code, e.g.,
> linux_dev_fdopen.
>
> Some info from kgdb.
>
> (kgdb) bt
> #0 __curthread () at sys/amd64/include/pcpu_aux.h:57
> #1 doadump (textdump=textdump@entry=1) at sys/kern/kern_shutdown.c:423
> #2 0xffffffff80907121 in kern_reboot (howto=260) at sys/kern/kern_shutdown.c:541
> #3 0xffffffff80907698 in vpanic (fmt=0xffffffff80e35cf8 "%s: wrong use count %d", ap=0xfffffe01adc909b0) at sys/kern/kern_shutdown.c:1021
> #4 0xffffffff80907403 in panic (fmt=<unavailable>) at sys/kern/kern_shutdown.c:945
> #5 0xffffffff809f08e4 in vrefact (vp=0xfffff8035b4bb700) at sys/kern/vfs_subr.c:3361
> #6 0xffffffff808ac718 in fgetvp_lookup (ndp=ndp@entry=0xfffffe01adc90b58, vpp=vpp@entry=0xfffffe01adc90ac8) at sys/kern/kern_descrip.c:3134
> #7 0xffffffff809e07ba in namei_setup (ndp=ndp@entry=0xfffffe01adc90b58, dpp=dpp@entry=0xfffffe01adc90ac8, pwdp=pwdp@entry=0xfffffe01adc90ac0) at sys/kern/vfs_lookup.c:383
> #8 0xffffffff809e0499 in namei_emptypath (ndp=ndp@entry=0xfffffe01adc90b58) at sys/kern/vfs_lookup.c:466
> #9 0xffffffff809e029f in namei (ndp=ndp@entry=0xfffffe01adc90b58) at sys/kern/vfs_lookup.c:687
> #10 0xffffffff8a09d24c in linux_kern_statat (td=0xfffff804d50d7000, flag=16384, fd=9, path=0x813fd846f <error: Cannot access memory at address 0x813fd846f>, pathseg=UIO_USERSPACE, sbp=sbp@entry=0xfffffe01adc90c80) at sys/compat/linux/linux_stats.c:103
> #11 0xffffffff8a09cfed in linux_newfstatat (td=<unavailable>, td@entry=<error reading variable: value is not available>, args=0xfffff804d50d7400, args@entry=<error reading variable: value is not available>) at sys/compat/linux/linux_stats.c:620
> #12 0xffffffff80c79f79 in syscallenter (td=0xfffff804d50d7000) at sys/amd64/amd64/../../kern/subr_syscall.c:191
> #13 amd64_syscall (td=0xfffff804d50d7000, traced=<optimized out>) at sys/amd64/amd64/trap.c:1206
>
> (kgdb) p *vp
> $1 = {v_type = VCHR, v_state = VSTATE_CONSTRUCTED, v_irflag = 0, v_seqc = 0,
> v_nchash = 1973399077, v_hash = 56314807, v_op = 0xffffffff8127b648
> <devfs_specops>, v_data = 0xfffff80055005200, v_mount = 0xfffffe0150b46100,
> v_nmntvnodes = {
> tqe_next = 0xfffff8038000da80, tqe_prev = 0xfffff8035b4bb8e8},
> {v_mountedhere = 0xfffff800452b9400, v_unpcb = 0xfffff800452b9400, v_rdev =
> 0xfffff800452b9400, v_fifoinfo = 0xfffff800452b9400}, v_hashlist = {le_next =
> 0x0, le_prev = 0x0},
> v_cache_src = {lh_first = 0x0}, v_cache_dst = {tqh_first = 0x0, tqh_last =
> 0xfffff8035b4bb758}, v_cache_dd = 0x0, v_lock = {lock_object = {lo_name =
> 0xffffffff80d1cf3c "devfs", lo_flags = 116588544, lo_data = 0, lo_witness = 0x0},
> lk_lock = 1, lk_exslpfail = 0, lk_pri = 64, lk_timo = 51}, v_interlock =
> {lock_object = {lo_name = 0xffffffff80db24c1 "vnode interlock", lo_flags =
> 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 0}, v_vnlock =
> 0xfffff8035b4bb770,
> v_vnodelist = {tqe_next = 0xfffff8035b4bbc40, tqe_prev = 0xfffff80369f48280},
> v_lazylist = {tqe_next = 0x0, tqe_prev = 0x0}, v_bufobj = {bo_lock =
> {lock_object = {lo_name = 0xffffffff80df4394 "bufobj interlock", lo_flags =
> 86179840,
> lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, bo_ops =
> 0xffffffff812b7190 <buf_ops_bio>, bo_object = 0x0, bo_synclist = {le_next = 0x0,
> le_prev = 0x0}, bo_private = 0xfffff8035b4bb700, bo_clean = {bv_hd = {tqh_first
> = 0x0,
> tqh_last = 0xfffff8035b4bb828}, bv_root = {pt_root = 0x1}, bv_cnt = 0},
> bo_dirty = {bv_hd = {tqh_first = 0x0, tqh_last = 0xfffff8035b4bb848}, bv_root =
> {pt_root = 0x1}, bv_cnt = 0}, bo_numoutput = 0, bo_flag = 0, bo_domain = 0,
> bo_bsize = 512}, v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0, v_rl =
> {rl_waiters = {tqh_first = 0x0, tqh_last = 0xfffff8035b4bb890}, rl_currdep =
> 0x0}, v_holdcnt = 32, v_usecount = 1, v_iflag = 0, v_vflag = 0, v_mflag = 0,
> v_dbatchcpu = -1, v_writecount = 0, v_seqc_users = 0}
>
> (kgdb) p *fp
> $3 = {f_flag = 3, f_count = 3, f_data = 0xfffff807120b5480, f_ops =
> 0xffffffff84b46390 <linuxfileops>, f_vnode = 0xfffff8035b4bb700, f_cred =
> 0xfffff8036f967d00, f_type = 11, f_vnread_flags = 0, {f_seqcount = {0, 0},
> f_pipegen = 0},
> f_nextoff = {0, 0}, f_vnun = {fvn_cdevpriv = 0x0, fvn_advice = 0x0}, f_offset
> = 0}
>
> (kgdb) p *ndp
> $5 = {ni_dirp = 0x813fd846f <error: Cannot access memory at address
> 0x813fd846f>, ni_segflg = UIO_USERSPACE, ni_rightsneeded = 0xffffffff812005f0
> <cap_fstat_rights>, ni_startdir = 0x0, ni_rootdir = 0xfffff8003a922c40,
> ni_topdir = 0xfffff8003a922c40, ni_dirfd = 9, ni_lcf = 0, ni_filecaps =
> {fc_rights = {cr_rights = {144123984168878079, 288230376153808895}}, fc_ioctls =
> 0x0, fc_nioctls = -1, fc_fcntls = 120}, ni_vp = 0x0, ni_dvp = 0xffffffffffffffff,
> ni_resflags = 4, ni_debugflags = 3, ni_loopcnt = 0, ni_pathlen = 1, ni_next =
> 0xffffffffffffffff <error: Cannot access memory at address 0xffffffffffffffff>,
> ni_cnd = {cn_flags = 262596, cn_cred = 0xfffff8068c56b200, cn_nameiop = LOOKUP,
> cn_lkflags = -1, cn_pnbuf = 0xfffff8002b11ec00 "", cn_nameptr =
> 0xfffff8002b11ec00 "", cn_namelen = -1}, ni_cap_tracker = {tqh_first = 0x0,
> tqh_last = 0xfffffe01adc90c08}, ni_dvp_seqc = 2915634432, ni_vp_seqc = 4294966785}
>
> I tried to look at linux_dev_fdopen() and other code in sys/compat/linuxkpi/
> common/src/linux_compat.c, but couldn't make much progress yet.
>
> I have the crash dump, so if there is anything else I can provide or look at...
>
> Thank you.

-- 
Andriy Gapon
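
For readers unfamiliar with the call pattern named in the quoted report (fstatat with AT_EMPTY_PATH on an already-open descriptor), here is a minimal userland sketch. It only illustrates the shape of the system call. It does not reproduce the panic, and it is not taken from the application or from the kernel sources. The helper names stat_by_fd and same_object are hypothetical, and the _GNU_SOURCE define is an assumption about building against glibc.

```c
#define _GNU_SOURCE		/* assumption: glibc needs this for AT_EMPTY_PATH */
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: stat the object behind an open descriptor by
 * passing an empty path together with AT_EMPTY_PATH.  With this flag the
 * lookup starts from (and ends at) the descriptor itself, which is why
 * the backtrace goes through namei_emptypath() and fgetvp_lookup().
 * Returns 0 on success, -1 on error with errno set.
 */
int
stat_by_fd(int fd, struct stat *st)
{
	return (fstatat(fd, "", st, AT_EMPTY_PATH));
}

/*
 * Hypothetical check: the empty-path form should describe the same
 * object as a plain fstat() on the same descriptor.
 * Returns 1 if they match, 0 if not, -1 on error.
 */
int
same_object(int fd)
{
	struct stat a, b;

	if (stat_by_fd(fd, &a) != 0 || fstat(fd, &b) != 0)
		return (-1);
	return (a.st_dev == b.st_dev && a.st_ino == b.st_ino);
}
```

In the report the descriptor (fd=9) referred to /dev/drm/128. Any open descriptor, for example one for /dev/null, exercises the same entry point on a system without the affected driver.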