Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 15 May 2025 09:34:08 GMT
From:      Olivier Certner <olce@FreeBSD.org>
To:        src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
Subject:   git: d90c9c24e2c3 - main - vfs: vn_alloc(): Stop always calling vn_alloc_hard() and direct reclaiming
Message-ID:  <202505150934.54F9Y8hn014253@gitrepo.freebsd.org>

next in thread | raw e-mail | index | archive | help
The branch main has been updated by olce:

URL: https://cgit.FreeBSD.org/src/commit/?id=d90c9c24e2c3701949c47061b5ad198eedeebfb9

commit d90c9c24e2c3701949c47061b5ad198eedeebfb9
Author:     Olivier Certner <olce@FreeBSD.org>
AuthorDate: 2025-05-13 13:47:42 +0000
Commit:     Olivier Certner <olce@FreeBSD.org>
CommitDate: 2025-05-15 09:33:33 +0000

    vfs: vn_alloc(): Stop always calling vn_alloc_hard() and direct reclaiming
    
    Commit ab05a1cf321aca0f intended to revert commit 8733bc277a383cf5
    ("vfs: don't provoke recycling non-free vnodes without a good reason"),
    but due to intervening changes in commit 054f45e026d898bd ("vfs: further
    speed up continuous free vnode recycle"), it also had to revert part of
    it.  In particular, while removing the whole 'if (vn_alloc_cyclecount !=
    0)' block, it inadvertantly removed the code block resetting
    'vn_alloc_cyclecount' to 0 and skipping direct vnode reclamation (done
    further below through vnlru_free_locked_direct()), which had been
    outside the 'if' before the intervening commit.
    
    Removing this block instead of reinstating it in practice causes
    'vn_alloc_cyclecount' to (almost) never be 0, making vn_alloc() always
    call vn_alloc_hard(), which takes the 'vnode_list_mtx' mutex.  In other
    words, this disables the fast path.  [The reverted commit, which
    introduced the 'if (vn_alloc_cyclecount != 0)' guarding this block,
    actually never executed it because it also had the bug that
    'vn_alloc_cyclecount' would always stay at 0, hiding its usefulness.]
    
    Additionally, not skipping direct vnode reclamation even when there are
    less vnodes than 'kern.maxvnodes' not only causes unnecessary contention
    but also plain livelocks as vnlru_free_locked_direct() does not itself
    check whether there are actually "free" (not referenced) vnodes to be
    deallocated, and will blindly browse all the vnode list until it finds
    one (which it may not, or only a few ones at the end).  As the fast path
    was disabled, all threads in the system would soon be competing for the
    vnode list lock, outpacing the vnlru process that could never actually
    recycle vnodes in a more agressive manner (i.e., even if they have
    a non-zero hold count).  And we could more easily get into this
    situation, as each vnode allocation was reducing the count of "free"
    vnodes, even if entirely new vnodes could be allocated instead.  This
    part was mitigated by the vnlru process (before the tipping point
    described above), which explains why the mechanism would not always
    livelock.
    
    Not skipping direct vnode reclamation was arguably a bug introduced by
    the intervening commit (054f45e026d898bd), but was mitigated by
    vn_alloc_hard() not being called in the fast path.  The revert commit,
    by disabling the fast path, made it significantly annoying (to the point
    of getting a few livelocks a week in some of my workloads).
    
    Restore the reset of 'vn_alloc_cyclecount' to 0 and skip direct
    reclamation when the current number of vnodes is below the
    'kern.maxvnodes' limit, indicating we can start allocating a new 'struct
    vnode' right away.  While here, fix the comparison with the limit when
    'bumped' is true.
    
    Reviewed by:    kib (older version), avg
    Tested by:      avg, olce
    Fixes:          ab05a1cf321aca0f (revert of "vfs: don't provoke recycling non-free vnodes without a good reason")
    Fixes:          054f45e026d898bd ("vfs: further speed up continuous free vnode recycle")
    MFC after:      5 days
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D50338
---
 sys/kern/vfs_subr.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
index fb624f751402..5fdd4b6c2474 100644
--- a/sys/kern/vfs_subr.c
+++ b/sys/kern/vfs_subr.c
@@ -1996,11 +1996,24 @@ vn_alloc_hard(struct mount *mp, u_long rnumvnodes, bool bumped)
 
 	mtx_lock(&vnode_list_mtx);
 
+	/*
+	 * Reload 'numvnodes', as since we acquired the lock, it may have
+	 * changed significantly if we waited, and 'rnumvnodes' above was only
+	 * actually passed if 'bumped' is true (else it is 0).
+	 */
+	rnumvnodes = atomic_load_long(&numvnodes);
+	if (rnumvnodes + !bumped < desiredvnodes) {
+		vn_alloc_cyclecount = 0;
+		mtx_unlock(&vnode_list_mtx);
+		goto alloc;
+	}
+
 	rfreevnodes = vnlru_read_freevnodes();
 	if (vn_alloc_cyclecount++ >= rfreevnodes) {
 		vn_alloc_cyclecount = 0;
 		vstir = true;
 	}
+
 	/*
 	 * Grow the vnode cache if it will not be above its target max after
 	 * growing.  Otherwise, if there is at least one free vnode, try to



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?202505150934.54F9Y8hn014253>