From nobody Sun Sep 24 21:46:49 2023
Date: Sun, 24 Sep 2023 21:46:49 GMT
Message-Id: <202309242146.38OLknQS030464@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org
From: Mateusz Guzik
Subject: git: aeb0da3771a5 - releng/14.0 - vfs: drop one vnode list lock trip during vnlru free recycle
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
Sender: owner-dev-commits-src-all@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: mjg
X-Git-Repository: src
X-Git-Refname: refs/heads/releng/14.0
X-Git-Reftype: branch
X-Git-Commit: aeb0da3771a504de37c2311ea838b0edc9fdee98
Auto-Submitted: auto-generated

The branch releng/14.0 has been updated by mjg:

URL: https://cgit.FreeBSD.org/src/commit/?id=aeb0da3771a504de37c2311ea838b0edc9fdee98

commit aeb0da3771a504de37c2311ea838b0edc9fdee98
Author:     Mateusz Guzik
AuthorDate: 2023-09-14 14:35:40 +0000
Commit:     Mateusz Guzik
CommitDate: 2023-09-24 21:45:35 +0000

    vfs: drop one vnode list lock trip during vnlru free recycle

    vnlru_free_impl would take the lock prior to returning even though the
    most frequent caller does not need it.

    Unsurprisingly, vnode_list_mtx is the primary bottleneck when recycling,
    and avoiding the useless lock trip helps.

    Setting maxvnodes to 400000 and running 20 parallel finds, each with a
    dedicated directory tree, of 1 million vnodes in total:
    before: 4.50s user 1225.71s system 1979% cpu 1:02.14 total
    after:  4.20s user 806.23s system 1973% cpu 41.059 total

    That's a 34% reduction in total real time.

    With this the lock *remains* the primary bottleneck when running on ZFS.

    Approved by:    re (gjb)

    (cherry picked from commit 74be676d87745eb727642f6f8329236c848929d5)
    (cherry picked from commit 206dd9d1a82df140a6071545a2dc558e8d9f5ad0)
---
 sys/kern/vfs_subr.c | 43 +++++++++++++++++++++++++++++++++++-------
 1 file changed, 35 insertions(+), 8 deletions(-)

diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
index 1c6827ba0587..80ec15f78028 100644
--- a/sys/kern/vfs_subr.c
+++ b/sys/kern/vfs_subr.c
@@ -1290,13 +1290,14 @@ vnlru_free_impl(int count, struct vfsops *mnt_op, struct vnode *mvp)
 	mtx_assert(&vnode_list_mtx, MA_OWNED);
 	if (count > max_vnlru_free)
 		count = max_vnlru_free;
+	if (count == 0) {
+		mtx_unlock(&vnode_list_mtx);
+		return (0);
+	}
 	ocount = count;
 	retried = false;
 	vp = mvp;
 	for (;;) {
-		if (count == 0) {
-			break;
-		}
 		vp = TAILQ_NEXT(vp, v_vnodelist);
 		if (__predict_false(vp == NULL)) {
 			/*
@@ -1319,6 +1320,7 @@ vnlru_free_impl(int count, struct vfsops *mnt_op, struct vnode *mvp)
 			 */
 			TAILQ_REMOVE(&vnode_list, mvp, v_vnodelist);
 			TAILQ_INSERT_TAIL(&vnode_list, mvp, v_vnodelist);
+			mtx_unlock(&vnode_list_mtx);
 			break;
 		}
 		if (__predict_false(vp->v_type == VMARKER))
@@ -1366,18 +1368,28 @@ vnlru_free_impl(int count, struct vfsops *mnt_op, struct vnode *mvp)
 		 */
 		vtryrecycle(vp);
 		count--;
+		if (count == 0) {
+			break;
+		}
 		mtx_lock(&vnode_list_mtx);
 		vp = mvp;
 	}
+	mtx_assert(&vnode_list_mtx, MA_NOTOWNED);
 	return (ocount - count);
 }
 
+/*
+ * XXX: returns without vnode_list_mtx locked!
+ */
 static int
 vnlru_free_locked(int count)
 {
+	int ret;
 
 	mtx_assert(&vnode_list_mtx, MA_OWNED);
-	return (vnlru_free_impl(count, NULL, vnode_list_free_marker));
+	ret = vnlru_free_impl(count, NULL, vnode_list_free_marker);
+	mtx_assert(&vnode_list_mtx, MA_NOTOWNED);
+	return (ret);
 }
 
 void
@@ -1389,7 +1401,7 @@ vnlru_free_vfsops(int count, struct vfsops *mnt_op, struct vnode *mvp)
 	VNPASS(mvp->v_type == VMARKER, mvp);
 	mtx_lock(&vnode_list_mtx);
 	vnlru_free_impl(count, mnt_op, mvp);
-	mtx_unlock(&vnode_list_mtx);
+	mtx_assert(&vnode_list_mtx, MA_NOTOWNED);
 }
 
 struct vnode *
@@ -1534,7 +1546,7 @@ vnlru_under_unlocked(u_long rnumvnodes, u_long limit)
 }
 
 static void
-vnlru_kick(void)
+vnlru_kick_locked(void)
 {
 
 	mtx_assert(&vnode_list_mtx, MA_OWNED);
@@ -1544,6 +1556,15 @@ vnlru_kick(void)
 	}
 }
 
+static void
+vnlru_kick(void)
+{
+
+	mtx_lock(&vnode_list_mtx);
+	vnlru_kick_locked();
+	mtx_unlock(&vnode_list_mtx);
+}
+
 static void
 vnlru_proc(void)
 {
@@ -1574,6 +1595,7 @@ vnlru_proc(void)
 		 */
 		if (rnumvnodes > desiredvnodes) {
 			vnlru_free_locked(rnumvnodes - desiredvnodes);
+			mtx_lock(&vnode_list_mtx);
 			rnumvnodes = atomic_load_long(&numvnodes);
 		}
 		/*
@@ -1751,6 +1773,7 @@ vn_alloc_hard(struct mount *mp)
 	rnumvnodes = atomic_load_long(&numvnodes);
 	if (rnumvnodes + 1 < desiredvnodes) {
 		vn_alloc_cyclecount = 0;
+		mtx_unlock(&vnode_list_mtx);
 		goto alloc;
 	}
 	rfreevnodes = vnlru_read_freevnodes();
@@ -1770,22 +1793,26 @@ vn_alloc_hard(struct mount *mp)
 	 */
 	if (vnlru_free_locked(1) > 0)
 		goto alloc;
+	mtx_assert(&vnode_list_mtx, MA_NOTOWNED);
 	if (mp == NULL || (mp->mnt_kern_flag & MNTK_SUSPEND) == 0) {
 		/*
 		 * Wait for space for a new vnode.
 		 */
-		vnlru_kick();
+		mtx_lock(&vnode_list_mtx);
+		vnlru_kick_locked();
 		vn_alloc_sleeps++;
 		msleep(&vnlruproc_sig, &vnode_list_mtx, PVFS, "vlruwk", hz);
 		if (atomic_load_long(&numvnodes) + 1 > desiredvnodes &&
 		    vnlru_read_freevnodes() > 1)
 			vnlru_free_locked(1);
+		else
+			mtx_unlock(&vnode_list_mtx);
 	}
 alloc:
+	mtx_assert(&vnode_list_mtx, MA_NOTOWNED);
 	rnumvnodes = atomic_fetchadd_long(&numvnodes, 1) + 1;
 	if (vnlru_under(rnumvnodes, vlowat))
 		vnlru_kick();
-	mtx_unlock(&vnode_list_mtx);
 	return (uma_zalloc_smr(vnode_zone, M_WAITOK));
 }
 
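
The contract change in the diff above (the free routine is entered with vnode_list_mtx held and now returns with it released) can be illustrated in userspace. What follows is only a sketch, assuming POSIX threads. The names list_mtx, free_items, lock_trips, list_lock and free_some are hypothetical stand-ins, not kernel code, and the recycle step is reduced to decrementing a counter.

```c
#include <pthread.h>

/* Hypothetical stand-ins for vnode_list_mtx and the pool of free vnodes. */
static pthread_mutex_t list_mtx = PTHREAD_MUTEX_INITIALIZER;
static int free_items = 5;
static int lock_trips;		/* counts every acquisition of list_mtx */

static void
list_lock(void)
{
	pthread_mutex_lock(&list_mtx);
	lock_trips++;
}

/*
 * Must be called with list_mtx held; always returns with it released.
 * Returns the number of items freed.
 */
static int
free_some(int count)
{
	int ocount;

	if (count == 0) {
		pthread_mutex_unlock(&list_mtx);
		return (0);
	}
	ocount = count;
	for (;;) {
		if (free_items == 0) {
			/* Nothing left to free: drop the lock and stop. */
			pthread_mutex_unlock(&list_mtx);
			break;
		}
		free_items--;
		/* The (simulated) recycle work runs without the list lock. */
		pthread_mutex_unlock(&list_mtx);
		count--;
		if (count == 0)
			break;	/* done: skip the relock nobody needs */
		list_lock();
	}
	return (ocount - count);
}
```

In this sketch, freeing two items costs two lock acquisitions: the caller's, plus one relock between items. Relocking before returning, as the old code did, would have cost a third. A caller that still needs the lock afterwards has to reacquire it explicitly, which is what the added mtx_lock in vnlru_proc does.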