From: bugzilla-noreply@freebsd.org
To: fs@FreeBSD.org
Subject: [Bug 275594] High CPU usage by arc_prune; analysis and fix
Date: Wed, 24 Jan 2024 10:47:34 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 14.0-RELEASE
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: seigo.tanimura@gmail.com
X-Bugzilla-Status: Open
X-Bugzilla-Assigned-To: fs@FreeBSD.org
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
List-Id: Filesystems
List-Archive: https://lists.freebsd.org/archives/freebsd-fs
Sender: owner-freebsd-fs@freebsd.org
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=275594

--- Comment #36 from Seigo Tanimura ---
(In reply to Thomas Mueller from comment #34)

I have backported the fix to stable/13 (13.3-PRERELEASE) and tested
poudriere-bulk(8). The fix has also been applied to the main and stable/14
branches without any changes.

Thomas, would you mind testing the backported fix to see if poudriere's build
time changes in any way?

* Sources on GitHub:

- Repo
  - https://github.com/altimeter-130ft/freebsd-freebsd-src

- Branches
  - main (Current)
    - Fix only
      - topic-openzfs-arc_prune-regulation-fix
    - Fix and counters
      - topic-openzfs-arc_prune-regulation-counters
    - No changes from the fix on 14.0.0-RELEASE-p2.

  - stable/14 (14-STABLE)
    - Fix only
      - stable/14-topic-openzfs-arc_prune-regulation-fix
    - Fix and counters
      - release/14.0.0/release-14_0_0-p2-topic-openzfs-arc_prune-regulation-counters
    - No changes from the fix on 14.0.0-RELEASE-p2.

  - releng/14.0 (14.0-RELEASE)
    - Fix only
      - release/14.0.0/release-14_0_0-p2-topic-openzfs-arc_prune-regulation-fix
    - Fix and counters
      - release/14.0.0/release-14_0_0-p2-topic-openzfs-arc_prune-regulation-counters
    - The original fix branches.

  - stable/13 (13-STABLE / 13.3-PRERELEASE)
    - Fix only
      - stable/13-topic-openzfs-arc_prune-regulation-fix
    - Fix and counters
      - stable/13-topic-openzfs-arc_prune-regulation-counters

- Backported changes
  - A fix equivalent to FreeBSD-EN-23:18.openzfs.
    - The ARC pruning task pileup is avoided by a single flag and the atomic
      operations on it.
  - Seigo's fix.
    - The ZFS vnode accounting, including the counters.
    - The ARC pruning regulation.
    - The improvement on vnlru_free_impl()

- Changes not backported
  - Seigo's fix.
    - The counters regarding the autotuning of ZFS ARC meta, the balancing
      parameter of the ARC data and metadata.
      - Those counters have changed significantly between 13-STABLE and
        14-STABLE.
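
To illustrate the "single flag and the atomic operations on it" idea from the backported changes above, here is a minimal sketch in Python. It is not the kernel code from FreeBSD-EN-23:18.openzfs. The class and method names are hypothetical, and a lock stands in for the atomic compare-and-set that a kernel would use. It only shows the general pattern: a new prune request is dropped while an earlier one is still pending, so requests cannot pile up.

```python
import threading


class PruneScheduler:
    """Toy model (hypothetical names): at most one prune task pending."""

    def __init__(self):
        # The lock emulates an atomic compare-and-set on the flag.
        self._lock = threading.Lock()
        self._pending = False
        self.dispatched = 0  # requests that actually queued a task
        self.skipped = 0     # requests dropped because one was pending

    def _try_set_pending(self):
        # Set the flag only if it was clear; report whether we set it.
        with self._lock:
            if self._pending:
                return False
            self._pending = True
            return True

    def request_prune(self):
        # Queue a prune task only when no other task is pending.
        if self._try_set_pending():
            self.dispatched += 1
            return True
        self.skipped += 1
        return False

    def prune_done(self):
        # Called by the task when it finishes; allows the next request.
        with self._lock:
            self._pending = False


sched = PruneScheduler()
first = sched.request_prune()    # queued
second = sched.request_prune()   # dropped, the first is still pending
sched.prune_done()
third = sched.request_prune()    # queued again
```

In this sketch the skipped count plays a role similar to the "skips by the fix" counter mentioned in the result charts, but that mapping is only an analogy.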

* Test results

Test Summary:

- Branch: stable/13-topic-openzfs-arc_prune-regulation-counters
- Date: 24 Jan 2024 00:10Z - 24 Jan 2024 05:59Z
- Build time: 05:48:30 (367 pkgs / hr)
- Failed port(s): 2
- Skipped port(s): 2
- Setup
  - sysctl(3)
    - vfs.zfs.arc_max: 4294967296
      - 4GB.
    - vfs.zfs.arc.dnode_limit=8080000000
      - 2.5 * (vfs.vnode.param.limit) * sizeof(dnode_t)
        - 2.5: experimental average dnodes per znode (2.0) + margin (0.5)
  - poudriere-bulk(8)
    - USE_TMPFS="wrkdir data localbase"

Result Chart Archive: (poudriere-bulk-13_3_prerelease-2024-01-24_09h10m00s.7z,
Attachment #247921)

- zfs-znodes-and-dnodes.png
  - The counts of the ZFS znodes and dnodes.
- zfs-arc-pruning-regulation.png
  - The counts of the ARC prune triggers by ZFS and the skips by the fix.
- zfs-dnodes-and-freeing-activity.png
  - The freeing activity of the ZFS znodes and dnodes.
- vnode-free-calls.png
  - The calls to the ZFS vnode freeing functions.

* Findings and Analysis

- The build time was shorter than on 14.0-RELEASE because emulators/mame, which
  started about 4.5 hours into the build, benefited from ccache and completed
  in just 10 minutes. That does not work on 14.0-RELEASE, where all sources
  have to be rebuilt.
  - If the emulators/mame build had not used ccache, it would have taken ~2.5
    hours and the whole poudriere-bulk(8) run would have completed in ~7
    hours. This is the same time as on 14.0-RELEASE.

- No ARC pruning happened during poudriere-bulk(8).
  - The only pruning happened while the system was settling down before
    poudriere-bulk(8).
  - On OpenZFS 2.1, the ARC pruning is not triggered by the excess unevictable
    size in the ARC.
    - That trigger works on OpenZFS 2.2 in 14-STABLE.
    - Only the overcommitted dnodes and metadata size trigger the ARC pruning
      on OpenZFS 2.1.
  - vfs.zfs.arc.dnode_limit in my setup effectively disabled the ARC pruning
    on OpenZFS 2.1.
    - Maybe this should be reverted to the default and retested.

- The zfskern{arc_evict} thread used up to 100% of the CPU in the final ~1
  hour of the build.
  - The reason is not clear.
  - There were no significant effects on the system.
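
The vfs.zfs.arc.dnode_limit value in the setup above is described as 2.5 * (vfs.vnode.param.limit) * sizeof(dnode_t). A small sketch of that arithmetic follows. The function and variable names are hypothetical, and the two inputs are assumptions: the message does not state the vnode limit or the dnode_t size on the test machine. The pair below is simply one combination that reproduces 8080000000.

```python
def dnode_limit(vnode_limit: int, dnode_size: int,
                dnodes_per_znode: float = 2.0, margin: float = 0.5) -> int:
    """Bytes allowed for dnodes: (dnodes per znode + margin) * vnodes * size."""
    return int((dnodes_per_znode + margin) * vnode_limit * dnode_size)


# Assumed inputs, NOT taken from the message; read the real ones with
# `sysctl vfs.vnode.param.limit` and from the kernel's dnode_t definition.
assumed_vnode_limit = 4_000_000
assumed_dnode_size = 808  # bytes, assumption

limit = dnode_limit(assumed_vnode_limit, assumed_dnode_size)
print(f"vfs.zfs.arc.dnode_limit={limit}")
```

With these assumed inputs the result is larger than the 4GB vfs.zfs.arc_max in the same setup, which fits the finding that this dnode limit effectively disabled the ARC pruning on OpenZFS 2.1.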