Date: Wed, 24 Jan 2024 10:47:34 +0000
From: bugzilla-noreply@freebsd.org
To: fs@FreeBSD.org
Subject: [Bug 275594] High CPU usage by arc_prune; analysis and fix
Message-ID: <bug-275594-3630-yxfu5K4vpo@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-275594-3630@https.bugs.freebsd.org/bugzilla/>
References: <bug-275594-3630@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=275594

--- Comment #36 from Seigo Tanimura <seigo.tanimura@gmail.com> ---
(In reply to Thomas Mueller from comment #34)

I have backported the fix to stable/13 (13.3-PRERELEASE) and tested it with
poudriere-bulk(8). The fix has also been applied to the main and stable/14
branches without any changes.

Thomas, would you mind testing the backported fix to see if poudriere's build
time changes in any way?

* Sources on GitHub:

- Repo
  - https://github.com/altimeter-130ft/freebsd-freebsd-src

- Branches
  - main (Current)
    - Fix only
      - topic-openzfs-arc_prune-regulation-fix
    - Fix and counters
      - topic-openzfs-arc_prune-regulation-counters
    - No changes from the fix on 14.0.0-RELEASE-p2.

  - stable/14 (14-STABLE)
    - Fix only
      - stable/14-topic-openzfs-arc_prune-regulation-fix
    - Fix and counters
      - release/14.0.0/release-14_0_0-p2-topic-openzfs-arc_prune-regulation-counters
    - No changes from the fix on 14.0.0-RELEASE-p2.

  - releng/14.0 (14.0-RELEASE)
    - Fix only
      - release/14.0.0/release-14_0_0-p2-topic-openzfs-arc_prune-regulation-fix
    - Fix and counters
      - release/14.0.0/release-14_0_0-p2-topic-openzfs-arc_prune-regulation-counters
    - The original fix branches.

  - stable/13 (13-STABLE / 13.3-PRERELEASE)
    - Fix only
      - stable/13-topic-openzfs-arc_prune-regulation-fix
    - Fix and counters
      - stable/13-topic-openzfs-arc_prune-regulation-counters
    - Backported changes
      - A fix equivalent to FreeBSD-EN-23:18.openzfs.
        - The ARC pruning task pileup is avoided by a single flag and the
          atomic operations on it.
      - Seigo's fix.
        - The ZFS vnode accounting, including the counters.
        - The ARC pruning regulation.
        - The improvement on vnlru_free_impl()
    - Changes not backported
      - Seigo's fix.
        - The counters regarding the autotuning of ZFS ARC meta, the
          balancing parameter of the ARC data and metadata.
          - Those counters have changed significantly between 13-STABLE and
            14-STABLE.
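As an illustration of the "single flag and the atomic operations on it" item in the backported changes above, the idea can be sketched roughly as follows. This is not the kernel code: the class and method names are hypothetical, the `dropped` counter is illustrative bookkeeping only, and a lock stands in for the atomic compare-and-set that C code would use.

```python
import threading


class PruneGate:
    """Allow at most one pending prune task (hypothetical, illustrative only)."""

    def __init__(self):
        self._lock = threading.Lock()  # stands in for an atomic compare-and-set
        self._pending = False          # the single "prune task is queued" flag
        self.dropped = 0               # triggers ignored while a task was pending

    def try_request(self):
        """Return True if the caller should enqueue a prune task."""
        with self._lock:
            if self._pending:
                self.dropped += 1
                return False
            self._pending = True
            return True

    def complete(self):
        """Called when the queued prune task finishes; re-arms the gate."""
        with self._lock:
            self._pending = False


gate = PruneGate()
results = [gate.try_request() for _ in range(3)]  # three back-to-back triggers
gate.complete()                                   # the queued task finishes
results.append(gate.try_request())                # the next trigger is accepted
print(results, gate.dropped)
```

Only the first of the three back-to-back triggers queues a task, so repeated triggers cannot pile up while one task is still outstanding.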
* Test results

Test Summary:

- Branch: stable/13-topic-openzfs-arc_prune-regulation-counters
- Date: 24 Jan 2024 00:10Z - 24 Jan 2024 05:59Z
- Build time: 05:48:30 (367 pkgs / hr)
- Failed port(s): 2
- Skipped port(s): 2
- Setup
  - sysctl(3)
    - vfs.zfs.arc_max: 4294967296
      - 4GB.
    - vfs.zfs.arc.dnode_limit=8080000000
      - 2.5 * (vfs.vnode.param.limit) * sizeof(dnode_t)
        - 2.5: experimental average dnodes per znode (2.0) + margin (0.5)
  - poudriere-bulk(8)
    - USE_TMPFS="wrkdir data localbase"

Result Chart Archive: (poudriere-bulk-13_3_prerelease-2024-01-24_09h10m00s.7z,
Attachment #247921)

- zfs-znodes-and-dnodes.png
  - The counts of the ZFS znodes and dnodes.
- zfs-arc-pruning-regulation.png
  - The counts of the ARC prune triggers by ZFS and the skips by the fix.
- zfs-dnodes-and-freeing-activity.png
  - The freeing activity of the ZFS znodes and dnodes.
- vnode-free-calls.png
  - The calls to the ZFS vnode freeing functions.

* Findings and Analysis

- The build time was shorter than on 14.0-RELEASE because emulators/mame,
  started about 4.5 hours into the build, benefited from ccache and completed
  in just 10 minutes. That does not work on 14.0-RELEASE, where all sources
  have to be rebuilt.
  - If the emulators/mame build did not use ccache, it would take ~2.5 hours
    and the whole poudriere-bulk(8) run would complete in ~7 hours. This is
    the same time as on 14.0-RELEASE.

- No ARC pruning happened during poudriere-bulk(8).
  - The only pruning happened while the system was settling down before
    poudriere-bulk(8).
  - On OpenZFS 2.1, the ARC pruning is not triggered by the excess
    unevictable size in the ARC.
    - The above works on OpenZFS 2.2 in 14-STABLE.
  - Only the overcommitted dnodes and metadata size trigger the ARC pruning
    on OpenZFS 2.1.
  - vfs.zfs.arc.dnode_limit in my setup effectively disabled the ARC pruning
    on OpenZFS 2.1.
    - Maybe this should be reverted to the default and retested.

- The zfskern{arc_evict} thread used up to 100% of the CPU in the final
  ~1 hour of the build.
  - The reason is not clear.
  - There were no significant effects on the system.

-- 
You are receiving this mail because:
You are the assignee for the bug.
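For readers who want to check the arithmetic behind two figures in this comment (the vfs.zfs.arc.dnode_limit formula and the ~7 hour projection without ccache), here is a minimal sketch. The function names are made up for illustration. The vnode limit (4,000,000) and dnode size (808 bytes) are assumptions picked only because they reproduce the posted value; the message does not state the actual vfs.vnode.param.limit or sizeof(dnode_t) on the test machine. The projection also assumes that every other port still finishes by the observed end time.

```python
def dnode_limit(vnode_limit, dnode_size, dnodes_per_znode=2.0, margin=0.5):
    """(dnodes per znode + margin) * vnode limit * sizeof(dnode_t), in bytes."""
    return int((dnodes_per_znode + margin) * vnode_limit * dnode_size)


def projected_total_hours(observed_total_h, port_start_h, port_duration_h):
    """Total build time if one late port ran for port_duration_h instead.

    Assumes all other ports are unaffected and still finish by
    observed_total_h, so the slow port becomes the tail of the build.
    """
    return max(observed_total_h, port_start_h + port_duration_h)


# ASSUMED inputs, not taken from the message: 4,000,000 vnodes, 808-byte dnode_t.
limit = dnode_limit(vnode_limit=4_000_000, dnode_size=808)

# Observed build time 05:48:30, converted to hours.
observed_h = 5 + 48 / 60 + 30 / 3600

# emulators/mame started about 4.5 h in; ~2.5 h if built without ccache.
projected_h = projected_total_hours(observed_h, 4.5, 2.5)

print(limit)        # 8080000000
print(projected_h)  # 7.0
```

Under these assumptions the limit matches the value set in the test setup, and the projected total comes from 4.5 h + 2.5 h exceeding the observed 05:48:30.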