Date: Thu, 14 Dec 2023 06:58:31 +0000
From: bugzilla-noreply@freebsd.org
To: fs@FreeBSD.org
Subject: [Bug 275594] High CPU usage by arc_prune; analysis and fix
Message-ID: <bug-275594-3630-1PQNkikAXX@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-275594-3630@https.bugs.freebsd.org/bugzilla/>
References: <bug-275594-3630@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=275594

--- Comment #12 from Seigo Tanimura <seigo.tanimura@gmail.com> ---

(In reply to Seigo Tanimura from comment #10)

I have added the fix to enable the extra vnode recycling and tested with the
same setup.

Source on GitHub:
- Repo: https://github.com/altimeter-130ft/freebsd-freebsd-src
- Branches:
  - Fix: release/14.0.0/release-14_0_0-p2-topic-openzfs-arc_prune-interval-fix
  - Counters atop Fix: release/14.0.0/release-14_0_0-p2-topic-openzfs-arc_prune-interval-counters

Test setup:
The same as "Ongoing test" in bug #275594, comment #6.
- vfs.vnode.vnlru.max_free_per_call: 4000000 (== vfs.vnode.vnlru.max_free_per_call)
- vfs.zfs.arc.prune_interval: 1000 (my fix for the arc_prune interval enabled)
- vfs.vnode.vnlru.extra_recycle: 1 (extra vnode recycle fix enabled)

Build time: 06:50:05 (312 pkgs / hr)

Counters after completing the build, with some remarks:

# The iteration attempts in vnlru_free_impl().
# This includes the retries from the head of vnode_list.
vfs.vnode.free.free_attempt: 33934506866
# The number of the vnodes recycled successfully, including vtryrecycle().
vfs.vnode.free.free_success: 42945537
# The number of the successful recycles in phase 2 upon the VREG (regular file) vnodes.
# - cleanbuf_vmpage_only: the vnodes held by the clean bufs and resident VM pages only.
# - cleanbuf_only: the vnodes held by the clean bufs only.
vfs.vnode.free.free_phase2_retry_reg_cleanbuf_vmpage_only: 845659
vfs.vnode.free.free_phase2_retry_reg_cleanbuf_only: 3
# The number of the iteration skips due to a held vnode. ("phase 2" hereafter)
# NB the successful recycles in phase 2 are not included.
vfs.vnode.free.free_phase2_retry: 8923850577
# The number of the phase 2 skips upon the VREG vnodes.
vfs.vnode.free.free_phase2_retry_reg: 8085735334
# The number of the phase 2 skips upon the VREG vnodes in use.
# Almost all phase 2 skips upon VREG fell into this.
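A quick sanity check on the counters above (a sketch using the values quoted
in this comment; the ratio labels are mine, not sysctl names) shows how little
of the iteration work actually recycles a vnode:

```python
# Recycle efficiency computed from the counters quoted above.
free_attempt = 33_934_506_866  # vfs.vnode.free.free_attempt
free_success = 42_945_537      # vfs.vnode.free.free_success
phase2_retry = 8_923_850_577   # vfs.vnode.free.free_phase2_retry

# Only a tiny fraction of the iteration attempts end in a recycle.
success_ratio = free_success / free_attempt
print(f"recycle success ratio: {success_ratio:.4%}")          # roughly 0.13%
print(f"attempts per recycled vnode: {free_attempt // free_success}")  # ~790
print(f"phase 2 skip share: {phase2_retry / free_attempt:.2%}")
```

In other words, vnlru_free_impl() walks roughly 790 list entries for every
vnode it manages to free, which is consistent with the arc_prune CPU usage
staying high.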
vfs.vnode.free.free_phase2_retry_reg_inuse: 8085733060
# The number of the successful recycles in phase 2 upon the VDIR (directory) vnodes.
# - free_phase2_retry_dir_nc_src_only: the vnodes held by the namecache entries only.
vfs.vnode.free.free_phase2_retry_dir_nc_src_only: 2234194
# The number of the phase 2 skips upon the VDIR vnodes.
vfs.vnode.free.free_phase2_retry_dir: 834902819
# The number of the phase 2 skips upon the VDIR vnodes in use.
# Almost all phase 2 skips upon VDIR fell into this.
vfs.vnode.free.free_phase2_retry_dir_inuse: 834902780

Other findings:
- The behaviour of the arc_prune thread CPU usage was mostly the same.
  - The peak dropped by only a few percent, so this is not likely to be the essential fix.
- The namecache hit ratio degraded by about 10 - 20%.
  - Maybe the recycled vnodes are looked up again, especially the directories.

-----

The issue still essentially persists with the extra vnode recycle; maybe the
root cause is in ZFS rather than the OS.

There are some suspicious findings on the in-memory dnode behaviour during the
tests so far:
- vfs.zfs.arc_max does not enforce the max size of kstat.zfs.misc.arcstats.dnode_size.
  - vfs.zfs.arc_max: 4GB
  - vfs.zfs.arc.dnode_limit_percent: 10 (default)
  - sizeof(struct dnode_t): 808 bytes
    - Found by "vmstat -z | grep dnode_t".
  - kstat.zfs.misc.arcstats.arc_dnode_limit: 400MB (default; vfs.zfs.arc.dnode_limit_percent percent of vfs.zfs.arc_max)
    - ~495K dnodes.
  - kstat.zfs.misc.arcstats.dnode_size, max: ~1.8GB
    - ~2.2M dnodes.
    - Almost equal to the max observed number of the vnodes.
- The dnode_t zone of uma(9) does not have a limit.

From the above, the number of the in-memory dnodes looks like the bottleneck.
Maybe the essential solution is to configure vfs.zfs.arc.dnode_limit explicitly
so that ZFS can hold all the dnodes required by the application in memory.

-- 
You are receiving this mail because:
You are the assignee for the bug.
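The dnode arithmetic above can be double-checked as follows (a sketch assuming
decimal units, i.e. 1 GB = 10^9 bytes, which matches the ~495K / ~2.2M figures
quoted in this comment):

```python
# Sanity check of the dnode figures quoted above.
DNODE_SIZE = 808  # sizeof(dnode_t) in bytes, from "vmstat -z | grep dnode_t"

arc_dnode_limit = 400 * 10**6       # default arc_dnode_limit: 10% of a 4 GB arc_max
dnode_size_max = int(1.8 * 10**9)   # observed peak of kstat...dnode_size

print(arc_dnode_limit // DNODE_SIZE)  # ~495K dnodes allowed by the default limit
print(dnode_size_max // DNODE_SIZE)   # ~2.2M dnodes actually resident at the peak
```

So the workload holds roughly 4.5x more dnodes than the default limit targets.
Under that reading, one hedged option (not verified here) would be to raise the
byte-denominated tunable to cover the observed peak, e.g.
"sysctl vfs.zfs.arc.dnode_limit=1800000000".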