Date:      Fri, 08 Dec 2023 10:17:49 +0000
From:      bugzilla-noreply@freebsd.org
To:        fs@FreeBSD.org
Subject:   [Bug 275594] High CPU usage by arc_prune; analysis and fix
Message-ID:  <bug-275594-3630-ip1A0ZyrVT@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-275594-3630@https.bugs.freebsd.org/bugzilla/>
References:  <bug-275594-3630@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=275594

--- Comment #6 from Seigo Tanimura <seigo.tanimura@gmail.com> ---
(In reply to Seigo Tanimura from comment #5)

The build under the following settings has completed:

- vfs.vnode.vnlru.max_free_per_call: 10000 (out-of-box)
- vfs.zfs.arc.prune_interval: 1000 (my fix enabled)

Build time: 07:11:02 (292 pkgs / hr)
Max vfs.vnode.stats.count: ~2.2M
Max ARC memory size: ~5.6GB
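For reference, the settings above would be applied like this (a sketch; vfs.zfs.arc.prune_interval only exists with the proposed fix applied, and the values shown are the ones used for this run):

```shell
# vnlru per-call cap left at the out-of-box default
sysctl vfs.vnode.vnlru.max_free_per_call=10000
# ARC prune interval in ms, added by the fix under test
sysctl vfs.zfs.arc.prune_interval=1000
```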

NB devel/ocl-icd failed because pkg-static was killed by the kernel for taking
too long to page in.  31 ports were skipped because of this failure.  This
error was often seen on 14.0-RELEASE-p0 as well, indicating an obstacle in
accessing the executable files.

This result is better than the baseline (14.0-RELEASE-p2) but worse than my
original fix shown in the description.  Although prune_interval somehow avoided
the contention on vnode_list_mtx, this setup also limited the ARC pruning
performance, introducing further pressure, including overcommit, on the ARC
memory size.

I conclude this setup is neither optimal nor recommended.

-----

Ongoing test:

- vfs.vnode.vnlru.max_free_per_call: 4000000 (==
vfs.vnode.vnlru.max_free_per_call)
- vfs.zfs.arc.prune_interval: 1000 (my fix enabled)

This setup places no limit on the ARC pruning workload within each configured
interval.

Another objective of this test is to measure the number of vnodes ZFS requests
the OS to reclaim.  As long as this value stays below 100000
(vfs.vnode.vnlru.max_free_per_call in my first test), the system behaviour and
test results are expected to match my first test.
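If I read the knobs correctly, the interaction can be sketched as a simple
clamp (a toy model, not the actual vnlru code; the function name is mine):

```python
def vnodes_freed_per_call(zfs_request: int, max_free_per_call: int) -> int:
    """Clamp the ZFS reclaim request to the vnlru per-call cap."""
    return min(zfs_request, max_free_per_call)

# With the observed average request of ~44K vnodes:
print(vnodes_freed_per_call(44_000, 100_000))  # first test: request honoured
print(vnodes_freed_per_call(44_000, 10_000))   # default cap: work regulated
```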

A glance at the first 30 minutes after the build start:

- The activity of arc_prune is mostly the same as in the first test; the CPU
usage occasionally surges up to 30%, but so far it does not stay there for more
than 1 second.
- The average number of the vnodes ZFS requests to reclaim: ~44K.
  - vfs.vnode.stats.count: ~1.2M.
  - The default vfs.vnode.vnlru.max_free_per_call of 10K did regulate the ARC
pruning work.
  - I will keep my eyes on this figure, especially if it exceeds 100K.
- The ARC memory size is strictly regulated as configured by vfs.zfs.arc_max.
  - The ARC pruning starts when the ARC memory size reaches ~4.1GB.
  - The ARC pruning does not happen as long as the ARC memory size is below
4.0GB.
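The observed threshold behaviour can be modelled as a simple predicate (a
hedged toy model; the 4.0 GiB limit is my assumption of vfs.zfs.arc_max on
this box, and the real ARC reclaim logic is considerably more involved):

```python
ARC_MAX_GIB = 4.0  # assumed vfs.zfs.arc_max on this test box

def arc_prune_expected(arc_size_gib: float) -> bool:
    """Pruning was observed only once the ARC grew past the configured max."""
    return arc_size_gib > ARC_MAX_GIB

print(arc_prune_expected(3.9))  # below the limit: no pruning observed
print(arc_prune_expected(4.1))  # pruning starts around here
```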

The finding regarding the ARC memory size is new to me.  Maybe the number of
vnodes ZFS requests to reclaim is calculated very carefully and precisely, so
we should actually honour that figure to keep the system healthy.

I first treated this test as an extreme case, but maybe this should be
evaluated as a working setup.

-- 
You are receiving this mail because:
You are the assignee for the bug.


