Date: Mon, 13 Oct 2025 14:51:26 +0000
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 290207] [ZFS] lowering "vfs.zfs.arc.max" to a low value causes kernel threads of "arc_evict" to use 91% CPU and disks to wait. System gets unresponsive...
Message-ID: <bug-290207-227@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=290207

            Bug ID: 290207
           Summary: [ZFS] lowering "vfs.zfs.arc.max" to a low value causes
                    kernel threads of "arc_evict" to use 91% CPU and disks
                    to wait. System gets unresponsive...
           Product: Base System
           Version: 15.0-STABLE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: nbe@vkf-renzel.de

Hi,

under FreeBSD 15.0-BETA1, lowering the maximum ARC size via the sysctl
"vfs.zfs.arc.max" to something like 2G, 1G or 512M eventually causes the
"arc_evict" kernel threads to use 91% CPU (or more) each and leaves the
disks waiting. This makes the whole system unresponsive.

To reproduce quickly:

  sysctl vfs.zfs.arc.max=1073741824
  zpool scrub <YOURPOOLNAME>

Then take a look at the output of "top 5" and "gstat":

---------------------------------- SNIP ----------------------------------
last pid: 16317;  load averages: 0.69, 0.36, 0.16;  battery: 100%  up 0+03:04:00  16:35:34
600 threads:   16 running, 530 sleeping, 54 waiting
CPU:  0.1% user,  0.0% nice, 24.5% system,  0.2% interrupt, 75.3% idle
Mem: 25M Active, 547M Inact, 2970M Wired, 10G Free
ARC: 1059M Total, 231M MFU, 722M MRU, 400K Anon, 16M Header, 89M Other
     858M Compressed, 4170M Uncompressed, 4.86:1 Ratio
Swap: 16G Total, 16G Free

  PID USERNAME  PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
    0 root       59    -     0B  4176K CPU11   11   4:00  92.83% kernel{arc_evict_2}
    0 root       59    -     0B  4176K CPU8     8   4:19  92.48% kernel{arc_evict_1}
    0 root       59    -     0B  4176K CPU1     1   3:59  92.01% kernel{arc_evict_0}
    6 root      -13    -     0B  1616K aw.aew   6   0:52   7.18% zfskern{txg_thread_enter}
    6 root        1    -     0B  1616K tq_adr   2   0:16   2.49% zfskern{arc_evict}

(gstat at the same time:)

dT: 1.000s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0    206    206   3295  0.156      0      0  0.000    3.2| nda0
    0    123    123   1967  0.192      0      0  0.000    2.4| nda1
---------------------------------- SNIP ----------------------------------

The disks are NVMe SSDs capable of 800 MB/s and a lot of IOPS. Their normal
stats are:

---------------------------------- SNIP ----------------------------------
dT: 1.000s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    3   6406   6406 803622  0.465      0      0  0.000  100.0| nda0
    0   7369   7369 807316  0.076      0      0  0.000   50.8| nda1
---------------------------------- SNIP ----------------------------------

To get back to somewhat normal behaviour, you have to set the ARC maximum to
a higher value:

  sysctl vfs.zfs.arc.max=8589934592

(Setting it to 0 [zero] does not help.)

This misbehaviour did not happen under the original, "old" FreeBSD ZFS
codebase, e.g. under 11.1-STABLE. As far as I remember, it also did not
happen under 12.4-RELEASE. My old poudriere build machine (an 8-core Ryzen
with 16 GB RAM) ran the "old" ZFS codebase with the ARC limited to 1 GB in
order to leave poudriere enough RAM for the tmpfs of its eight build
workers, and it had no problems at all.

My understanding is that limiting the cache to too small a value should
vastly increase the number of accesses to the disks, but it should NOT
spawn eviction threads that make the disks wait for them...

Thanks for looking into it and regards,
Nils

-- 
You are receiving this mail because:
You are the assignee for the bug.
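The reproduction steps and the workaround above can also be wrapped into a
small script. This is only a rough sketch of what the report describes; the
pool name "tank", the 60-second wait and the sampling parameters are
placeholders, not values from the report:

  #!/bin/sh
  # Sketch: reproduce the arc_evict CPU spin described in bug 290207.
  # Run as root. "tank" is a placeholder pool name; adjust before use.
  POOL=tank

  # Shrink the ARC limit to 1 GiB and start a scrub to generate read load.
  sysctl vfs.zfs.arc.max=1073741824
  zpool scrub "$POOL"

  # Give the scrub a moment, then take one sample of the busiest threads.
  # With the bug present, the kernel{arc_evict_*} threads sit near 90% WCPU.
  sleep 60
  top -SHb -d 1 20 | grep -E 'arc_evict|zfskern'

  # One-second batch-mode snapshot of disk load and latency.
  gstat -b -I 1s

  # Workaround from the report: raise the ARC limit again (here 8 GiB).
  sysctl vfs.zfs.arc.max=8589934592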
