Date: Fri, 23 Feb 2024 19:25:52 +0000
From: bugzilla-noreply@freebsd.org
To: fs@FreeBSD.org
Subject: [Bug 275594] High CPU usage by arc_prune; analysis and fix
Message-ID: <bug-275594-3630-zkQq1DdwjP@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-275594-3630@https.bugs.freebsd.org/bugzilla/>
References: <bug-275594-3630@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=275594

--- Comment #67 from Peter Much <pmc@citylink.dinoex.sub.org> ---
So, I have now read all the material here. Great work!

I had upgraded my deploy engine from 13.2-RELEASE to 13.3-BETA, and found
(among some spurious messages from git) that it can no longer build gcc12.

There is apparently no problem with rust or llvm15, but trying to build gcc12
reproducibly crashes (10 cores, 16081M RAM). Apparently the crash happens
when gcc fully powers up its LTO for the first time:

last pid: 37369;  load averages:  9.35,  9.93,  9.27   up 0+03:15:25  07:21:42
417 threads:   14 running, 379 sleeping, 24 waiting
CPU: 55.4% user,  0.0% nice, 35.6% system,  0.1% interrupt,  8.8% idle
Mem: 7047M Active, 6121M Inact, 2392M Wired, 984M Buf, 60M Free
ARC: 518M Total, 45M MFU, 451M MRU, 128K Anon, 3990K Header, 17M Other
     467M Compressed, 997M Uncompressed, 2.14:1 Ratio
Swap: 15G Total, 15G Free

  PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
    0 root         -8    -     0B  2432K CPU4     4   3:14  99.79% kernel{arc_p
    7 root        -16    -     0B    48K CPU6     6   2:45  99.79% pagedaemon{d
   15 root         52    -     0B    16K CPU0     0   3:00  99.70% vnlru
37334 root         52    0   891M   789M pfault   1   0:37  89.24% lto1
37270 root         52    0  1017M   915M pfault   3   0:43  88.63% lto1
37324 root         52    0   831M   770M pfault   8   0:39  88.59% lto1
37338 root         52    0   843M   785M pfault   2   0:36  88.50% lto1
37333 root         52    0   889M   788M pfault   7   0:37  82.76% lto1
37269 root         52    0  1001M   882M pfault   5   0:42  82.09% lto1
37274 root         52    0  1004M   885M pfault   9   0:42  80.24% lto1
    5 root         20    -     0B  1568K t->zth   9   0:02   1.02% zfskern{arc_
37360 root         20    0    14M  4940K CPU9     9   0:00   0.87% top

This is the last output. At this point the system becomes unresponsive and,
when allowed neither to oom-kill nor to panic, continues to consume 300% CPU.
Apparently these are the three visible apocalyptic riders (arc_prune,
pagedaemon, vnlru) entertaining themselves. :/

Implementing the patch (i.e. five new git commits from the github repo)
solves the issue, and afterwards it looks like this:

last pid: 11944;  load averages:  7.13,  5.29,  5.77   up 0+03:48:45  16:12:46
424 threads:   19 running, 381 sleeping, 24 waiting
CPU: 67.9% user,  0.0% nice,  5.1% system,  0.0% interrupt, 27.0% idle
Mem: 9308M Active, 2285M Inact, 20M Laundry, 3643M Wired, 865M Buf, 336M Free
ARC: 1638M Total, 855M MFU, 575M MRU, 128K Anon, 11M Header, 198M Other
     1305M Compressed, 2980M Uncompressed, 2.28:1 Ratio
Swap: 15G Total, 15G Free

  PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
11579 root        103    0  1269M  1066M CPU6     6   4:09 100.00% lto1
11605 root        103    0  1263M  1052M CPU3     3   4:08  99.87% lto1
11589 root        103    0  1295M  1091M CPU8     8   4:09  99.87% lto1
11599 root        103    0  1259M  1027M CPU9     9   4:08  99.87% lto1
11588 root        103    0  1263M  1035M CPU7     7   4:09  99.87% lto1
11590 root        103    0  1287M  1058M CPU5     5   4:08  99.87% lto1
11598 root        103    0  1311M  1082M CPU1     1   4:08  99.74% lto1
    0 root         -8    -     0B  2448K -        6   0:03   6.83% kernel{arc_p
    5 root         -8    -     0B  1568K RUN      9   0:03   5.80% zfskern{arc_
    7 root        -16    -     0B    48K psleep   2   0:37   3.11% pagedaemon{d

I'm a bit worried that the system is still reluctant to page out, but
otherwise this looks good.

-- 
You are receiving this mail because:
You are the assignee for the bug.