Date: Fri, 6 Apr 2018 10:33:19 -0700 (PDT)
From: Don Lewis <truckman@FreeBSD.org>
To: Mark Johnston <markj@FreeBSD.org>
Cc: Andriy Gapon <avg@FreeBSD.org>, Bryan Drewery <bdrewery@FreeBSD.org>,
    Peter Jeremy <peter@rulingia.com>, Jeff Roberson <jroberson@jroberson.net>,
    FreeBSD current <freebsd-current@FreeBSD.org>
Subject: Re: Strange ARC/Swap/CPU on yesterday's -CURRENT
Message-ID: <tkrat.159e0d3961e44e63@FreeBSD.org>
In-Reply-To: <tkrat.5960d3aa016999f6@FreeBSD.org>
References: <20180306221554.uyshbzbboai62rdf@dx240.localdomain>
    <20180307103911.GA72239@kloomba>
    <20180311004737.3441dbf9@thor.intern.walstatt.dynvpn.de>
    <alpine.BSF.2.21.1803111038460.1232@desktop>
    <20180320070745.GA12880@server.rulingia.com>
    <2b3db2af-03c7-65ff-25e7-425cfd8815b6@FreeBSD.org>
    <1fd2b47b-b559-69f8-7e39-665f0f599c8f@FreeBSD.org>
    <tkrat.9bab32187c0e8d01@FreeBSD.org>
    <tkrat.11e402b8455bd0fa@FreeBSD.org>
    <tkrat.7e60fe1978ea51c0@FreeBSD.org>
    <20180404174949.GA12271@raichu>
    <tkrat.5960d3aa016999f6@FreeBSD.org>

On 4 Apr, Don Lewis wrote:
> On 4 Apr, Mark Johnston wrote:
>> On Tue, Apr 03, 2018 at 09:42:48PM -0700, Don Lewis wrote:
>>> On 3 Apr, Don Lewis wrote:
>>> > I reconfigured my Ryzen box to be more similar to my default package
>>> > builder by disabling SMT and half of the RAM, to limit it to 8 cores
>>> > and 32 GB and then started bisecting to try to track down the problem.
>>> > For each test, I first filled ARC by tarring /usr/ports/distfiles to
>>> > /dev/null. The commit range that I was searching was r329844 to
>>> > r331716. I narrowed the range to r329844 to r329904. With r329904
>>> > and newer, ARC is totally unresponsive to memory pressure and the
>>> > machine pages heavily. I see ARC sizes of 28-29GB and 30GB of wired
>>> > RAM, so there is not much leftover for getting useful work done. Active
>>> > memory and free memory both hover under 1GB each. Looking at the
>>> > commit logs over this range, the most likely culprit is:
>>> >
>>> > r329882 | jeff | 2018-02-23 14:51:51 -0800 (Fri, 23 Feb 2018) | 13 lines
>>> >
>>> > Add a generic Proportional Integral Derivative (PID) controller algorithm and
>>> > use it to regulate page daemon output.
>>> >
>>> > This provides much smoother and more responsive page daemon output, anticipating
>>> > demand and avoiding pageout stalls by increasing the number of pages to match
>>> > the workload. This is a reimplementation of work done by myself and mlaier at
>>> > Isilon.
>>> >
>>> >
>>> > It is quite possible that the recent fixes to the PID controller will
>>> > fix the problem. Not that r329844 was trouble free ... I left tar
>>> > running over lunchtime to fill ARC and the OOM killer nuked top, tar,
>>> > ntpd, both of my ssh sessions into the machine, and multiple instances
>>> > of getty while I was away. I was able to log in again and successfully
>>> > run poudriere, and ARC did respond to the memory pressure and cranked
>>> > itself down to about 5 GB by the end of the run. I did not see the same
>>> > problem with tar when I did the same with r329904.
>>>
>>> I just tried r331966 and see no improvement. No OOM process kills
>>> during the tar run to fill ARC, but with ARC filled, the machine is
>>> thrashing itself at the start of the poudriere run while trying to build
>>> ports-mgmt/pkg (39 minutes so far). ARC appears to be unresponsive to
>>> memory demand. I've seen no decrease in ARC size or wired memory since
>>> starting poudriere.
>>
>> Re-reading the ARC reclaim code, I see a couple of issues which might be
>> at the root of the behaviour you're seeing.
>>
>> 1. zfs_arc_free_target is too low now. It is initialized to the page
>>    daemon wakeup threshold, which is slightly above v_free_min. With the
>>    PID controller, the page daemon uses a setpoint of v_free_target.
>>    Moreover, it now wakes up regularly rather than having wakeups be
>>    synchronized by a mutex, so it will respond quickly if the free page
>>    count dips below v_free_target. The free page count will dip below
>>    zfs_arc_free_target only in the face of sudden and extreme memory
>>    pressure now, so the FMT_LOTSFREE case probably isn't getting
>>    exercised. Try initializing zfs_arc_free_target to v_free_target.
>
> Changing zfs_arc_free_target definitely helps. My previous poudriere
> run failed when poudriere timed out the ports-mgmt/pkg build after two
> hours. After changing this setting, poudriere seems to be running
> properly and ARC has dropped from 29GB to 26GB ten minutes into the run
> and I'm not seeing processes in the swread state.
>
>> 2. In the inactive queue scan, we used to compute the shortage after
>>    running uma_reclaim() and the lowmem handlers (which includes a
>>    synchronous call to arc_lowmem()). Now it's computed before, so we're
>>    not taking into account the pages that get freed by the ARC and UMA.
>>    The following rather hacky patch may help. I note that the lowmem
>>    logic is now somewhat broken when multiple NUMA domains are
>>    configured, however, since it fires only when domain 0 has a free
>>    page shortage.
>
> I will try this next.

The patch by itself is not sufficient to fix the problem for me. I
didn't have any problems with using the patch as well as setting
zfs_arc_free_target. As a matter of fact, that was the only poudriere
run where I didn't have a guile-related build failure. Those tend to be
fairly random, so it could just be luck.

Performance-wise r331966 + zfs_arc_free_target completes the poudriere
run about 2.6% faster than r329844. But I don't know if this is the PID
controller or something else that changed in base over that interval.
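
For readers following the thread, Mark's point 1 can be illustrated with a toy model. This is only a sketch and not the kernel code: the function name count_arc_triggers, the page counts, and the simple "reclaim half the shortage" rule (a crude stand-in for the PID controller) are all assumptions made up for the illustration. It counts how often the free page count falls below the ARC's threshold while a page daemon keeps free memory close to v_free_target.

```python
def count_arc_triggers(arc_free_target, v_free_target=1000,
                       demand=120, steps=50):
    """Toy model with hypothetical page counts (not FreeBSD code).

    Each step the workload takes `demand` pages, then a simplified
    page daemon reclaims half of the shortage relative to
    v_free_target. Returns how many steps saw the free page count
    below arc_free_target, i.e. how often the ARC would notice
    memory pressure in this model.
    """
    free = v_free_target
    triggers = 0
    for _ in range(steps):
        free -= demand                      # workload consumes pages
        if free < arc_free_target:          # ARC's view of "low memory"
            triggers += 1
        shortage = max(0, v_free_target - free)
        free += shortage // 2               # page daemon pulls back toward setpoint
    return triggers


# Threshold well below the setpoint (hypothetically "slightly above v_free_min")
low = count_arc_triggers(arc_free_target=300)
# Threshold equal to the page daemon setpoint, as Mark suggests trying
matched = count_arc_triggers(arc_free_target=1000)
print(low, matched)
```

In this model the free count settles a little below the setpoint and never gets near the low threshold, so the first call reports no triggers, while the second crosses the threshold on every step. That matches the qualitative behaviour described above, but the real interaction between the page daemon and the ARC is considerably more involved.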