Date: Fri, 29 Aug 2014 21:27:56 +0100
From: "Steven Hartland" <killing@multiplay.co.uk>
To: "Steven Hartland" <smh@freebsd.org>, "Alan Cox" <alc@rice.edu>, "Peter Wemm" <peter@wemm.org>
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Dmitry Morozovsky <marck@rinet.ru>, "Matthew D. Fuller" <fullermd@over-yonder.net>
Subject: Re: svn commit: r270759 - in head/sys: cddl/compat/opensolaris/kern cddl/compat/opensolaris/sys cddl/contrib/opensolaris/uts/common/fs/zfs vm
Message-ID: <4264EDE767E54F8D8E2FFDCE4AD70CD8@multiplay.co.uk>
References: <201408281950.s7SJo90I047213@svn.freebsd.org>
 <20140828211508.GK46031@over-yonder.net>
 <53FFAD79.7070106@rice.edu>
 <1617817.cOUOX4x8n2@overcee.wemm.org>
 <4A4B2C2D36064FD9840E3603D39E58E0@multiplay.co.uk>
 <5400B052.6030103@rice.edu>
 <93F9465BF50A428BA687DD1EA4A7B455@multiplay.co.uk>
----- Original Message -----
From: "Steven Hartland" <smh@freebsd.org>

> ----- Original Message -----
> From: "Alan Cox" <alc@rice.edu>
>
>> You didn't really address Peter's initial technical issue.  Peter
>> correctly observed that cache pages are just another flavor of free
>> pages.  Whenever the VM system is checking the number of free pages
>> against any of the thresholds, it always uses the sum of
>> v_cache_count and v_free_count.  So, to anyone familiar with the VM
>> system, like Peter, what you've done, which is to derive a threshold
>> from v_free_target but only compare v_free_count to that threshold,
>> looks highly suspect.
>>
>> That said, I can easily believe that your patch works better than the
>> existing code, because it is closer in spirit to my interpretation of
>> what the Solaris code does.  Specifically, I believe that the Solaris
>> code starts trimming the ARC before the Solaris page daemon starts
>> writing dirty pages to secondary storage.  Now, you've made FreeBSD
>> do the same.  However, you've expressed it in a way that looks broken.
>>
>> To wrap up, I think that you can easily write this in a way that
>> simultaneously behaves like Solaris and doesn't look wrong to a VM
>> expert.
>
> More details in my last reply on this but in short it seems this has
> already been tried and it didn't work.
>
> I'd be interested in what domain experts think about why that is?
>
> In the meantime I've asked Karl to see if he could retest with
> this change to confirm counting cache pages along with free does
> indeed still cause a problem.
>
> For those that want to catch up on what has already been tested see
> the original PR:
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594

Copying in Karl's response from the PR for easy access:

I can give it a shot on my test system but I will note that for my
production machine right now that would be a no-op as there are no
cache pages in use.
Getting that on the production system under a high level of load is
rather unlikely over the holiday, since it is Labor Day, so all I will
have are my synthetics.  I do have some time over the long weekend to
run that test and should be able to do so if you think it's useful, but
I am wary of that approach being correct in the general case due to my
previous experience.

My commentary on that discussion point, and my original reasoning, is
here
(http://lists.freebsd.org/pipermail/freebsd-fs/2014-March/019084.html);
along that thread you'll see vmstat output with the current paradigm
showing that cache doesn't grow without bound, as many suggested it
would if I didn't include those pages as "free" and thus available for
ARC to attempt to invade (effectively trying to force the VM system to
evict them from the cache list by having it wake up first).

Indeed, up at comment #31 is the patch on that production system, which
has been up for about four months, and you can see only a few pages are
in the cached state.  That is fairly typical.

What is the expected reason for concern about including them in the
free count for ARC (when they're not in terms of what wakes up the VM
system as a whole)?  If the expected area of concern is that cache
pages will grow without bound (or close to it), the evidence from both
my production machine and others strongly suggests that's incorrect.

My work strongly suggests that the behavior most likely to be correct
for the majority of workloads is achieved when both the ARC paring
routine and the VM page-cleaning system wake up at the same time.  That
occurs when vm_cnt.v_free_count is invaded.  Attempting to bias the
outcome to force either the VM system or the ARC cache to do something
first appears to increase the circumstances under which bad behavior
occurs.
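
For readers following along, the two comparisons being debated can be
sketched roughly as below.  This is only an illustration and not the
code from r270759: struct page_counts, both function names and the
threshold argument are hypothetical stand-ins for the kernel's
v_free_count / v_cache_count counters and a threshold derived from
v_free_target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel page counters discussed above. */
struct page_counts {
	uint32_t free_count;	/* pages on the free list */
	uint32_t cache_count;	/* pages on the cache list */
};

/* Free-only test: compares just the free count to the threshold. */
bool
low_memory_free_only(const struct page_counts *c, uint32_t threshold)
{
	return (c->free_count < threshold);
}

/* Free-plus-cache test: treats cache pages as another flavor of free. */
bool
low_memory_free_plus_cache(const struct page_counts *c, uint32_t threshold)
{
	/* Guard against wraparound when summing the two counters. */
	assert(c->cache_count <= UINT32_MAX - c->free_count);
	return (c->free_count + c->cache_count < threshold);
}
```

With no pages on the cache list the two tests agree, which matches the
"no-op" remark above; they only diverge once the cache count is large
enough to lift the sum over the threshold.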