From: Alan Cox
Date: Fri, 29 Aug 2014 11:54:42 -0500
To: Steven Hartland, Peter Wemm
Subject: Re: svn commit: r270759 - in head/sys: cddl/compat/opensolaris/kern cddl/compat/opensolaris/sys cddl/contrib/opensolaris/uts/common/fs/zfs vm
Message-ID: <5400B052.6030103@rice.edu>
References: <201408281950.s7SJo90I047213@svn.freebsd.org> <20140828211508.GK46031@over-yonder.net> <53FFAD79.7070106@rice.edu> <1617817.cOUOX4x8n2@overcee.wemm.org> <4A4B2C2D36064FD9840E3603D39E58E0@multiplay.co.uk>
In-Reply-To:
 <4A4B2C2D36064FD9840E3603D39E58E0@multiplay.co.uk>
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Dmitry Morozovsky, "Matthew D. Fuller"

On 08/29/2014 03:32, Steven Hartland wrote:
>> On Thursday 28 August 2014 17:30:17 Alan Cox wrote:
>> > On 08/28/2014 16:15, Matthew D. Fuller wrote:
>> > > On Thu, Aug 28, 2014 at 10:11:39PM +0100 I heard the voice of
>> > > Steven Hartland, and lo! it spake thus:
>> > >> It's very likely applicable to stable/9 although I've never used 9
>> > >> myself, we jumped from 9 direct to 10.
>> > >
>> > > This is actually hitting two different issues from the two bugs:
>> > >
>> > > - 191510 is about "ARC isn't greedy enough" on huge-memory machines,
>> > >   and from the osreldate that bug was filed on 9.2, so presumably
>> > >   is applicable.
>> > >
>> > > - 187594 is about "ARC is too greedy" (probably mostly on not-so-huge
>> > >   machines) and starves/drives the rest of the system into swap.  That
>> > >   I believe came about as a result of some unrelated change in the
>> > >   10.x stream that upset the previous balance between ARC and the rest
>> > >   of the VM, so isn't a problem on 9.x.
>> >
>> > 10.0 had a bug in the page daemon that was fixed in 10-STABLE about
>> > three months ago (r265945).  The ARC was not the only thing affected by
>> > this bug.
>>
>> I'm concerned about potential unintended consequences of this change.
>>
>> Before, arc reclaim was driven by vm_paging_needed(), which was:
>> vm_paging_needed(void)
>> {
>>     return (vm_cnt.v_free_count + vm_cnt.v_cache_count <
>>         vm_pageout_wakeup_thresh);
>> }
>>
>> Now it's ignoring the v_cache_count and looking exclusively at
>> v_free_count.  "cache" pages are free pages that just happen to have
>> known contents.  If I read this change right, zfs arc will now discard
>> checksummed cache pages to make room for non-checksummed pages:
>
> That test is still there so if it needs to it will still trigger.
>
> However, that is often at a lower level, as vm_pageout_wakeup_thresh is
> only 110% of min free, whereas zfs_arc_free_target is based on target
> free, which is 4 * (min free + reserved).
>
>> +       if (kmem_free_count() < zfs_arc_free_target) {
>> +               return (1);
>> +       }
>> ...
>> +kmem_free_count(void)
>> +{
>> +       return (vm_cnt.v_free_count);
>> +}
>>
>> This seems like a pretty substantial behavior change.  I'm concerned
>> that it doesn't appear to count all the forms of "free" pages.
>>
>> I haven't seen the problems with the over-aggressive ARC since the
>> page daemon bug was fixed.  It's been working fine under pretty abusive
>> loads in the freebsd cluster after that fix.
>
> Others have also confirmed that even with r265945 they can still trigger
> performance issues.
>
> In addition, without it we still have loads of RAM sat there unused; in my
> particular experience we have 40GB of 192GB sitting there unused, and that
> was with a stable build from last weekend.
>

The Solaris code only imposed this limit on 32-bit machines where the
available kernel virtual address space may be much less than the
available physical memory.  Previously, FreeBSD imposed this limit on
both 32-bit and 64-bit machines.  Now, it imposes it on neither.  Why
continue to do this differently from Solaris?

> With the patch we confirmed that both RAM usage and performance for those
> seeing that issue are resolved, with no reported regressions.
>
>> (I should know better than to fire a reply off before full fact
>> checking, but this commit worries me..)
>
> Not a problem, it's great to know people pay attention to changes, and
> raise their concerns.  Always better to have a discussion about potential
> issues than to wait for a problem to occur.
>
> Hopefully the above gives you some peace of mind, but if you still have
> any concerns I'm all ears.
>

You didn't really address Peter's initial technical issue.  Peter
correctly observed that cache pages are just another flavor of free
pages.  Whenever the VM system is checking the number of free pages
against any of the thresholds, it always uses the sum of v_cache_count
and v_free_count.  So, to anyone familiar with the VM system, like
Peter, what you've done, which is to derive a threshold from
v_free_target but only compare v_free_count to that threshold, looks
highly suspect.

That said, I can easily believe that your patch works better than the
existing code, because it is closer in spirit to my interpretation of
what the Solaris code does.  Specifically, I believe that the Solaris
code starts trimming the ARC before the Solaris page daemon starts
writing dirty pages to secondary storage.  Now, you've made FreeBSD do
the same.  However, you've expressed it in a way that looks broken.
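To make the mismatch concrete, here is a minimal user-space sketch.  It
is neither the committed code nor a proposed patch.  The struct and the
three function names are hypothetical stand-ins; only the counter names
v_free_count, v_cache_count and v_free_target come from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's page counters.  Only the field
 * names are taken from the thread; the struct itself is an assumption.
 */
struct vm_counters {
	uint32_t v_free_count;	/* pages on the free queues */
	uint32_t v_cache_count;	/* free pages with known contents */
	uint32_t v_free_target;	/* threshold the page daemon aims for */
};

/* The VM convention: every flavor of free page counts as free. */
static uint32_t
free_page_total(const struct vm_counters *c)
{
	return (c->v_free_count + c->v_cache_count);
}

/* Shape of the quoted diff: only v_free_count is compared. */
static bool
arc_reclaim_free_only(const struct vm_counters *c, uint32_t target)
{
	return (c->v_free_count < target);
}

/* Same threshold, compared against free + cache instead. */
static bool
arc_reclaim_free_plus_cache(const struct vm_counters *c, uint32_t target)
{
	return (free_page_total(c) < target);
}
```

With, say, 1000 free pages, 5000 cache pages and a target of 4000, the
first test asks for an ARC trim while the second does not.  Which
threshold should be paired with the summed count is not settled in this
thread.
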
To wrap up, I think that you can easily write this in a way that
simultaneously behaves like Solaris and doesn't look wrong to a VM expert.

> Out of interest would it be possible to update machines in the cluster to
> see how their workload reacts to the change?
>
> Regards
> Steve