Date: Fri, 24 Feb 2012 10:03:46 -0800
From: Ian Downes <ian@ndwns.net>
To: Luke Marsden <luke-lists@hybrid-logic.co.uk>
Cc: Tom Evans <tevans.uk@googlemail.com>, freebsd-fs@freebsd.org, team@hybrid-logic.co.uk
Subject: Re: Another ZFS ARC memory question
Message-ID: <20120224180346.GA83845@weta.local>
In-Reply-To: <1330090934.13430.90.camel@pow>
References: <1330081612.13430.39.camel@pow> <CAFHbX1KPW+4h2-LHE9rB0aVRqw+AzVDrjjVB2CCt=7T4JB8C3A@mail.gmail.com> <1330087470.13430.61.camel@pow> <CAFHbX1JA9HdF59_NAXzy3R+ZGN9CFrTWcbYq4ajBjvD_WTBTwA@mail.gmail.com> <1330090934.13430.90.camel@pow>
On Fri, Feb 24, 2012 at 01:42:14PM +0000, Luke Marsden wrote:
> On Fri, 2012-02-24 at 12:59 +0000, Tom Evans wrote:
> > On Fri, Feb 24, 2012 at 12:44 PM, Luke Marsden
> > <luke-lists@hybrid-logic.co.uk> wrote:
> > > On Fri, 2012-02-24 at 12:21 +0000, Tom Evans wrote:
> > >> On Fri, Feb 24, 2012 at 11:06 AM, Luke Marsden
> > >> <luke-lists@hybrid-logic.co.uk> wrote:
> > >> > Hi all,
> > >> >
> > >> > Just wanted to get your opinion on best practices for ZFS.
> > >> >
> > >> > We're running 8.2-RELEASE v15 in production on 24GB RAM amd64
> > >> > machines but have been having trouble with short spikes in
> > >> > application memory usage resulting in huge amounts of swapping,
> > >> > bringing the whole machine to its knees and crashing it hard. I
> > >> > suspect this is because when there is a sudden spike in memory
> > >> > usage the zfs arc reclaim thread is unable to free system memory
> > >> > fast enough.
> > >> >
> > >> > This most recently happened yesterday as you can see from the
> > >> > following munin graphs:
> > >> >
> > >> > E.g. http://hybrid-logic.co.uk/memory-day.png
> > >> > http://hybrid-logic.co.uk/swap-day.png
> > >> >
> > >> > Our response has been to start limiting the ZFS ARC cache to 4GB
> > >> > on our production machines - trading performance for stability is
> > >> > fine with me (and we have L2ARC on SSD so we still get good
> > >> > levels of caching).
> > >> >
> > >> > My questions are:
> > >> >
> > >> >       * is this a known problem?
> > >> >       * what is the community's advice for production machines
> > >> >         running ZFS on FreeBSD, is manually limiting the ARC
> > >> >         cache (to ensure that there's enough actually free memory
> > >> >         to handle a spike in application memory usage) the best
> > >> >         solution to this spike-in-memory-means-crash problem?
> > >> >       * has FreeBSD 9.0 / ZFS v28 solved this problem?
> > >> >       * rather than setting a hard limit on the ARC cache size,
> > >> >         is it possible to adjust the auto-tuning variables to
> > >> >         leave more free memory for spiky memory situations? e.g.
> > >> >         set the auto-tuning to make arc eat 80% of memory instead
> > >> >         of ~95% like it is at present?
> > >> >       * could the arc reclaim thread be made to drop ARC pages
> > >> >         with higher priority before the system starts swapping
> > >> >         out application pages?
> > >> >
> > >> > Thank you for any/all answers, and thank you for making FreeBSD
> > >> > awesome :-)
> > >>
> > >> It's not a problem, it's a feature!
> > >>
> > >> By default the ARC will attempt to cache as much as it can - it
> > >> assumes the box is a ZFS filer, and doesn't need RAM for
> > >> applications. The solution, as you've found out, is to limit how
> > >> much ARC can take up.
> > >>
> > >> In practice, you should be doing this anyway. You should know, or
> > >> have an idea, of how much RAM is required for the applications on
> > >> that box, and you need to limit ZFS to not eat into that required
> > >> RAM.
> > >
> > > Thanks for your reply, Tom! I agree that the ARC cache is a great
> > > feature, but for a general purpose filesystem it does seem like a
> > > reasonable expectation that filesystem cache will be evicted before
> > > application data is swapped, even if the spike in memory usage is
> > > rather aggressive. A complete server crash in this scenario is
> > > rather unfortunate.
> > >
> > > My question stands - is this an area which has been improved on in
> > > the ZFS v28 / FreeBSD 9.0 / upcoming FreeBSD 8.3 code, or should it
> > > be standard practice to guess how much memory the applications
> > > running on the server might need and set the arc_max tunable in
> > > /boot/loader.conf appropriately? This is reasonably tricky when
> > > providing general purpose web application hosting and so we'll
> > > often end up erring on the side of caution and leaving lots of RAM
> > > free "just in case".
> > >
> > > If the latter is indeed the case in the latest stable releases then
> > > I would like to update http://wiki.freebsd.org/ZFSTuningGuide which
> > > currently states:
> > >
> > >         FreeBSD 7.2+ has improved kernel memory allocation strategy
> > >         and no tuning may be necessary on systems with more than
> > >         2 GB of RAM.
> > >
> > > Thank you!
> > >
> > > Best Regards,
> > > Luke Marsden
> > >
> >
> > Hmm. That comment is really saying that you no longer need to tune
> > vm.kmem_size.
>
> http://wiki.freebsd.org/ZFSTuningGuide
>
> "No tuning may be necessary" seems to indicate that no changes need to
> be made to /boot/loader.conf. I'm happy to provide a patch for the wiki
> which makes it clearer that for servers which may experience sudden
> spikes in application memory usage (i.e. all servers running
> user-supplied applications), the speed of ARC eviction is insufficient
> to ensure stability and arc_max should be tuned downwards.
>
> > I get what you are saying about applications suddenly using a lot of
> > RAM should not cause the server to fall over. Do you know why it fell
> > over? IE, was it a panic, a deadlock, etc.
>
> If you look at the http://hybrid-logic.co.uk/swap-day.png graph you
> can see a huge spike in swap at the point at which the last line of
> pixels at http://hybrid-logic.co.uk/memory-day.png indicates the
> sudden increase in memory usage (by 3GB in active memory usage if you
> look closely). Since the graph stops at that point it indicates that
> the server became completely unresponsive (e.g. including munin probe
> requests). I did manage to log in just before it became completely
> unresponsive, but at that point the incoming requests weren't being
> serviced fast enough due to the excessive swapping ('top' output froze
> and never came back). It continued to respond to pings though and may
> have eventually recovered if I had disabled inbound network traffic.
> I don't have any evidence of a panic or deadlock, we just hard
> rebooted the machine about 15 minutes later after it failed to recover
> from the swap-storm.
>
> > FreeBSD does not cope well when you have used up all RAM and swap
> > (well, what does?), and from your graphs it does look like the ARC
> > is not super massive when you had the problem - around 30-40% of
> > RAM?
>
> The last munin sample indicates roughly 8.5GB ARC out of 24GB, so yes,
> 35%. I guess what I'd like is for FreeBSD to detect an emergency
> out-of-memory condition and aggressively drop much or all of the ARC
> cache *before* swapping out application memory, which causes the
> system to grind to a halt.
>
> Is this a reasonable request, and is there anything I can do to help
> implement it?
>
> If not, can we update the wiki to make it clearer that ARC limiting is
> necessary, even on high-RAM boxes, to ensure stability under spiky
> memory conditions?

Are you sure that it is the ARC data that is causing the issue? I've
got boxes where the ARC *meta* skyrockets and consumes all RAM, greatly
exceeding the arc_meta_limit. E.g. on a very unresponsive local box:

vfs.zfs.arc_meta_limit: 1610612736
vfs.zfs.arc_meta_used: 12183379056

Setting arc_max helps (and seems to be respected), but I don't know why
arc_meta_used exceeds arc_meta_limit.

Ian
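[Editor's note: for readers landing here from a search, the ARC limiting
discussed in this thread is done with loader tunables. A minimal sketch,
assuming the 24GB box and 4GB cap Luke describes - the values are
illustrative, not a recommendation from the thread:]

```shell
# /boot/loader.conf -- read at boot; on 8.2 these cannot be changed at runtime.

# Hard cap on total ARC size (the 4GB limit the thread settles on for a
# 24GB box, leaving headroom for application memory spikes).
vfs.zfs.arc_max="4G"

# Cap on ARC metadata. Ian reports arc_meta_used exceeding this limit in
# practice, so treat it as advisory rather than a hard bound.
vfs.zfs.arc_meta_limit="1610612736"   # 1.5GB, the value shown in Ian's output
```

After a reboot, the effective values can be checked against actual usage
with: sysctl vfs.zfs.arc_max vfs.zfs.arc_meta_limit vfs.zfs.arc_meta_used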