Date: Fri, 28 May 2010 08:51:47 -0700
From: 'Jeremy Chadwick' <freebsd@jdc.parodius.com>
To: Howard Leadmon <howard@leadmon.net>
Cc: amd64@freebsd.org, freebsd-fs@freebsd.org, 'Andriy Gapon' <avg@icyb.net.ua>
Subject: Re: FreeBSD 8.1-Prerelease Panic amd64 w/ZFS..
Message-ID: <20100528155147.GA77427@icarus.home.lan>
In-Reply-To: <064701cafe75$537c1530$fa743f90$@net>
References: <060401cafe37$a411b240$ec3516c0$@net> <4BFF894F.4010008@icyb.net.ua> <20100528134549.GA75411@icarus.home.lan> <064701cafe75$537c1530$fa743f90$@net>

On Fri, May 28, 2010 at 10:51:59AM -0400, Howard Leadmon wrote:
> Thanks Jeremy, I will try your recommended settings provided above.
>
> To the other poster, as to the settings of kmem, I had nothing specific
> set, just whatever FBSD was using by default.

vfs.zfs.arc_max is calculated on-the-fly during ZFS module/init time, I
believe, unless you explicitly set a value in loader.conf.

vm.kmem_size is similar in that regard. I do not know the calculation
formulas.

vm.kmem_size_max is more or less static on amd64, because it represents
the maximum amount of kmem usable/addressable. You can ignore the next
few paragraphs if you don't care about the history of this tunable, but
it will probably help folks reading the list. This is what I've figured
out mostly on my own.

<history>

Prior to February 2009 this value was significantly smaller due to VM
design/implementation issues. Alan Cox (not the Linux guy) did the
necessary work to fix this problem in RELENG_7 and committed things
then.

Fast forward to October 2009, by which time there were hundreds of posts
from users/SAs talking about ZFS, stability problems, and the dreaded
"kmem map is too small" error. I sent the following to -stable:

http://lists.freebsd.org/pipermail/freebsd-stable/2009-October/052256.html

The first thing you're going to notice is that I'm talking about
RELENG_7, and specifically amd64. However, the exact same code/efforts
(see above) was committed to RELENG_8 simultaneously (or within a very
short period of time). So any RELENG_[78] amd64 system with sources
from 2009/02 or later should have a very large vm.kmem_size_max. I can
confirm this on the couple RELENG_7 systems we have in production.

The second thing that you'll notice is one of the links in my mail: it
points to a post from pjd@ stating that on amd64 you need to adjust
vm.kmem_size, not vm.kmem_size_max. Take note of when this was said:
September 2009. This was *after* Alan Cox's work, and I'm certain Pawel
had that in mind.

Fast forward to... I'm not sure what date; sometime in mid or late 2009.
The behaviour of vfs.zfs.arc_max is changed so that it becomes a *hard
limit* rather than a "high-water mark" like it was previously. I'm also
not sure if this behaviour changed in just RELENG_8 or RELENG_7. My
brain is full for a lot of different reasons; I try hard to remember as
much as I can but it's too much for one person.

</history>

Starting to see where all the confusion comes from? :-)

Fast forward to today. People are still complaining about the problem,
but when they do they usually don't provide enough details. Why?
Because they don't know what details to provide. And why is that?
Because people expect ZFS on FreeBSD to mimic Solaris 10 or OpenSolaris,
where it "just works" (I know because we use it at my workplace on
thousands of boxes).

You tell users "well, you have to tune loader.conf" and they say "WHY?".
You tell them what to tune and they ask "What values do I pick?", which
vary from system to system and its workload. There's really no "magic
number".

Getting FreeBSD to that stage is difficult from what I understand (I
believe John Baldwin and a few others have covered this topic). There
are efforts underway to eventually solve this problem down the road.

Anyway, until then -- I've offered this in the past and I'll offer it
again: I'm 100% willing to sit down and write a document that could go
into the Handbook that covers ZFS tuning on FreeBSD, why it's necessary
(at this point in time), what values are needed, yadda yadda.

But I can't write this for the same reason the ZFS section on the
FreeBSD Wiki is outdated -- because to get answers to some of the
questions, one needs the kernel folks working on this code to help
provide answers. Most of us (myself included) are not familiar with the
inner-workings of the ZFS port, nor are we fully familiar with that of
the VM. The documentation dudes need the kernel dudes.
:-)

Back to the rest of your mail:

> In loader.conf all I had was:
>
> zfs_load="YES"
> vfs.root.mountfrom="zfs:tank/root"
>
> As to the setting of kmem and arc, I had the following which I will assume
> were defaults or auto-tunes:
>
> vfs.zfs.arc_max : 862653440
> vm.kmem_size : 1380245504
> vm.kmem_size_max: 329853485875
>
> {...below taken from your earlier mails...}
>
> panic:kmem_malloc(131072):kmem_map to small: 1296826368 total allocated

I'm not entirely sure, but I think vfs.zfs.arc_max, if not explicitly
set in loader.conf, might still act as a "high-water mark". Meaning,
it's possible for the ZFS ARC to still exceed vm.kmem_size and cause a
panic. Setting the arc_max value explicitly in loader.conf probably
forces a hard limit, but I'm not sure. Can someone validate this?

I'm basing it on the fact that 1,296,826,368 exceeds 862,653,440, and
*probably* was attempting to exceed 1,380,245,504.

What I do know is that by setting the two parameters I provided, I can
bang on a RELENG_8 box and watch kstat.zfs.misc.arcstats.size never
exceed vfs.zfs.arc_max, and the box never panics. All our systems in
production, and my two at home, are tuned this way.

> I guess while we are all on the subject, I notice in the dmesg log the
> message:
>
> ZFS NOTICE: Prefetch is disabled by default if less than 4GB of RAM is
> present;
> to enable, add "vfs.zfs.prefetch_disable=0" to
> /boot/loader.conf.
>
> Is this anything I want to enable, that's like a big performance win, or do
> I just not have enough RAM to support it? Always been kinda curious about
> it, but so far I am liking ZFS, well outside of the machine panic.. LOL

And now you've touched on the *other* thing I've ranted about: how that
message isn't accurate (nor have previous incarnations). Rather than
explain it here, you can just read my blog entry about this message and
hopefully what I've written will suffice for an explanation (see bottom
half of the post).

http://koitsu.wordpress.com/2009/10/12/testing-out-freebsd-8-0-rc1/

As for "should I actually enable this?" -- I've been in a private
conversation with another FreeBSD user about this, and like me, he isn't
sure either. Where did this arbitrary limit come from, and why are we
being warned about it? Where can we read about the decision? This
circles back to what I said earlier -- if documentation can't be
provided, at bare minimum some explanations given in src/UPDATING would
be sufficient.

That's about all I can say on the matter. I do what I can, but the
ability to accomplish what's needed is mostly out of my control.

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
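
For anyone following the numbers quoted in this thread, the comparison
between the panic's "total allocated" figure and the two tunables can be
sketched in a few lines of Python. This is only an illustration of the
arithmetic: the function name and the report fields are hypothetical, the
inputs are the values quoted in the mail, and nothing here queries a live
system or says anything about how the kernel sizes kmem_map internally.

```python
def kmem_report(allocated, arc_max, kmem_size):
    """Compare a 'total allocated' figure from a kmem_map panic
    against vfs.zfs.arc_max and vm.kmem_size (all values in bytes).

    Hypothetical helper for illustration only.
    """
    return {
        # Did the allocation total go past the ARC limit?
        "exceeds_arc_max": allocated > arc_max,
        # Did it go past the kmem size itself?
        "exceeds_kmem_size": allocated > kmem_size,
        # How many bytes remained below vm.kmem_size.
        "bytes_below_kmem_size": kmem_size - allocated,
        # The same figure as a percentage of vm.kmem_size.
        "pct_of_kmem_size": round(100.0 * allocated / kmem_size, 1),
    }


# Values quoted in the mail above.
report = kmem_report(
    allocated=1296826368,   # from the panic message
    arc_max=862653440,      # vfs.zfs.arc_max
    kmem_size=1380245504,   # vm.kmem_size
)

for name, value in report.items():
    print(f"{name}: {value}")
```

With these inputs the allocated total is well above vfs.zfs.arc_max but
still below vm.kmem_size, which fits the reading that the system was
approaching the kmem limit rather than already past it. It does not
settle whether arc_max was acting as a hard limit.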