Date:      Tue, 04 Oct 2011 20:13:02 +1100
From:      Aristedes Maniatis <ari@ish.com.au>
To:        freebsd-fs@freebsd.org
Subject:   Re: vm.kmem_size_scale recommendation for ZFS
Message-ID:  <4E8ACE1E.4060608@ish.com.au>
In-Reply-To: <4E8A8740.100@ish.com.au>
References:  <4E8A8740.100@ish.com.au>

> Jeremy Chadwick wrote:
> There are fixes in 8.2-STABLE that address this problem (there were multiple fixes put in place). So if you're already experiencing kmem map exhaustion, I strongly recommend you upgrade to 8.2-STABLE and then adjust one single tunable:
>
> vfs.zfs.arc_max


I'm using freebsd-update on all our servers (there are about a dozen with this type of setup) so moving to -STABLE isn't simple for us. It looks like we are stuck on 8.2-RELEASE plus security patches for now. With no 8.3 on the near horizon, we need to patch things up for the short term.

Also, it looks like the 8.2 errata page has nothing about the issues which have been fixed. [1] That would have made it more useful. :-) Is there a summary of the "multiple fixes"?


> You didn't disclose how much RAM your machine has, nor what other daemons are running on it, so it's very hard to give you an estimate.

Sorry, yes. This particular machine has 24GB. Some others have as little as 8GB. All are 64-bit, of course. The workloads of the machines differ (mysql, tomcat, httpd, etc.) and I understand that will affect tuning. But I'm less interested in tuning here and more interested in making sure the server doesn't crash.


> I tend to tell people to set vfs.zfs.arc_max to about 60% of their memory. E.g. if the machine has 8GB RAM, set vfs.zfs.arc_max="5120M".
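
A minimal Python sketch of the rule of thumb quoted above, for turning an amount of RAM into a starting vfs.zfs.arc_max value. The helper names `suggested_arc_max_mib` and `loader_conf_line` are hypothetical and exist only for illustration; the 60% default is taken from the quoted advice. Note that the quoted example of 5120M on 8GB is actually 62.5%, so "about 60%" is an approximation rather than an exact formula.

```python
def suggested_arc_max_mib(ram_gib, fraction=0.60):
    """Return a starting vfs.zfs.arc_max in MiB (hypothetical helper).

    ram_gib  -- installed RAM in GiB
    fraction -- share of RAM to allow the ARC; 0.60 follows the quoted advice
    """
    if ram_gib <= 0:
        raise ValueError("ram_gib must be positive")
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    # Round down to whole MiB so the value never exceeds the chosen share.
    return int(ram_gib * 1024 * fraction)


def loader_conf_line(ram_gib, fraction=0.60):
    """Format the value the way it would appear in /boot/loader.conf."""
    return 'vfs.zfs.arc_max="%dM"' % suggested_arc_max_mib(ram_gib, fraction)


if __name__ == "__main__":
    # The two machine sizes mentioned in this thread.
    for ram in (8, 24):
        print("%2d GiB RAM: %s" % (ram, loader_conf_line(ram)))
```

This only produces a starting point. As the quoted advice goes on to say, ZFS can use more memory than the ARC limit, so the value should be raised gradually and observed under real load.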

That is different to the official wiki [2], which suggests that 8.2 is completely self-tuning and has a reasonable default value. On the 24GB machine I'm seeing arc_max as 92% of physical memory and vm.kmem_size as 97% of physical memory. From everything I've read, that sounds reasonable, since the ARC should automatically shrink as other applications use up memory, so ZFS should not be a reason for the machine swapping RAM to disk.

hw.physmem:      25744949248
vm.kmem_size:    24956616704  (97%)
hw.realmem:      26843545600 (104% !)
vfs.zfs.arc_max: 23882874880 (92%)
vm.kmem_size_scale: 1

Curious that realmem is exactly 1GB larger than the actual memory in the machine, and that size_scale = 1 means 97% rather than 100%, but I guess that is some allowance for other kernel buffers.
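
The percentages above can be re-derived from the raw sysctl values with a short Python sketch. The numbers are copied from this message; the function name `percent_of_physmem` is hypothetical. Worked through, arc_max comes out at roughly 92.8% of hw.physmem (quoted above as 92%), hw.realmem is exactly 25 GiB, and in these figures arc_max sits exactly 1 GiB below vm.kmem_size.

```python
GIB = 1024 ** 3

# Values as reported in this message (bytes).
sysctls = {
    "hw.physmem": 25744949248,
    "vm.kmem_size": 24956616704,
    "hw.realmem": 26843545600,
    "vfs.zfs.arc_max": 23882874880,
}


def percent_of_physmem(name):
    """Return the named sysctl as a percentage of hw.physmem."""
    return 100.0 * sysctls[name] / sysctls["hw.physmem"]


if __name__ == "__main__":
    for name in ("vm.kmem_size", "hw.realmem", "vfs.zfs.arc_max"):
        print("%-16s %12d  (%.1f%% of hw.physmem)"
              % (name, sysctls[name], percent_of_physmem(name)))

    # Differences that stand out in these particular figures.
    gap = sysctls["vm.kmem_size"] - sysctls["vfs.zfs.arc_max"]
    print("kmem_size - arc_max = %.2f GiB" % (gap / GIB))
    print("hw.realmem          = %.2f GiB" % (sysctls["hw.realmem"] / GIB))
```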



>  We use this on all of our servers with 8GB, including ones that run mysqld with some tunings. Example machine in question:
>
>   PID USERNAME THR PRI NICE  SIZE   RES STATE  C   TIME  WCPU COMMAND
> 60030 mysql     16  76    0  785M  294M sigwai 0  15:38 0.00% [mysqld]
>
> And relevant bits from /boot/loader.conf:
>
> kern.maxdsiz="2560M"
> kern.dfldsiz="2560M"
> kern.maxssiz="256M"
> vfs.zfs.arc_max="5120M"
>
> You need to keep in mind that ZFS can still use more than what you limit the ARC to. The mailing list has explanations regarding this; I believe it has to do with fragmentation or something like that. So don't go thinking you're being smart by setting it to something like "7500M"; there is a very good chance you will still experience the same problem. So my advice is to "start small" and work your way up after multiple weeks (not days!) of utilising the filesystems on ZFS, to really stress the ARC.

This system recently had a kernel lockup after several months without having its settings changed. So some particular workload caused the problem. Since it happened at 1am, I expect that backup scripts touching lots of files were the factor that caused the issue to appear.


> Also be aware there may still be problems with applications that use sendfile(2) -- known programs which use this are ftpd (not adjustable), Apache (you can disable it), nginx (I think you can disable it), and Samba (you can disable it). There are some sendfile(2) fixes which were committed, but I'm not 100% sure if all situations have been accounted for.

My understanding is that the sendfile issue is a performance one only, not stability. Is that correct?


But back to the original question. Pawel recommends in his one-year-old blog entry that kmem should be 150% of actual RAM (I don't really understand why, but he is the expert). Andriy committed scale=1 earlier this year, which is more like 97% of actual RAM. Which is correct?
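
To put the two candidates side by side for the 24GB machine, here is a small Python sketch. It assumes "150% of actual RAM" is measured against hw.physmem (an assumption; the blog entry may have meant installed RAM), and the helper name `kmem_from_percent` is hypothetical. It does not say which setting is right; it only shows how far apart the two values are.

```python
GIB = 1024 ** 3

PHYSMEM = 25744949248        # hw.physmem from this message (bytes)
OBSERVED_KMEM = 24956616704  # vm.kmem_size with vm.kmem_size_scale=1


def kmem_from_percent(physmem, percent):
    """Return a kmem size in bytes as a whole-number percentage of physmem."""
    return physmem * percent // 100


if __name__ == "__main__":
    recommended = kmem_from_percent(PHYSMEM, 150)
    print("150%% of hw.physmem : %d bytes (%.1f GiB)"
          % (recommended, recommended / GIB))
    print("observed, scale=1  : %d bytes (%.1f GiB)"
          % (OBSERVED_KMEM, OBSERVED_KMEM / GIB))
    print("difference         : %.1f GiB"
          % ((recommended - OBSERVED_KMEM) / GIB))
```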

I understand how ARC works, but I don't understand why kmem is tunable in ordinary operation or why one value should be preferred.


Thanks
Ari


[1] http://www.freebsd.org/releases/8.2R/errata.html
[2] http://wiki.freebsd.org/ZFSTuningGuide

-- 
-------------------------->
Aristedes Maniatis
ish
http://www.ish.com.au
Level 1, 30 Wilson Street Newtown 2042 Australia
phone +61 2 9550 5001   fax +61 2 9550 4001
GPG fingerprint CBFB 84B4 738D 4E87 5E5C  5EFA EF6A 7D2E 3E49 102A


