Date:      Tue, 3 May 2011 05:53:16 -0700
From:      Jeremy Chadwick <freebsd@jdc.parodius.com>
To:        Olaf Seibert <O.Seibert@cs.ru.nl>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: Automatic reboot doesn't reboot
Message-ID:  <20110503125316.GA17085@icarus.home.lan>
In-Reply-To: <20110503123015.GZ6733@twoquid.cs.ru.nl>
References:  <20110502143230.GW6733@twoquid.cs.ru.nl> <20110503092113.GA39704@icarus.home.lan> <20110503100854.GY6733@twoquid.cs.ru.nl> <20110503122052.GA13811@icarus.home.lan> <20110503123015.GZ6733@twoquid.cs.ru.nl>

On Tue, May 03, 2011 at 02:30:15PM +0200, Olaf Seibert wrote:
> On Tue 03 May 2011 at 05:20:52 -0700, Jeremy Chadwick wrote:
> > To be on the safe side, pick something that's small at first, then work
> > your way up.  You'll need probably 1+ weeks of heavy ZFS I/O between
> > tests (e.g. don't change the tunable, reboot, then 4 hours later declare
> > the new (larger) value as stable).
> 
> Ah, that's important: so far it seemed to me that a *too small* value
> (for all various tunables) would cause problems, but now you're saying
> that *too large* is the problem (at least for vfs.zfs.arc_max)! 

Too small = not-so-great performance (less data in the ARC means more
reads from the disks, and disks are slower than RAM :-) ).

Too large = increased risk of kmem exhaustion panic.

> This machine has mixed loads; from time to time somebody starts a big
> job with lots of I/O, and in between it is much more modestly loaded.

I would recommend starting small (maybe 1/3rd of your physical RAM?) and
increasing from there.  You can try the opposite technique too -- start
large (e.g. 3/4ths of RAM) and wait for a panic.  Personally, I'd rather
have a stable system with less memory used for ARC than a system with
more memory for ARC that could panic.
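
To make the two starting points concrete, here's a rough Python sketch.
The function names and the rounding down to whole megabytes are my own
assumptions for illustration; the only thing taken from practice is the
vfs.zfs.arc_max="...M" tunable syntax, which would normally go in
/boot/loader.conf.

```python
def arc_max_candidates(phys_ram_mb):
    """Return (conservative, aggressive) arc_max starting values in MB.

    Hypothetical helper: the fractions are the rough guesses from the
    text above, not a formula taken from the ZFS code.
    """
    conservative = phys_ram_mb // 3       # "start small": about 1/3 of RAM
    aggressive = (phys_ram_mb * 3) // 4   # "start large": about 3/4 of RAM
    return conservative, aggressive


def loader_conf_line(arc_max_mb):
    """Format a value in MB as a vfs.zfs.arc_max tunable line."""
    return 'vfs.zfs.arc_max="%dM"' % arc_max_mb


if __name__ == "__main__":
    small, large = arc_max_candidates(8192)   # an 8GB machine
    print(loader_conf_line(small))
    print(loader_conf_line(large))
```

Treat the numbers as a place to begin testing, nothing more; what else
is running on the box still decides where you end up.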

Sadly there's no 100% reliable way to calculate what's "ideal".  For
example I might use a smaller value than 6144M on a machine where mysqld
is tuned to utilise lots of RAM.  There's a balancing act that goes on
that takes some time to figure out.

For example, on the FreeBSD ZFS-backed NFS filer on our network, I ran
with arc_max at roughly 3/4ths of RAM for quite some time (we're talking
4-5 months).  Then suddenly one day I noticed the client machines were
complaining about NFS timeouts, etc...  Got on the filer, and lo and
behold: kmem exhaustion.  I decreased arc_max by about 1024M and it's
been fine since.

There's a lot of evolution that's occurred in the FreeBSD ZFS kernel
code over the years too.  Originally arc_max was a "high-water mark" of
some sort, but the code was changed to make it a hard limit as much as
it could be.  Then some edge cases were found where the ARC could still
exceed the maximum, so those were fixed, etc...  Tracking all the
changes is very difficult and requires an admin to follow commits.  (I
became very frustrated/irate at having to do so, wishing that there was
more of a "state of ZFS" announcement sent out every so often so
users/admins would know what's changed and could adjust things
appropriately.)  That's just the nature of the beast.

> > So for example on an 8GB RAM machine, I might recommend starting with
> > vfs.zfs.arc_max="4096M" and let that run for a while.  If you find your
> > "Wired" value in top(1) remains fairly constant after a week or so of
> > heavy I/O, consider bumping up the value a bit more (say 4608M).
> 
> I'll do just that.
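
The "bump it if Wired stays constant" routine can be sketched in Python
like this.  Everything here is an assumption for illustration: the
function name, the idea of collecting Wired samples by hand from top(1)
over the week, and especially the 5% tolerance for "fairly constant".
The 512M step just mirrors the 4096M -> 4608M example.

```python
def next_arc_max(current_mb, wired_samples_mb, step_mb=512, tolerance=0.05):
    """Suggest the next arc_max value in MB.

    wired_samples_mb: "Wired" readings (in MB) noted during a week or so
    of heavy I/O.  If they stayed within `tolerance` of their peak, bump
    arc_max by step_mb; otherwise (or with no data) keep the current value.
    """
    if not wired_samples_mb:
        return current_mb
    lowest, highest = min(wired_samples_mb), max(wired_samples_mb)
    stable = (highest - lowest) <= tolerance * highest
    return current_mb + step_mb if stable else current_mb


if __name__ == "__main__":
    # Wired hovered around 5GB all week, so try the next step up.
    print(next_arc_max(4096, [5000, 5050, 5100]))
    # Wired was still climbing, so leave the tunable alone for now.
    print(next_arc_max(4096, [4000, 5000]))
```

A flat Wired value is only a hint that you have headroom; it doesn't
guarantee the larger setting is safe, so the same week or more of heavy
I/O applies after every change.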

Let us know how things turn out.  Follow-ups that indicate things are
working are just as important as initial mails stating things aren't,
especially for someone searching the Web to try to find an answer to
what this kmem thing is all about.  :-)

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |



