Date:      Sat, 18 Apr 2009 17:17:00 -0400
From:      Ben Kelly <ben@wanderview.com>
To:        Alexander Leidinger <Alexander@Leidinger.net>
Cc:        current@freebsd.org, fs@freebsd.org
Subject:   Re: ZFS: unlimited arc cache growth?
Message-ID:  <6535218D-6292-4F84-A8BA-FFA9B2E47F80@wanderview.com>
In-Reply-To: <20090418094821.00002e67@unknown>
References:  <20090417145024.205173ighmwi4j0o@webmail.leidinger.net> <D2B1B82F-AFCF-4161-BB9E-316EC976E360@wanderview.com> <20090418094821.00002e67@unknown>

On Apr 18, 2009, at 3:48 AM, Alexander Leidinger wrote:
> On Fri, 17 Apr 2009 10:04:15 -0400 Ben Kelly <ben@wanderview.com> wrote:
> I haven't tried killing pkgdb and looking at the stats, but on the idle
> machine (rebooted after the panic and 5h of no use by me... the machine
> fetches my mails, has a webmail + mysql + imap interface and is a
> fileserver) the size is double my max value. Again, there's no real
> load at this time, just fetching my mails (most traffic from the
> FreeBSD lists) and a little bit of SpamAssassin filtering of them. When
> I logged in this morning, the machine had been rebooted about 5h
> earlier by a panic and no FS traffic was going on (100% idle).

From looking at the code, it's not too surprising that it settles out
at 2x your zfs_arc_max tunable.  It looks like under normal conditions
the arc_reclaim_thread only tries to evict buffers once arc_size plus
the ghost buffer sizes exceeds twice the value of arc_c:

		if (needfree ||
		    (2 * arc_c < arc_size +
		    arc_mru_ghost->arcs_size + arc_mfu_ghost->arcs_size))
			arc_adjust();

(The needfree flag is only set when the system's lowmem event fires.)
The arc_reclaim_thread checks this condition once a second.  Perhaps
this limit should be a tunable.  Also, it might make sense to have a
separate limit check for the ghost buffers.
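
Both ideas could be expressed along these lines.  This is just an
untested sketch; the zfs_arc_overflow_shift tunable and the separate
ghost check are hypothetical, not existing code:

		uint64_t ghost_size = arc_mru_ghost->arcs_size +
		    arc_mfu_ghost->arcs_size;

		/*
		 * Trigger eviction once the cache exceeds its target by
		 * a tunable factor (a shift of 1 matches today's
		 * 2 * arc_c behavior), or once the ghost lists alone
		 * outgrow the target.
		 */
		if (needfree ||
		    (arc_size + ghost_size >
		    (arc_c << zfs_arc_overflow_shift)) ||
		    (ghost_size > arc_c))
			arc_adjust();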

I was able to reproduce similar arc_size growth on my machine by
running my rsync backup.  After instrumenting the code it appeared
that buffers were not being evicted because they were "indirect" and
had been in the cache for less than a second.  The "indirect" flag is
set based on the on-disk level field.  When you see the
arcstats.evict_skip sysctl going up, this is probably what is
happening.  The comments in the code say this check is only for
prefetch data, but it also triggers for indirect buffers.  I'm
hesitant to change it to affect only prefetch buffers.  Perhaps we
could make the timeout a tunable, or make it dynamic based on how far
the cache is over its target.
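
Roughly, the skip check in arc_evict() looks like the fragment below
(paraphrasing from memory, so treat the details as approximate).  The
scaling of the grace period is my hypothetical addition, not existing
code:

		int lifespan = arc_min_prefetch_lifespan; /* 1 sec today */

		/*
		 * Hypothetical: shrink the grace period as arc_size
		 * overshoots its target.
		 */
		if (arc_size > 2 * arc_c)
			lifespan = 0;		/* way over: evict now */
		else if (arc_size > arc_c)
			lifespan /= 2;		/* mildly over: halve it */

		/* Prefetch *and* indirect buffers get the grace period. */
		if ((ab->b_flags & (ARC_PREFETCH | ARC_INDIRECT)) &&
		    lbolt - ab->b_arc_access < lifespan) {
			skipped++;	/* shows up as arcstats.evict_skip */
			continue;
		}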

After the rsync completed, my machine slowly evicted buffers until it
was back down to about twice arc_c.  There was one case, however,
where I saw it stop at about four times arc_c.  In that case it was
failing to evict buffers due to a missed lock.  It's not clear yet
whether it was a buffer lock or a hash lock.  When this happens you'll
see the arcstats.mutex_missed sysctl go up.  I'm going to see if I can
track down why this is occurring under idle conditions.  That seems
suspicious to me.
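
For reference, that counter comes from the trylock pattern in
arc_evict(); roughly (again paraphrased from memory):

		hash_lock = HDR_LOCK(ab);
		have_lock = MUTEX_HELD(hash_lock);
		if (have_lock || mutex_tryenter(hash_lock)) {
			/* ... evict the buffer, then drop the lock ... */
		} else {
			missed += 1; /* shows up as arcstats.mutex_missed */
		}

Since it never blocks on the hash lock, even a brief hold by another
thread leaves the buffer in the cache until the next pass.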

Hope that helps.  I'll let you know if I find anything else.

- Ben


