Date: Sat, 18 Apr 2009 17:17:00 -0400
From: Ben Kelly <ben@wanderview.com>
To: Alexander Leidinger <Alexander@Leidinger.net>
Cc: current@freebsd.org, fs@freebsd.org
Subject: Re: ZFS: unlimited arc cache growth?
Message-ID: <6535218D-6292-4F84-A8BA-FFA9B2E47F80@wanderview.com>
In-Reply-To: <20090418094821.00002e67@unknown>
References: <20090417145024.205173ighmwi4j0o@webmail.leidinger.net> <D2B1B82F-AFCF-4161-BB9E-316EC976E360@wanderview.com> <20090418094821.00002e67@unknown>
On Apr 18, 2009, at 3:48 AM, Alexander Leidinger wrote:
> On Fri, 17 Apr 2009 10:04:15 -0400 Ben Kelly <ben@wanderview.com>
> wrote:
>
> I haven't tried killing pkgdb and looking at the stats, but on the
> idle machine (reboot after the panic and 5h of no use by me... the
> machine fetches my mails, has a webmail + mysql + imap interface and
> is a fileserver) the size is double of my max value. Again there's
> no real load at this time, just fetching my mails (most traffic from
> the FreeBSD lists) and a little bit of SpamAssassin filtering of
> them. When I logged in this morning the machine was rebooted about
> 5h ago by a panic and no FS traffic was going on (100% idle).

From looking at the code, it's not too surprising it settles out at 2x
your zfs_arc_max tunable.  It looks like under normal conditions the
arc_reclaim_thread only tries to evict buffers when the arc_size plus
any ghost buffers is twice the value of arc_c:

  if (needfree || (2 * arc_c < arc_size + arc_mru_ghost->arcs_size +
      arc_mfu_ghost->arcs_size))
          arc_adjust();

(The needfree flag is only set when the system lowmem event is fired.)
The arc_reclaim_thread checks this once a second.  Perhaps this limit
should be a tunable.  Also, it might make sense to have a separate
limit check for the ghost buffers.

I was able to reproduce similar arc_size growth on my machine by
running my rsync backup.  After instrumenting the code it appeared
that buffers were not being evicted because they were "indirect" and
had been in the cache less than a second.  The "indirect" flag is set
based on the on-disk level field.  When you see the
arcstats.evict_skip sysctl going up, this is probably what is
happening.  The comments in the code say this check is only for
prefetch data, but it also triggers for indirect buffers, so I'm
hesitant to make it really only affect prefetch buffers.  Perhaps we
could make the timeout a tunable or dynamic based on how far the
cache is over its target.

After the rsync completed, my machine slowly evicted buffers until it
was back down to about twice arc_c.  There was one case, however,
where I saw it stop at about four times arc_c.  In that case it was
failing to evict buffers due to a missed lock.  It's not clear yet
whether it was a buffer lock or a hash lock.  When this happens you'll
see the arcstats.mutex_missed sysctl go up.  I'm going to see if I can
track down why this is occurring under idle conditions.  That seems
suspicious to me.

Hope that helps.  I'll let you know if I find anything else.

- Ben
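[Editorial sketch, appended for illustration: the reclaim condition described
above, with the hard-coded factor of 2 pulled out into a variable.  The name
arc_overflow_factor is hypothetical and is not an existing kernel knob or
loader tunable; arc_c, arc_size, arc_mru_ghost, arc_mfu_ghost, needfree and
arc_adjust() are the symbols quoted in the message.]

  /*
   * Sketch only, not the actual arc.c code.  Eviction is attempted
   * once a second, but only starts once the ARC plus its ghost lists
   * exceed arc_overflow_factor * arc_c (or a lowmem event sets
   * needfree), which is why the cache can settle at roughly 2x arc_c.
   */
  static int arc_overflow_factor = 2;  /* hypothetical tunable */

  static void
  arc_reclaim_check(void)
  {
          uint64_t total;

          total = arc_size + arc_mru_ghost->arcs_size +
              arc_mfu_ghost->arcs_size;

          if (needfree ||
              (uint64_t)arc_overflow_factor * arc_c < total)
                  arc_adjust();
  }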