Date: Wed, 20 Feb 2013 10:07:11 -0600
From: Kevin Day <toasty@dragondata.com>
To: Peter Jeremy <peter@rulingia.com>
Cc: FreeBSD Filesystems <freebsd-fs@freebsd.org>
Subject: Re: Improving ZFS performance for large directories
Message-ID: <2F90562A-7F98-49A5-8431-4313961EFA70@dragondata.com>
In-Reply-To: <20130220082828.GA44920@server.rulingia.com>
References: <19DB8F4A-6788-44F6-9A2C-E01DEA01BED9@dragondata.com> <CAJjvXiE+8OMu_yvdRAsWugH7W=fhFW7bicOLLyjEn8YrgvCwiw@mail.gmail.com> <F4420A8C-FB92-4771-B261-6C47A736CF7F@dragondata.com> <20130201192416.GA76461@server.rulingia.com> <19E0C908-79F1-43F8-899C-6B60F998D4A5@dragondata.com> <20130220082828.GA44920@server.rulingia.com>
On Feb 20, 2013, at 2:28 AM, Peter Jeremy <peter@rulingia.com> wrote:

>> Thinking I'd make the primary cache metadata only, and the secondary
>> cache "all" would improve things,
>
> This won't work as expected. L2ARC only caches data coming out of ARC,
> so by setting ARC to cache metadata only, there's never any "data" in
> ARC and hence never any evicted from ARC to L2ARC.
>

That makes sense. I wasn't sure whether it was smart enough to recognize that was happening, but I guess it won't work. (Sketches of the commands and tunables discussed in this message are collected at the end, for the archives.)

>> I wiped the device (SATA secure erase to make sure)
>
> That's not necessary. L2ARC doesn't survive reboots because all the
> L2ARC "metadata" is in ARC only. This does mean that it takes quite
> a while for L2ARC to warm up following a reboot.
>

I was more concerned with the SSD's performance than with ZFS caring what was there. In a few cases the SSD was completely filled, which can slow things down (the controller is left with no free blocks to work with). Secure Erase resets it so the drive's controller knows EVERYTHING is really free. We have one model of SSD here that drops to about 5% of its original performance once every block on the drive has been written to. We're not using that model anymore, but I still like to be sure. :)

>> There are roughly 29M files, growing at about 50k files/day. We
>> recently upgraded, and are now at 96 3TB drives in the pool.
>
> That number of files isn't really excessive, but it sounds like your
> workload has very low locality. At this stage, my suggestions are:
> 1) Disable atime if you don't need it & haven't already.
>    Otherwise file accesses are triggering metadata updates.
> 2) Increase vfs.zfs.arc_meta_limit.
>    You're still getting more metadata misses than data misses.
> 3) Increase your ARC size (more RAM).
>    Your pool is quite large compared to your RAM.
>

Yeah, I think the locality is basically zero. It's multiple rsyncs running across the entire filesystem repeatedly. Each directory is only going to be touched once per pass, so cache isn't going to help much unless we get lucky and two rsyncs come in back-to-back, one chasing the other.

Atime is already off globally - nothing we use needs it. We are at the limit for RAM for this motherboard, so any further increases are going to be quite expensive.

>> Is there any way to tell why more metadata isn't
>> being pushed to the L2ARC?
>
> ZFS treats writing to L2ARC very much as an afterthought. L2ARC writes
> are rate limited by vfs.zfs.l2arc_write_{boost,max} and will be aborted
> if they might interfere with a read. I'm not sure how to improve it.
>

At this stage there are zero writes being done to the L2ARC, so perhaps the problem is that with so much pressure on the ARC metadata, nothing ever gets a chance to be pushed into the L2ARC. I'm going to try increasing the metadata limit on the ARC, but beyond that there's not a great deal more I can do.

> Since this is all generic ZFS, you might like to try asking on
> zfs@lists.illumos.org as well. Some of the experts there might have
> some ideas.

I will try that, thanks!

-- Kevin
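For reference, here is roughly what the cache-property experiment above looks like as commands. This is only a sketch; "tank" stands in for the real pool name:

    # ARC (primary cache) restricted to metadata; L2ARC (secondary) left at "all".
    # As Peter explains, this combination starves the L2ARC: it is fed only
    # from blocks being evicted out of the ARC, so data that never enters
    # the ARC can never reach the L2ARC.
    zfs set primarycache=metadata tank
    zfs set secondarycache=all tank

    # Back to the default:
    zfs set primarycache=all tank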
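The SATA Secure Erase can be issued from FreeBSD with camcontrol(8), assuming a version new enough to have the "security" subcommand; the device name and password below are placeholders:

    # Set a temporary user password, then issue ATA SECURITY ERASE UNIT.
    # The erase clears the password again and marks every block free,
    # so the controller's view of its spare-block pool is back to fresh.
    camcontrol security ada4 -U user -s tmppass
    camcontrol security ada4 -U user -e tmppass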
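The tunables from Peter's list go in /boot/loader.conf (most are also adjustable at runtime via sysctl). The values here are purely illustrative and would need to be sized against the machine's actual RAM, not copied:

    vfs.zfs.arc_meta_limit="17179869184"   # 16 GiB of ARC for metadata;
                                           # the default is arc_max / 4
    vfs.zfs.l2arc_write_max="67108864"     # max bytes fed to L2ARC per
                                           # feed interval (1 sec default)
    vfs.zfs.l2arc_write_boost="134217728"  # higher feed rate until ARC warms up

Disabling atime (already done globally here) is a per-dataset property:

    zfs set atime=off tank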
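Whether any of this helps should show up in the ARC counters under kstat.zfs.misc.arcstats, something like:

    # Metadata pressure: how close arc_meta_used is to arc_meta_limit.
    sysctl kstat.zfs.misc.arcstats.arc_meta_used \
           kstat.zfs.misc.arcstats.arc_meta_limit

    # Is anything being written into, and then served from, the L2ARC?
    sysctl kstat.zfs.misc.arcstats.l2_write_bytes \
           kstat.zfs.misc.arcstats.l2_size \
           kstat.zfs.misc.arcstats.l2_hits \
           kstat.zfs.misc.arcstats.l2_misses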