Date:      Wed, 20 Feb 2013 10:07:11 -0600
From:      Kevin Day <toasty@dragondata.com>
To:        Peter Jeremy <peter@rulingia.com>
Cc:        FreeBSD Filesystems <freebsd-fs@freebsd.org>
Subject:   Re: Improving ZFS performance for large directories
Message-ID:  <2F90562A-7F98-49A5-8431-4313961EFA70@dragondata.com>
In-Reply-To: <20130220082828.GA44920@server.rulingia.com>
References:  <19DB8F4A-6788-44F6-9A2C-E01DEA01BED9@dragondata.com> <CAJjvXiE+8OMu_yvdRAsWugH7W=fhFW7bicOLLyjEn8YrgvCwiw@mail.gmail.com> <F4420A8C-FB92-4771-B261-6C47A736CF7F@dragondata.com> <20130201192416.GA76461@server.rulingia.com> <19E0C908-79F1-43F8-899C-6B60F998D4A5@dragondata.com> <20130220082828.GA44920@server.rulingia.com>

On Feb 20, 2013, at 2:28 AM, Peter Jeremy <peter@rulingia.com> wrote:
>> Thinking I'd make the primary cache metadata only, and the secondary
>> cache "all" would improve things,
>
> This won't work as expected.  L2ARC only caches data coming out of ARC
> so by setting ARC to cache metadata only, there's never any "data" in
> ARC and hence never any evicted from ARC to L2ARC.
>

That makes sense. I wasn't sure whether it was smart enough to realize this
was happening, but I guess it won't work.
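
For anyone following along, the experiment I'd tried was essentially the
following, with "tank" standing in for the real pool name - and per your
explanation, primarycache has to go back to "all" before data blocks will
ever make it into the L2ARC:

    # what I had tried: keep only metadata in ARC, everything in L2ARC
    # ("tank" is a placeholder for the actual pool)
    zfs set primarycache=metadata tank
    zfs set secondarycache=all tank

    # reverting, since L2ARC is only fed from blocks evicted out of ARC
    zfs set primarycache=all tank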


>> I wiped the device (SATA secure erase to make sure)
>
> That's not necessary.  L2ARC doesn't survive reboots because all the
> L2ARC "metadata" is in ARC only.  This does mean that it takes quite
> a while for L2ARC to warm up following a reboot.
>

I was more concerned with the SSD's performance than with ZFS caring what
was there. A few cases completely filled the SSD, which can slow things
down (there are no free blocks for it to use). Secure Erase will reset it
so the drive's controller knows EVERYTHING is really free. We have one
model of SSD here that will drop to about 5% of its original performance
after every block on the drive has been written to once. We're not using
that model anymore, but I still like to be sure. :)
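
In case it's useful to anyone else, the wipe was roughly the sequence
below on FreeBSD. The device name and password are placeholders, the
exact flags can vary by release (check camcontrol(8)), and of course
this destroys everything on the drive:

    # set a temporary ATA security password, then issue SECURITY ERASE UNIT
    camcontrol security ada6 -U user -s temppass
    # -y skips the interactive "are you sure" confirmation
    camcontrol security ada6 -U user -e temppass -y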

>> There are roughly 29M files, growing at about 50k files/day. We
>> recently upgraded, and are now at 96 3TB drives in the pool.
>
> That number of files isn't really excessive but it sounds like your
> workload has very low locality.  At this stage, my suggestions are:
> 1) Disable atime if you don't need it & haven't already.
>   Otherwise file accesses are triggering metadata updates.
> 2) Increase vfs.zfs.arc_meta_limit
>   You're still getting more metadata misses than data misses
> 3) Increase your ARC size (more RAM)
>   Your pool is quite large compared to your RAM.
>

Yeah, I think the locality is basically zero. It's multiple rsyncs
running across the entire filesystem repeatedly. Each directory is only
going to be touched once per pass through, so that isn't really going to
benefit much from cache unless we get lucky and two rsyncs come in
back-to-back where one is chasing another.

Atime is already off globally - nothing we use needs it. We're already at
the RAM limit for this motherboard, so any further increase is going to be
quite expensive.
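
For completeness, the atime setting is just the pool-wide property, and
the ARC-size knob from suggestion (3) is the usual loader tunable - but as
above, pushing that any higher really means buying more RAM. The pool name
and the size below are placeholders, not our actual values:

    # atime already off pool-wide
    zfs set atime=off tank

    # /boot/loader.conf - upper bound on ARC size (example value only)
    vfs.zfs.arc_max="96G"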

>
>> Is there any way to tell why more metadata isn't
>> being pushed to the L2ARC?
>
> ZFS treats writing to L2ARC very much as an afterthought.  L2ARC writes
> are rate limited by vfs.zfs.l2arc_write_{boost,max} and will be aborted
> if they might interfere with a read.  I'm not sure how to improve it.
>

At this stage there are simply zero writes going to the L2ARC, so perhaps
the problem is that with so much pressure on the ARC metadata, nothing
ever gets a chance to be pushed out to the L2ARC. I'm going to try
increasing the metadata limit on the ARC, but beyond that there's not a
great deal more I can do.
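
Concretely, what I'm planning to poke at looks roughly like this. The
values are starting guesses rather than recommendations, and depending on
the release arc_meta_limit may only be settable from /boot/loader.conf:

    # how much metadata the ARC holds right now, and whether anything is
    # landing in L2ARC at all
    sysctl kstat.zfs.misc.arcstats.arc_meta_used
    sysctl kstat.zfs.misc.arcstats | grep l2_

    # /boot/loader.conf - raise the metadata limit and let the L2ARC feed
    # run faster than the default (example values only)
    vfs.zfs.arc_meta_limit="32G"
    vfs.zfs.l2arc_write_max="67108864"
    vfs.zfs.l2arc_write_boost="134217728"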

> Since this is all generic ZFS, you might like to try asking on
> zfs@lists.illumos.org as well.  Some of the experts there might have
> some ideas.

I will try that, thanks!

-- Kevin



