Date: Wed, 20 Feb 2013 19:28:28 +1100
From: Peter Jeremy
To: Kevin Day
Cc: FreeBSD Filesystems
Subject: Re: Improving ZFS performance for large directories
Message-ID: <20130220082828.GA44920@server.rulingia.com>
In-Reply-To: <19E0C908-79F1-43F8-899C-6B60F998D4A5@dragondata.com>

On 2013-Feb-19 14:10:47 -0600, Kevin Day wrote:
>Timing doing an "ls" in large directories 20 times, the first is the
>slowest, then all subsequent listings are roughly the same.

OK.  My testing was on large files rather than large amounts of metadata.

>Thinking I'd make the primary cache metadata only, and the secondary
>cache "all" would improve things,

This won't work as expected.  L2ARC only caches data coming out of the
ARC, so with the ARC set to cache metadata only there is never any
"data" in the ARC and hence none is ever evicted from ARC into L2ARC.
(A sketch of reverting those property changes is further down.)

> I wiped the device (SATA secure erase to make sure)

That's not necessary.  L2ARC doesn't survive reboots because all the
L2ARC "metadata" is held in ARC only.  This does mean that it takes
quite a while for the L2ARC to warm up following a reboot.

>Before adding the SSD, an "ls" in a directory with 65k files would
>take 10-30 seconds, it's now down to about 0.2 seconds.

That sounds quite good.

> There are roughly 29M files, growing at about 50k files/day.  We
>recently upgraded, and are now at 96 3TB drives in the pool.

That number of files isn't really excessive, but it sounds like your
workload has very low locality.
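For reference, putting the caching properties back after that experiment
is just a pair of property changes.  A rough sketch - "tank" is only a
placeholder for your real pool/dataset name:

  # let the ARC cache both data and metadata again, and let the L2ARC
  # take whatever gets evicted from it
  zfs set primarycache=all tank
  zfs set secondarycache=all tank

  # confirm the current settings
  zfs get primarycache,secondarycache tank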
At this stage, my suggestions are:
1) Disable atime if you don't need it and haven't already.  Otherwise
   file accesses are triggering metadata updates.
2) Increase vfs.zfs.arc_meta_limit - you're still getting more metadata
   misses than data misses.
3) Increase your ARC size (more RAM) - your pool is quite large compared
   to your RAM.
(There's a rough sketch of these knobs at the end of this mail.)

>It's a 250G drive, and only 22G is being used, and there's still a
>~66% miss rate.

That's 66% of the requests that missed in ARC.

> Is there any way to tell why more metadata isn't
>being pushed to the L2ARC?

ZFS treats writing to L2ARC very much as an afterthought.  L2ARC writes
are rate-limited by vfs.zfs.l2arc_write_{boost,max} and will be aborted
if they might interfere with a read.  I'm not sure how to improve that.

Since this is all generic ZFS, you might like to try asking on
zfs@lists.illumos.org as well.  Some of the experts there might have
some ideas.

-- 
Peter Jeremy
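P.S. A purely illustrative sketch of the tunables mentioned above; the
sizes and the "tank" dataset name are assumptions, not recommendations -
size them to your own RAM and workload:

  # /boot/loader.conf - ARC sizing (values here are made-up examples)
  vfs.zfs.arc_max="64G"
  vfs.zfs.arc_meta_limit="16G"

  # runtime sysctls - raise the L2ARC fill rate (defaults are 8MB each)
  sysctl vfs.zfs.l2arc_write_max=67108864
  sysctl vfs.zfs.l2arc_write_boost=134217728

  # stop atime updates on the affected dataset ("tank" is a placeholder)
  zfs set atime=off tank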