From: Martin Matuska <mm@FreeBSD.org>
Date: Thu, 21 Jul 2011 22:56:36 +0200
To: Ivan Voras
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS and large directories - caveat report

On 21. 7. 2011 21:40, Ivan Voras wrote:
> Thank you very much - now if only you took as much effort to explain
> the possible connection between your quote and my post as it took you
> to find the quote :)
>
> As others explained, ZFS definitely does not use fixed block sizes

I agree with that. I did some more digging and stumbled on this
opensolaris mailing list thread.

It starts here:
http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg35150.html

With an interesting user report here (nice summary):
http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg35189.html

Most of the time they blame the way the client utilities work with
directories (sorting, etc.).

Now to some more relevant options: it is also possible to do
metadata-only caching for a dataset:

  zfs set primarycache=metadata

The L2ARC can be set the same way:

  zfs set secondarycache=metadata

If I find some time, I can run some simulations on this to see how it
performs compared to primarycache=all.

The vdev read-ahead cache might also have a negative impact here (lots
of wasted IOPS), especially if the blocks are spread around the vdev.
We have followed what Illumos did, and the vdev cache is now disabled
by default.

I have updated the zfs-stats tool (ports: sysutils/zfs-stats) with the
latest version of Jason J. Hellenthal's arc_summary.pl; it gives a good
overview of the ZFS sysctls:
https://github.com/mmatuska/zfs-stats

-- 
Martin Matuska
FreeBSD committer
http://blog.vx.sk
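
For illustration, applied to a hypothetical dataset tank/data (the
dataset name is just an example), the caching settings above would look
like this:

  # cache only metadata in the ARC for this dataset
  zfs set primarycache=metadata tank/data
  # apply the same policy to the L2ARC
  zfs set secondarycache=metadata tank/data
  # verify the current values
  zfs get primarycache,secondarycache tank/data

Both properties accept the values all, none and metadata, and both
default to all.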
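
For the vdev read-ahead cache, the relevant knob on FreeBSD of this era
is, assuming the stock tunable names, the vfs.zfs.vdev.cache.size loader
tunable; a value of 0 (the new default) disables the cache:

  # /boot/loader.conf -- re-enable the vdev read-ahead cache
  # (10 MB per vdev here; 0 disables it, which is now the default)
  vfs.zfs.vdev.cache.size="10485760"

Hit/miss behavior can then be inspected through the vdev_cache kstats,
if they are present on the system:

  sysctl kstat.zfs.misc.vdev_cache_stats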
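
And assuming the port installs a zfs-stats script that takes an -a
("show everything") switch, a quick overview of the ARC, L2ARC and
related sysctls is one command away:

  # print all ZFS statistics in one report
  zfs-stats -a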