From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 18:15:52 2011
From: Artem Belevich <artemb@gmail.com>
To: Ivan Voras
Cc: freebsd-fs@freebsd.org
Date: Thu, 21 Jul 2011 11:15:50 -0700
Subject: Re: ZFS and large directories - caveat report

On Thu, Jul 21, 2011 at 9:38 AM, Ivan Voras wrote:
> On 21 July 2011 17:50, Freddie Cash wrote:
>> On Thu, Jul 21, 2011 at 8:45 AM, Ivan Voras wrote:
>>>
>>> Is there an equivalent of the UFS dirhash memory setting for ZFS?
>>> (i.e. the size of the metadata cache)
>>
>> vfs.zfs.arc_meta_limit
>>
>> This sets the amount of ARC that can be used for metadata. The
>> default is 1/8th of ARC, I believe. This setting lets you use
>> "primarycache=all" (store metadata and file data in ARC) but then
>> tune how much is used for each.
>>
>> Not sure if that will help in your case or not, but it's a sysctl
>> you can play with.
>
> I don't think that it works, or at least it is not as efficient as
> dirhash:
>
> www:~> sysctl -a | grep meta
> kern.metadelay: 28
> vfs.zfs.mfu_ghost_metadata_lsize: 129082368
> vfs.zfs.mfu_metadata_lsize: 116224
> vfs.zfs.mru_ghost_metadata_lsize: 113958912
> vfs.zfs.mru_metadata_lsize: 16384
> vfs.zfs.anon_metadata_lsize: 0
> vfs.zfs.arc_meta_limit: 322412800
> vfs.zfs.arc_meta_used: 506907792
> kstat.zfs.misc.arcstats.demand_metadata_hits: 4471705
> kstat.zfs.misc.arcstats.demand_metadata_misses: 2110328
> kstat.zfs.misc.arcstats.prefetch_metadata_hits: 27
> kstat.zfs.misc.arcstats.prefetch_metadata_misses: 51
>
> arc_meta_used is nearly 500 MB, which should be enough even in this
> case. With filenames of 32 characters, all the filenames alone for
> 130,000 files in a directory take about 4 MB - I doubt ZFS
> introduces so much extra metadata that it doesn't fit in 500 MB.

For what it's worth, 500K files in one directory seem to work
reasonably well on my box running a few-weeks-old 8-STABLE (quad
core, 8 GB RAM, ~6 GB ARC), with a ZFSv28 pool on a 2-drive mirror
plus a 50 GB L2ARC.
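A side note on the arc_meta_limit discussion above: judging by the
sysctl output, arc_meta_used is already well past arc_meta_limit, so
the metadata cap is clearly in play. On this vintage of FreeBSD the
limit is, I believe, a boot-time tunable (read-only at runtime), so
raising it takes an entry in /boot/loader.conf and a reboot. A
minimal sketch -- the value is just an example, size it to your RAM:

  # /boot/loader.conf: allow up to ~1 GB of ARC for metadata (example value)
  vfs.zfs.arc_meta_limit="1073741824"

Anyway, here are the numbers from my box: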
$ time perl -e 'use Fcntl; for $f (1..500000) {sysopen(FH,"f$f",O_CREAT); close(FH);}' >| /dev/null
perl -e  >| /dev/null  2.26s user 39.17s system 96% cpu 43.156 total

$ time find . | wc -l
500001
find .  0.16s user 0.33s system 99% cpu 0.494 total

$ time find . -ls | wc -l
500001
find . -ls  1.93s user 12.13s system 96% cpu 14.643 total

$ time find . | xargs -n 100 rm
find .  0.22s user 0.28s system 0% cpu 2:45.12 total
xargs -n 100 rm  1.25s user 58.51s system 36% cpu 2:45.61 total

Deleting the files resulted in a constant stream of writes to the
hard drives. I guess file deletion may end up being a synchronous
write committed to the ZIL right away. If that's indeed the case, a
small slog on an SSD could probably speed up file deletion a bit.

--Artem

> I am now deleting the session files, and I hope it will not take
> days to complete...
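P.S. A minimal sketch of the slog idea, assuming a spare SSD
partition (the pool name "tank" and the device "gpt/slog0" are
hypothetical):

  # add a dedicated ZIL (slog) device to the pool
  zpool add tank log gpt/slog0

A small device is enough here, since the ZIL only ever holds the last
few seconds' worth of uncommitted synchronous writes.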