Date: Tue, 29 May 2012 01:06:20 -0700 (PDT)
From: Don Lewis
To: brde@optusnet.com.au
Cc: freebsd-fs@FreeBSD.org, dougb@FreeBSD.org
Subject: Re: Millions of small files: best filesystem / best options

On 29 May, Bruce Evans wrote:
> On Mon, 28 May 2012, Doug Barton wrote:
>
>> On 5/28/2012 10:01 AM, Alessio Focardi wrote:
>>> So in my case I would have to use -b 4096 -f 512
>>>
>>> It's an improvement, but still is not ideal: still a big waste with 200 bytes files!
>>
>> Are all of the files exactly 200 bytes? If so that's likely the best you
>> can do.
>
> It is easy to do better by using a file system that supports small block
> sizes.  This might be slow, but it reduces the wastage.  Possible file
> systems:
> - it is easy to fix ffs to support a minimum block size of 512 (by
>   reducing its gratuitous limit of MINBSIZE and fixing the few things
>   that break:

That shouldn't be necessary, especially if you newfs with the "-o space"
option to force the fragments for multiple files to be allocated out of
the same block right from the start instead of waiting to do this once
the filesystem starts getting full.

I ran a Usenet server this way for quite a while with fairly good
results, though the average file size was a bit bigger, about 2K or so.
I found that if I didn't use "-o space", space optimization wouldn't
kick in soon enough and I'd tend to run out of the full blocks needed
for larger files.

The biggest performance problem that I ran into was that as the
directories shrank and grew, they would tend to get badly fragmented,
causing lookups to get slow.  This was in the days before dirhash ...
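
To put rough numbers on the wastage discussed above, here is a minimal
sketch of the arithmetic. It assumes a simplified model in which a small
file's data is rounded up to a whole number of fragments; inode,
directory, and other metadata overhead are ignored. The function names
are made up for illustration and are not part of any FreeBSD tool.

```python
def allocated_bytes(file_size: int, frag_size: int) -> int:
    """Data space consumed by one small file, rounded up to whole fragments.

    Simplified model (an assumption): metadata overhead is ignored.
    """
    if file_size < 0 or frag_size <= 0:
        raise ValueError("file_size must be >= 0 and frag_size must be > 0")
    # Integer ceiling division: number of fragments needed.
    frags = -(-file_size // frag_size)
    return frags * frag_size


def wasted_fraction(file_size: int, frag_size: int) -> float:
    """Fraction of the allocated data space that holds no file data."""
    alloc = allocated_bytes(file_size, frag_size)
    if alloc == 0:
        return 0.0
    return (alloc - file_size) / alloc


def frags_per_block(block_size: int, frag_size: int) -> int:
    """How many fragments (and so how many one-fragment files) fit in a block."""
    if block_size % frag_size != 0:
        raise ValueError("block_size must be a multiple of frag_size")
    return block_size // frag_size


if __name__ == "__main__":
    # The 200-byte files from the thread, with -b 4096 -f 512.
    size, block, frag = 200, 4096, 512
    print("allocated per file:", allocated_bytes(size, frag))
    print("wasted fraction:   ", wasted_fraction(size, frag))
    print("files per block:   ", frags_per_block(block, frag))
```

Under this model a 200-byte file on 512-byte fragments occupies 512
bytes, so roughly 61% of the data space is slack, while a file of
exactly 2048 bytes fills four fragments with no slack. With -b 4096
-f 512, up to eight one-fragment files can share a block, which is the
packing that "-o space" asks for from the start.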