Date: Tue, 29 May 2012 00:59:46 -0700
From: Bakul Shah <bakul@bitblocks.com>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-fs@FreeBSD.org
Subject: Re: Millions of small files: best filesystem / best options
Message-ID: <20120529075946.C4ADFB82A@mail.bitblocks.com>
In-Reply-To: Your message of "Tue, 29 May 2012 17:35:18 +1000." <20120529161802.N975@besplex.bde.org>
References: <1490568508.7110.1338224468089.JavaMail.root@zimbra.interconnessioni.it> <4FC457F7.9000800@FreeBSD.org> <20120529161802.N975@besplex.bde.org>

On Tue, 29 May 2012 17:35:18 +1000 Bruce Evans <brde@optusnet.com.au> wrote:
>
> But I expect using a file system would be so slow for lots of really
> small files that I wouldn't try it. Caching is already poor for
> 4K-files, and a factor of 20 loss won't improve it. If you don't want
> to use a database, maybe you can use tar.[gz] files. These at least
> reduce the wastage (but still waste about twice as much as msdosfs with
> 512 byte blocks), unless they are compressed. I think there are ways
> to treat tar files as file systems and to avoid reading the whole file
> to find files in it (zip format is better for this).

As someone else pointed out, the right thing for Alessio may be to
just use fusefs-sqlfs, or maybe even roll his own! Metadata can be
generated on the fly. If performance is an issue, he can slurp in the
whole file and use write-through for any updates. A million 200-byte
files would take less than 512MB.

Another alternative: 9pfuse (from plan9ports). There is even an sqfs
written in 339 lines of Python on GitHub that'd bolt right onto 9pfuse!
He can use it as a template to build exactly what he wants. There is
also tarfs etc. in plan9ports, but it provides only read-only support.
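
To make the "slurp in the whole file and use write-through" idea concrete, here is a minimal sketch in Python. It is not how fusefs-sqlfs or the 9pfuse-based server work; the class name SmallFileStore, the table layout, and the fixed 0o644 mode are all assumptions made for illustration. It keeps every file as a row in one SQLite table, loads all rows into a dict at startup, and pushes each update to the database before updating the dict. The raw payload of a million 200-byte files is 200 MB, so holding everything in memory is plausible on the scale discussed here.

```python
import sqlite3


class SmallFileStore:
    """Hypothetical store: many tiny files kept as rows in one SQLite table."""

    def __init__(self, db_path=":memory:"):
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS files ("
            "path TEXT PRIMARY KEY, data BLOB NOT NULL)"
        )
        self.db.commit()
        # Slurp the whole table into memory once, so reads never touch disk.
        self.cache = {
            path: bytes(data)
            for path, data in self.db.execute("SELECT path, data FROM files")
        }

    def read(self, path):
        # Served entirely from the in-memory copy.
        return self.cache[path]

    def write(self, path, data):
        # Write-through: commit to the database first, then update the cache.
        with self.db:
            self.db.execute(
                "INSERT OR REPLACE INTO files (path, data) VALUES (?, ?)",
                (path, data),
            )
        self.cache[path] = data

    def stat(self, path):
        # Metadata is generated on the fly rather than stored per file.
        # The mode is an assumed constant, not something read from disk.
        return {"size": len(self.cache[path]), "mode": 0o644}
```

A short usage example under the same assumptions:

```python
store = SmallFileStore()            # in-memory database for the demo
store.write("a/b/c", b"x" * 200)    # one 200-byte "file"
print(store.stat("a/b/c")["size"])
```

A real file system front end (FUSE or 9P) would map its read, write, and stat requests onto these three methods.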