Date:      Tue, 29 May 2012 00:59:46 -0700
From:      Bakul Shah <bakul@bitblocks.com>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        freebsd-fs@FreeBSD.org
Subject:   Re: Millions of small files: best filesystem / best options 
Message-ID:  <20120529075946.C4ADFB82A@mail.bitblocks.com>
In-Reply-To: Your message of "Tue, 29 May 2012 17:35:18 +1000." <20120529161802.N975@besplex.bde.org> 
References:  <1490568508.7110.1338224468089.JavaMail.root@zimbra.interconnessioni.it> <4FC457F7.9000800@FreeBSD.org> <20120529161802.N975@besplex.bde.org>

On Tue, 29 May 2012 17:35:18 +1000 Bruce Evans <brde@optusnet.com.au>  wrote:
> 
> But I expect using a file system would be so slow for lots of really
> small files that I wouldn't try it.  Caching is already poor for
> 4K-files, and a factor of 20 loss won't improve it.  If you don't want
> to use a database, maybe you can use tar.[gz] files.  These at least
> reduce the wastage (but still waste about twice as much as msdosfs with
> 512 byte blocks), unless they are compressed.  I think there are ways
> to treat tar files as file systems and to avoid reading the whole file
> to find files in it (zip format is better for this).

As someone else pointed out, the right thing for Alessio may
be to just use fusefs-sqlfs, or maybe even roll his own!
Metadata can be generated on the fly. If performance is an
issue he can slurp in the whole database file and use
write-through for any updates. A million 200-byte files would
take less than 512MB.
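
To make the "roll his own" idea concrete, here is a rough
sketch in Python of a store that slurps everything into memory
and writes updates through to an SQLite file. The class name,
table layout and method names are made up for illustration;
this is not how fusefs-sqlfs lays out its data, and a FUSE or
9P front end would still have to be put on top of it.

```python
import sqlite3


class SmallFileStore:
    """Hypothetical store: all small files live in one SQLite database."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS files ("
            "name TEXT PRIMARY KEY, data BLOB NOT NULL)"
        )
        self.db.commit()
        # Slurp the whole table into memory once, at startup.
        self.cache = {
            name: bytes(data)
            for name, data in self.db.execute("SELECT name, data FROM files")
        }

    def read(self, name):
        # Reads are served from memory only.
        return self.cache[name]

    def write(self, name, data):
        # Write-through: commit to the database first, then update the cache.
        with self.db:
            self.db.execute(
                "INSERT OR REPLACE INTO files (name, data) VALUES (?, ?)",
                (name, data),
            )
        self.cache[name] = data

    def stat(self, name):
        # Metadata is generated on the fly instead of being stored.
        return {"size": len(self.cache[name])}


if __name__ == "__main__":
    store = SmallFileStore()
    store.write("msg/0001", b"x" * 200)
    print(store.stat("msg/0001"))
```

The raw payload of a million 200-byte files is 200MB, so the
in-memory copy plus per-entry overhead should plausibly stay
within the 512MB figure above, though the real number depends
on name lengths and the Python runtime.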

Another alternative: 9pfuse (from plan9port). There is even
an sqfs written in 339 lines of Python on GitHub that'd bolt
right on 9pfuse! He can use it as a template to build exactly
what he wants. There is also tarfs etc. in plan9port, but those
provide only read-only support.


