Date: Sun, 27 May 2001 22:50:48 -0300 (BRST)
From: Rik van Riel
To: Peter Wemm
Cc: Andrew Reilly, hackers@FreeBSD.ORG
Subject: Re: technical comparison

On Sat, 26 May 2001, Peter Wemm wrote:

> Which is more expensive? Maintaining an on-disk hashed (or b+tree)
> directory format for *everything* or maintaining a simple low-cost
> format on disk with in-memory hashing for fast lookups?

I bet that for modest directory sizes the cost of disk IO outweighs
the added CPU usage by so much that you may as well take the trouble
of using the more scalable directory format.

> For the small directory case I suspect the FFS+namecache way is more
> cost effective. For the medium to large directory case (10,000 to
> 100,000 entries), I suspect the FFS+namecache method isn't too shabby,
> providing you are not starved for memory. For the insanely large
> cases - I dont want to think about :-).

The ext2 fs, which uses roughly the same directory structure as UFS
and has a name cache that isn't limited in size, seems to bog down
at about 10,000 directory entries.

Daniel Phillips is working on a hash extension to ext2: not a
replacement of the directory format, but a way to tack a hashed
index after the normal directory index.

This way the filesystem stays backward compatible. Older kernels
will just use the old directory format and will clear a flag when
they write to the directory; the new kernel can later use that
cleared flag to rebuild the hashed directory index.

It also has the advantage of being able to keep using the tried and
tested fsck utilities.

Maybe this could be an idea to enhance UFS scalability for huge
directories without endangering reliability?

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://distro.conectiva.com/

Send all your spam to aardvark@nl.linux.org (spam digging piggy)
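
To make the flag-and-rebuild idea described above concrete, here is a
toy in-memory model. It is only a sketch under stated assumptions, not
the actual ext2 extension: the class `Directory`, its fields and its
method names are all hypothetical, and a Python dict stands in for
whatever on-disk hashed index the real design uses.

```python
class Directory:
    """Toy model: a linear entry list plus an optional hashed index.

    All names here are hypothetical; this is not the ext2 on-disk layout.
    """

    def __init__(self):
        self.entries = []        # linear format: list of (name, inode)
        self.index = {}          # stand-in for the hashed index: name -> position
        self.index_valid = True  # flag that an index-unaware writer clears

    def old_kernel_add(self, name, inode):
        # An older kernel knows only the linear format. It appends the
        # entry and clears the flag, which leaves the index stale.
        self.entries.append((name, inode))
        self.index_valid = False

    def rebuild_index(self):
        # Rebuild the index from the linear entries, then set the flag again.
        self.index = {name: pos for pos, (name, _) in enumerate(self.entries)}
        self.index_valid = True

    def new_kernel_add(self, name, inode):
        # A newer kernel keeps the linear entries and the index in step.
        if not self.index_valid:
            self.rebuild_index()
        self.entries.append((name, inode))
        self.index[name] = len(self.entries) - 1

    def linear_lookup(self, name):
        # What an older kernel does: scan every entry, O(n) per lookup.
        for entry_name, inode in self.entries:
            if entry_name == name:
                return inode
        return None

    def new_kernel_lookup(self, name):
        # Use the index, rebuilding it first if an old writer cleared the flag.
        if not self.index_valid:
            self.rebuild_index()
        pos = self.index.get(name)
        return None if pos is None else self.entries[pos][1]


d = Directory()
d.new_kernel_add("a", 1)
d.old_kernel_add("b", 2)            # index is now stale, flag cleared
found = d.new_kernel_lookup("b")    # triggers a rebuild, then a hashed lookup
```

The linear entries stay the single source of truth in this sketch, so
a tool that understands only the old format still sees a complete
directory, and the index can always be regenerated from them.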