Date: Fri, 25 May 2001 15:30:40 -0700 (PDT)
From: Matt Dillon
Message-Id: <200105252230.f4PMUel44295@earth.backplane.com>
To: Greg Black
Cc: hackers@FreeBSD.ORG
Subject: Re: technical comparison
References: <200105251718.VAA06296@aaz.links.ru>
Sender: owner-freebsd-hackers@FreeBSD.ORG

One word: B+Tree. Hash tables work well if the entire hash table fits into memory and you know (approximately) what the upper limit on records is going to be. If you don't, then a B+Tree is the only proven way to go. (Sure, there are plenty of other schemes, some hybrid, some completely different, but B+Trees are long proven, so unless you want to experiment, just use one.)

In general I agree that UFS's only major pitfall is the sequential directory scanning. The reality, though, is that very few programs actually need to create thousands or millions of files in a single directory. The biggest one used to be USENET news, but that has shifted to multi-article files and isn't an issue any more. Now the biggest one is probably squid. Databases are big storage-wise, but don't usually require lots of files.

-Matt
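
To make the B+Tree point in the message above concrete, here is a minimal in-memory sketch in Python. It is not how UFS or any real file system stores directories. The names Node, BPlusTree and ORDER, the "file00042"-style names and the inode numbers are all hypothetical, chosen only for illustration. The sketch supports insert and lookup only (no deletion). It shows that a lookup walks one root-to-leaf path rather than scanning every entry, and that the tree grows by splitting nodes, so no upper limit on records has to be known in advance.

```python
import bisect
import random

ORDER = 4  # hypothetical maximum keys per node; kept tiny so splits happen often


class Node:
    def __init__(self, leaf):
        self.leaf = leaf
        self.keys = []
        self.children = []  # used by internal nodes only
        self.values = []    # used by leaf nodes only
        self.next = None    # leaf-to-leaf link for in-order scans


class BPlusTree:
    def __init__(self):
        self.root = Node(leaf=True)

    def get(self, key):
        """Follow a single root-to-leaf path; return the value or None."""
        node = self.root
        while not node.leaf:
            node = node.children[bisect.bisect_right(node.keys, key)]
        i = bisect.bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return node.values[i]
        return None

    def insert(self, key, value):
        split = self._insert(self.root, key, value)
        if split is not None:  # the root itself split: tree grows one level
            sep, right = split
            new_root = Node(leaf=False)
            new_root.keys = [sep]
            new_root.children = [self.root, right]
            self.root = new_root

    def _insert(self, node, key, value):
        """Insert below node; return (separator, new_right_node) on split."""
        if node.leaf:
            i = bisect.bisect_left(node.keys, key)
            if i < len(node.keys) and node.keys[i] == key:
                node.values[i] = value  # existing key: overwrite
                return None
            node.keys.insert(i, key)
            node.values.insert(i, value)
            if len(node.keys) <= ORDER:
                return None
            mid = len(node.keys) // 2
            right = Node(leaf=True)
            right.keys, node.keys = node.keys[mid:], node.keys[:mid]
            right.values, node.values = node.values[mid:], node.values[:mid]
            right.next, node.next = node.next, right
            return right.keys[0], right  # separator is copied up
        i = bisect.bisect_right(node.keys, key)
        split = self._insert(node.children[i], key, value)
        if split is None:
            return None
        sep, child = split
        node.keys.insert(i, sep)
        node.children.insert(i + 1, child)
        if len(node.keys) <= ORDER:
            return None
        mid = len(node.keys) // 2
        right = Node(leaf=False)
        up = node.keys[mid]  # separator is moved up, not copied
        right.keys = node.keys[mid + 1:]
        right.children = node.children[mid + 1:]
        node.keys = node.keys[:mid]
        node.children = node.children[:mid + 1]
        return up, right

    def height(self):
        h, node = 1, self.root
        while not node.leaf:
            node, h = node.children[0], h + 1
        return h

    def items(self):
        """Yield (key, value) pairs in key order via the leaf chain."""
        node = self.root
        while not node.leaf:
            node = node.children[0]
        while node is not None:
            yield from zip(node.keys, node.values)
            node = node.next


# Hypothetical directory: 1000 file names mapped to made-up inode numbers,
# inserted in a shuffled (but reproducible) order.
names = [f"file{n:05d}" for n in range(1000)]
random.Random(0).shuffle(names)
tree = BPlusTree()
for name in names:
    tree.insert(name, int(name[4:]) + 100)
```

After this runs, tree.get("file00042") touches only tree.height() nodes, whereas a sequential directory scan would compare against up to 1000 entries. A real on-disk design would size ORDER so that one node fills a disk block, which keeps the tree only a few levels deep even for millions of entries.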