From owner-freebsd-hackers Sun Oct 29 8:27:25 2000 Delivered-To: freebsd-hackers@freebsd.org Received: from earth.backplane.com (placeholder-dcat-1076843399.broadbandoffice.net [64.47.83.135]) by hub.freebsd.org (Postfix) with ESMTP id DD1C637B479 for ; Sun, 29 Oct 2000 08:27:22 -0800 (PST) Received: (from dillon@localhost) by earth.backplane.com (8.11.1/8.9.3) id e9TGRMP70398; Sun, 29 Oct 2000 08:27:22 -0800 (PST) (envelope-from dillon) Date: Sun, 29 Oct 2000 08:27:22 -0800 (PST) From: Matt Dillon Message-Id: <200010291627.e9TGRMP70398@earth.backplane.com> To: Ryan Thompson , freebsd-hackers@FreeBSD.ORG Subject: Re: Filesystem holes References: Sender: owner-freebsd-hackers@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.ORG :> Hi all... :> :> One the tasks that I have undertaken lately is to improve the efficiency :> of a couple of storage facilities we use internally, here. Basically, :> they are moderate-size tables currently implemented in SQL, which is OK in :> terms of performance, but the hash function is breaking down and the :> storage is rather inefficient for our table of about 2,850,000 members :> (~2.1 GB total storage). There are 64M possible hash values in our :> current implementation, and our record size is variable, but could be :> safely fixed at about 1.5KB... So, total storage if all values were used :> would be about 96GB. (See where I'm going with this?) :> :> One of the options I am considering is actually using address calculation, :> and taking advantage of filesystem holes, to keep storage down to what is :> actually being used, while providing instant lookups. :> :> The single file would be about 96G addressable bytes... But the actual :> block count would be much lower. I suppose I will have to create a series :> of these files and divide the problem into < 4GB chunks, but one :> lookup/store will still be done with one calculation and one disk seek :> (just more filehandles open). :> :> Deletes seem problematic. My question is, does the operating system :> provide any way to free blocks used in the middle of a file? :> :> Must I periodically re-create these files (long and slow process, but not :> so bad if done infrequently) to reclaim space, or is there a way to free :> arbitrary blocks in a file in userland (WITHOUT trashing the filesystem? :> :-) :> :> - Ryan :> :> -- :> Ryan Thompson :> Network Administrator, Accounts :> Phone: +1 (306) 664-1161 :> :> SaskNow Technologies http://www.sasknow.com :> #106-380 3120 8th St E Saskatoon, SK S7H 0W2 I would strongly recommend using a B+Tree instead of a hash table. With a hash table slightly different lookups will seek all over the place. With a B+Tree lookups will stay more localized. For example, if you insert a (nearly) sorted dictionary of words into an SQL table with a B+Tree, the memory working set required to insert efficiently stays constant whether you are inserting a thousand, a million, or a billion records. That is, the memory requirement is effecitvely O(LogN) for a disk requirement of O(N). With a hash table, the memory working set required to insert efficiently is approximately O(N) for a disk requirement of O(N)... much much worse. A B+Tree will also scale with the size of the dataset being managed, so you do not have to preallocate or prereserve file space. We are using an in-house SQL database for our product (which I can't really talk about right now) and, using B+Tree's, I can consistently insert 300 records/sec on a cheap desktop PC (which I use for testing) with 64MB of ram (remember, insert requires an internal select to check for key conflicts), even when the test database grows to several gigabytes. In anycase, a B+Tree is the way to go. Hash tables are useful only in very, very, very specialized circumstances. In all other circumstances they are no better then a pain in the rear. -Matt To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hackers" in the body of the message