From owner-freebsd-hackers  Fri May 25 10:18:54 2001
Delivered-To: freebsd-hackers@freebsd.org
Received: from aaz.links.ru (aaz.links.ru [193.125.152.37])
	by hub.freebsd.org (Postfix) with ESMTP id C375737B423
	for <hackers@FreeBSD.ORG>; Fri, 25 May 2001 10:18:45 -0700 (PDT)
	(envelope-from babolo@links.ru)
Received: (from babolo@localhost)
	by aaz.links.ru (8.9.3/8.9.3) id VAA06296;
	Fri, 25 May 2001 21:18:06 +0400 (MSD)
Message-Id: <200105251718.VAA06296@aaz.links.ru>
Subject: Re: technical comparison
In-Reply-To: <nospam-990735453.93235@maxim.gbch.net> from "Greg Black" at "May 25, 1 06:17:33 am"
To: gjb@gbch.net (Greg Black)
Date: Fri, 25 May 2001 21:18:05 +0400 (MSD)
Cc: jandrese@mitre.org, float@firedrake.org, hackers@FreeBSD.ORG
From: .@babolo.ru
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-hackers.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo?subject=subscribe%20freebsd-hackers>
List-Unsubscribe: <mailto:majordomo?subject=unsubscribe%20freebsd-hackers>
X-Loop: FreeBSD.ORG

Greg Black writes:
> "Andresen,Jason R." wrote:
> 
> | On Thu, 24 May 2001, void wrote:
> | 
> | > On Wed, May 23, 2001 at 09:20:51AM -0400, Andresen,Jason R. wrote:
> | > >
> | > > Why is knowing the file names cheating?  It is almost certain
> | > > that the application will know the names of its own files
> | > > (and won't be grepping the entire directory every time it
> | > > needs to find a file).
> | >
> | > With 60,000 files, that would have the application duplicating
> | > 60,000 pieces of information that are stored by the operating system.
> | > Operations like open() and unlink() still have to search the directory
> | > to get the inode, so there isn't much incentive for an application to
> | > do that, I think.
> | 
> | This still doesn't make sense to me.  It's not like the program is going
> | to want to do a "find" on the directory every time it has some data it
> | wants to put somewhere.  I think for the majority of the cases (I'm sure
> | there are exceptions) an application program that wants to interact with
> | files will know what filename it wants ahead of time.  This doesn't
> | necessarily mean storing 60,000 filenames either, it could be something
> | like:
> | I have files fooX where X is a number from 00000 to 60000 in that
> | directory.  I need to find a piece of information, so I run that
> | information through a hash of some sort and determine that the file I want
> | is number 23429, so I open that file.
> 
> And if this imaginary program is going to do that, it's equally
> easy to use a multilevel directory structure and that will make
> the life of all users of the system simpler.  There's no real
> excuse for directories with millions (or even thousands) of
> files.
There is an excuse.
You assume that the names are random.
Assume that they are not.
A VERY old example:
a
aa
...
aaaaaaa...aaa (255 times)
aaaaaaa...aab
and so on.
Yes, I know: hash the names.

Is it practical to do this in every application
(sometimes it is not known before practical use
whether the directories will become big) instead of
doing it once, in the file system?
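
To illustrate the point about non-random names, here is a minimal
sketch in Python. It is not from any real application: the function
names prefix_bucket and hash_bucket, the 2-character prefix width and
the 256 subdirectories are all assumptions made for the example. It
compares a multilevel layout keyed on the leading characters of the
name with one keyed on a hash of the whole name, using names like the
ones above.

```python
import hashlib
from collections import Counter


def prefix_bucket(name: str, width: int = 2) -> str:
    """Pick a subdirectory from the first `width` characters of the name.

    Hypothetical scheme: short names are padded with '_' so that every
    bucket label has the same length.
    """
    return name[:width].ljust(width, "_")


def hash_bucket(name: str) -> str:
    """Pick one of 256 subdirectories from a hash of the whole name.

    SHA-256 is used only because it is in the standard library; any
    reasonably uniform hash would serve for this illustration.
    """
    digest = hashlib.sha256(name.encode("ascii")).digest()
    return format(digest[0], "02x")


# Non-random names: a, aa, aaa, ... up to 255 characters.
names = ["a" * n for n in range(1, 256)]

# Count how many files each scheme would place in each subdirectory.
prefix_load = Counter(prefix_bucket(n) for n in names)
hash_load = Counter(hash_bucket(n) for n in names)

print("prefix scheme: subdirectories used:", len(prefix_load),
      "largest holds", max(prefix_load.values()))
print("hash scheme:   subdirectories used:", len(hash_load),
      "largest holds", max(hash_load.values()))
```

With these names the prefix scheme puts almost every file into the
single subdirectory "aa", so the multilevel layout buys nothing, while
the hash scheme spreads them out. Every application that may end up
with a big directory would have to carry something like hash_bucket
itself.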

Sorry for my bad English.

-- 
@BABOLO      http://links.ru/
