Date:      Fri, 25 May 2001 21:18:05 +0400 (MSD)
From:      .@babolo.ru
To:        gjb@gbch.net (Greg Black)
Cc:        jandrese@mitre.org, float@firedrake.org, hackers@FreeBSD.ORG
Subject:   Re: technical comparison
Message-ID:  <200105251718.VAA06296@aaz.links.ru>
In-Reply-To: <nospam-990735453.93235@maxim.gbch.net> from "Greg Black" at "May 25, 1 06:17:33 am"

Greg Black writes:
> "Andresen,Jason R." wrote:
> 
> | On Thu, 24 May 2001, void wrote:
> | 
> | > On Wed, May 23, 2001 at 09:20:51AM -0400, Andresen,Jason R. wrote:
> | > >
> | > > Why is knowing the file names cheating?  It is almost certain
> | > > that the application will know the names of its own files
> | > > (and won't be grepping the entire directory every time it
> | > > needs to find a file).
> | >
> | > With 60,000 files, that would have the application duplicating
> | > 60,000 pieces of information that are stored by the operating system.
> | > Operations like open() and unlink() still have to search the directory
> | > to get the inode, so there isn't much incentive for an application to
> | > do that, I think.
> | 
> | This still doesn't make sense to me.  It's not like the program is going
> | to want to do a "find" on the directory every time it has some data it
> | wants to put somewhere.  I think for the majority of the cases (I'm sure
> | there are exceptions) an application program that wants to interact with
> | files will know what filename it wants ahead of time.  This doesn't
> | necessarily mean storing 60,000 filenames either, it could be something
> | like:
> | I have files fooX where X is a number from 00000 to 60000 in that
> | directory.  I need to find a piece of information, so I run that
> | information through a hash of some sort and determine that the file I want
> | is number 23429, so I open that file.
> 
> And if this imaginary program is going to do that, it's equally
> easy to use a multilevel directory structure and that will make
> the life of all users of the system simpler.  There's no real
> excuse for directories with millions (or even thousands) of
> files.
There is.
You assume that names are random.
Assume that they are not.
VERY old example:
a
aa
...
aaaaaaa...aaa 255 times
aaaaaaa...aab
and so on.
Yes, I know: hash.
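
To illustrate the "hash" workaround, here is a minimal sketch (not anyone's
actual implementation) of what an application would have to do: hash each
file name and use the first hex digits of the digest as subdirectory names,
so that even non-random names such as a, aa, aaa, ... are spread over many
small directories. The function name hashed_path, the root /data, the use
of SHA-1, and the layout of two levels of two hex digits are all
assumptions chosen for the example.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_path(root, name, levels=2, width=2):
    """Map a file name to root/xx/yy/name, where xx and yy are taken
    from the hex digest of a hash of the name (hypothetical layout)."""
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
    # Take `levels` consecutive chunks of `width` hex digits as directory names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(root, *parts, name)


if __name__ == "__main__":
    # The "VERY old example": names a, aa, aaa, ... up to 255 characters.
    names = ["a" * n for n in range(1, 256)]
    top_level = {hashed_path("/data", n).parts[2] for n in names}
    print("example path:", hashed_path("/data", "aa"))
    print("distinct top-level directories used:", len(top_level))
```

The same name always maps to the same path, so open() needs no search of
a huge directory, but every application that stores many files has to
carry logic like this itself.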

Is it practical to do this in every application
(sometimes it is not known before practical use
whether directories will become big) instead of
doing it once, in the file system?

Sorry for my bad English.

-- 
@BABOLO      http://links.ru/

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message



