Date:      Sat, 9 Sep 1995 13:45:01 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        lashley@netcom.com
Cc:        freebsd-questions@freebsd.org
Subject:   Re: Directory sizes
Message-ID:  <199509092045.NAA13496@phaeton.artisoft.com>
In-Reply-To: <9509091811.AA04049@asimov.volant.org> from "patl@asimov.volant.org" at Sep 9, 95 11:11:15 am

> If I have a large number of files, and I want efficient access,
> are there any good rules-of-thumb for adjusting depth-vs-breadth
> in the filesystem?  In other words, (approximately) how many entries
> can a directory hold before it is more efficient to go deeper and
> add another directory lookup?  Assume that files being created or
> accessed will be randomly placed in the hierarchy.

There are no "good rules of thumb" other than "don't put everything in
one directory".

The tradeoff is between path component traversal and directory block
traversal.  The first requires a traversal of the base directory for
the component, meaning faulting an average of 1/2 the directory blocks
in a directory, then faulting an inode for the next directory.

The second is faulting half the blocks in the base directory, only
the base directory is now larger.
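
The two costs described above can be put side by side in a small model.
This is only a sketch under stated assumptions: directories are scanned
linearly, a lookup touches half the blocks on average, and an inode fault
costs the same as a directory block fault. The function names and the
entries-per-block figure are hypothetical and are not from the message.

```python
import math

def flat_cost(entries, entries_per_block):
    """Expected block faults to find a name in one directory,
    assuming a linear scan that touches half the blocks on average."""
    blocks = math.ceil(entries / entries_per_block)
    return blocks / 2

def split_cost(top_entries, sub_entries, entries_per_block):
    """Expected faults for a two-level lookup: scan the base directory,
    fault the subdirectory's inode, then scan the subdirectory."""
    inode_fault = 1  # assumed equal in cost to one directory block fault
    return (flat_cost(top_entries, entries_per_block)
            + inode_fault
            + flat_cost(sub_entries, entries_per_block))

# Hypothetical example: 10000 files, 32 entries per directory block.
one_big_directory = flat_cost(10000, 32)
two_levels = split_cost(100, 100, 32)  # 100 subdirectories of 100 files
```

Under these assumptions the single large directory costs far more faults
per lookup than the two-level layout, which fits the "don't put everything
in one directory" advice; the model says nothing about the cost of reading
the file data itself.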

If the directory would be larger by exp2(log2(n+1)-1)+1 entries, then
it would be less expensive to have a subdirectory.  'n' is the number
of entries in a single directory block, divided by 4: a directory block
is 512 bytes (for now) and an inode is 128 bytes (for now).
For the purposes of lookup, faulting a directory block and faulting an
inode are equivalent: each is as likely to be in cache based on locality
if the create was done all at once.  If your average file name length n
satisfies (16 + n) * 4 = 512, that is, n = 112 characters, the two costs
are exactly equal.
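
The arithmetic in the paragraph above can be checked directly. This is a
minimal sketch that evaluates the expressions exactly as written in the
message; the function and variable names are hypothetical, and the
constants 16 and 4 are taken from the message without further
interpretation.

```python
import math

def subdirectory_threshold(n):
    """Evaluate exp2(log2(n+1)-1)+1 as given in the message.
    Algebraically this is (n + 1) / 2 + 1."""
    return 2 ** (math.log2(n + 1) - 1) + 1

# Solve (16 + name_len) * 4 = 512 for the break-even name length.
DIR_BLOCK_BYTES = 512   # directory block size stated in the message
name_len = DIR_BLOCK_BYTES // 4 - 16

# Example with a hypothetical n of 7.
example = subdirectory_threshold(7)
```

Because the expression reduces to (n + 1) / 2 + 1, the threshold grows
linearly with n, and the break-even name length works out to the 112
characters quoted above.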

If you don't have a linear distribution of inodes in the path hierarchy,
well, it's the same odds as you picking a number between 1 and 10,
your friend picking a number between 1 and 10, and if your number is
less than your friend's number, then make a subdirectory.  That assumes
you are looking files up relative to the current directory, and not
by absolute path.  8-).

This is a silly thing to try to optimize.  The expense will be in faulting
blocks to/from the file, not the directory.  No change in layout will
modify your read/write path for files, which is the important speed limit.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


