Date: Sat, 9 Sep 1995 13:45:01 -0700 (MST)
From: Terry Lambert <terry@lambert.org>
To: lashley@netcom.com
Cc: freebsd-questions@freebsd.org
Subject: Re: Directory sizes
Message-ID: <199509092045.NAA13496@phaeton.artisoft.com>
In-Reply-To: <9509091811.AA04049@asimov.volant.org> from "patl@asimov.volant.org" at Sep 9, 95 11:11:15 am
> If I have a large number of files, and I want efficient access,
> are there any good rules-of-thumb for adjusting depth-vs-breadth
> in the filesystem?  In other words, (approximately) how many entries
> can a directory hold before it is more efficient to go deeper and
> add another directory lookup?  Assume that files being created or
> accessed will be randomly placed in the hierarchy.

There are no "good rules of thumb" other than "don't put everything in
one directory".

The tradeoff is between path component traversal and directory block
traversal.  The first requires a traversal of the base directory for
the component, meaning faulting an average of 1/2 the directory blocks
in a directory, then faulting an inode for the next directory.  The
second is faulting half the blocks in the base directory, only the
base directory is now larger.

If the directory would be larger by exp2(log2(n+1)-1)+1 entries, then
it would be less expensive to have a subdirectory.  Here 'n' is the
number of entries in a single directory block, divided by 4: a
directory block is 512 bytes (for now) and an inode is 128 bytes (for
now).  For the purposes of lookup, faulting a directory block and
faulting an inode are equivalent: each is as likely to be in cache,
based on locality, if the create was done all at once.

If your average file name length is such that (16 + n) * 4 = 512,
i.e. 112 characters, it's exactly equal.

If you don't have a linear distribution of inodes in the path
hierarchy, well, it's the same odds as you picking a number between
1 and 10, your friend picking a number between 1 and 10, and if your
number is less than your friend's number, then make a subdirectory.
Assuming you are looking files up relative to the current directory,
and not by absolute path.  8-).

This is a silly thing to try to optimize.  The expense will be in
faulting blocks to/from the file, not the directory.  No change in
layout will modify your read/write path for files, which is the
important speed limit.

					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
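
A rough sketch of the break-even arithmetic described above, assuming
the 512-byte directory block and 128-byte inode the post quotes, plus
a UFS-style entry layout of an 8-byte header and a name padded to a
4-byte boundary; the entry layout and the 14-character average name
are assumptions for illustration, not numbers from the post:

	/*
	 * Illustrative only: compute the break-even point for adding
	 * a subdirectory, per the exp2(log2(n+1)-1)+1 rule above.
	 */
	#include <math.h>
	#include <stdio.h>

	int
	main(void)
	{
		int	blocksize = 512;	/* directory block size (for now) */
		int	inodesize = 128;	/* on-disk inode size (for now) */
		int	namelen = 14;		/* assumed average name length */
		int	entsize;		/* bytes per directory entry */
		int	perblock;		/* entries per directory block */
		double	n, breakeven;

		/* 8-byte entry header plus name, padded to 4 bytes */
		entsize = 8 + ((namelen + 3) & ~3);
		perblock = blocksize / entsize;

		/* 'n': entries per block divided by block/inode ratio (4) */
		n = (double)perblock / (blocksize / inodesize);

		/* grow by more than this many entries and a subdirectory wins */
		breakeven = exp2(log2(n + 1.0) - 1.0) + 1.0;

		printf("entries/block = %d, n = %.2f, break-even = %.2f entries\n",
		    perblock, n, breakeven);
		return (0);
	}

Since exp2(log2(x) - 1) is simply x/2, the break-even works out to
(n + 1)/2 + 1 entries for whatever 'n' your name length gives you.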