Date: Mon, 31 Mar 1997 11:32:24 -0700 (MST)
From: Terry Lambert <terry@lambert.org>
To: scrappy@hub.org (The Hermit Hacker)
Cc: hackers@freebsd.org
Subject: Re: ftruncate("directory")...
Message-ID: <199703311832.LAA09800@phaeton.artisoft.com>
In-Reply-To: <Pine.NEB.3.96.970331000021.199S-100000@thelab.hub.org> from "The Hermit Hacker" at Mar 31, 97 00:03:27 am

> Has anyone written (is it possible?) a utility that can
> 'truncate' a directory?
>
> Essentially, what I'm looking at is something that would
> open a directory, re-org the entries in it so that only the 'good'
> entries are at the head of the file, and then truncate it...
>
> Not sure if this is actually possible, but am going to play
> with it here...but if its already been done, all the better... :)

Directories are operated on in terms of "directory blocks".

New entries are allocated as early in the directory as it is possible
to allocate them. If there are two entries with a gap between them,
and there would be enough space in the block for the new entry if the
gap weren't there, then entries are moved. This is called
"compaction"; it occurs only on entry creation, and only inside a
block, never spanning a block boundary.

When a block is fully empty (the last entry in the block has just been
deleted), then if it is not the last block, it's left there. If,
however, it was the last block, then all contiguous empty blocks prior
to the end of the directory are truncated back.

It should be possible to modify the algorithm so that intermediate
blocks which are fully empty are removed from the block list, either
resulting in a sparse directory file, or (by reorganizing the block
order) actually a smaller file by removing the placeholders.

This is not recommended, since it would invalidate NFS cookies in such
a way as to cause SunOS NFSv3 clients to fail unrecoverably in some
situations where the clients expect traditional UFS behaviour and fail
to deal gracefully with invalidated cookies (as they are supposed to,
but do not). This is a bug in Sun's NFSv3 client code, and it is not
likely that you will be able to get them to fix it.

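To make the compaction and truncation rules described above concrete,
here is a minimal sketch in C. It is not the UFS code: struct ent,
BLK_SIZE, blk_compact() and dir_blocks_after_delete() are names
invented for illustration, and a block is modelled as a plain array of
byte ranges rather than real on-disk directory entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BLK_SIZE 512	/* assumed directory block size for this sketch */

/* Hypothetical, simplified entry: a byte range inside one block. */
struct ent {
	size_t off;	/* offset of the entry within its block */
	size_t len;	/* bytes the entry occupies */
};

/*
 * Intra-block compaction: slide the live entries (sorted by offset)
 * toward the start of the block so the gaps between them merge into
 * one free region at the end. Returns the size of that region.
 * Entries are never moved into another block.
 */
size_t
blk_compact(struct ent *e, size_t n)
{
	size_t next = 0;

	for (size_t i = 0; i < n; i++) {
		assert(e[i].off >= next);	/* sorted, non-overlapping */
		e[i].off = next;
		next += e[i].len;
	}
	assert(next <= BLK_SIZE);
	return BLK_SIZE - next;
}

/*
 * Truncation on delete: block_empty[i] is true when block i holds no
 * live entries. Only the run of empty blocks at the very end of the
 * directory is dropped; empty blocks in the middle stay in place.
 * Returns the number of blocks the directory keeps. (In a real
 * directory the first block holds "." and ".." and is never empty.)
 */
size_t
dir_blocks_after_delete(const bool *block_empty, size_t nblocks)
{
	assert(block_empty != NULL || nblocks == 0);

	while (nblocks > 0 && block_empty[nblocks - 1])
		nblocks--;
	return nblocks;
}
```

For example, three entries of 12, 16 and 20 bytes at offsets 0, 40 and
100 end up at offsets 0, 12 and 28, leaving 464 free bytes at the end
of the block. A five-block directory whose blocks 1, 3 and 4 are empty
is cut back to three blocks; block 1 stays as a placeholder.
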
If you choose the sparse directory route, the ufs_dir.c operations
must be made to recognize a zero "next" pointer for the first entry in
a block as an indicator that the block contents are invalid and the
next block should be consulted instead.

It should be noted at this point that truncation on deletion is one of
the reasons FreeBSD performs so poorly on the lmbench tests: the
truncation affects FS metadata, which is synchronously written unless
you mount -async.

There is considerable controversy as to whether or not lmbench tests
on ext2fs on Linux (or any other -async mounted FS which fails to make
POSIX guarantees, as -async mounted FS's are wont to do) can really be
considered a "figure of merit". The general consensus among people
with FS engineering backgrounds is that they cannot.

Inter-block compaction (as opposed to intra-block compaction) must
include multiple non-atomic yet idempotent operations, since it will,
by definition, span multiple blocks. It is highly unlikely that you
will achieve any significant block recovery by doing this, and the
fact that you must implement the idempotence in much the same way that
"rename" implements it means that this would be a very high-overhead
operation: not something you would want to do during standard
directory operations.

For the amount of fragmentation necessary to achieve utility from this
recovery, you would need to have set up a scenario terribly different
from standard utilization patterns. Perhaps the lmbench directory
create/delete test?

Finally, it is unlikely that you will be able to successfully employ a
"least fit" algorithm in allocating these entries without an
O(N*(N-1)) traversal of the directory entries, noting that you must
make specific exception for "." and "..", which *must* be the first
entries in any directory.

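Going back to the sparse directory route mentioned earlier, the change
to the block scan might look roughly like the following sketch. This
is an illustration only, not code from ufs_dir.c: struct dirblk, its
first_next field and next_valid_block() are hypothetical names, and
the "next" pointer of the first entry is modelled as a single integer
per block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical in-memory view of one directory block. Only the
 * "next" value of the block's first entry matters for this sketch.
 */
struct dirblk {
	uint16_t first_next;	/* 0 => block contents are invalid */
};

/*
 * Return the index of the first block at or after 'start' whose
 * contents are valid, or 'nblocks' if no such block exists. A block
 * whose first entry has a zero "next" value is skipped and the
 * following block is consulted instead.
 */
size_t
next_valid_block(const struct dirblk *blk, size_t nblocks, size_t start)
{
	assert(blk != NULL || nblocks == 0);

	while (start < nblocks && blk[start].first_next == 0)
		start++;
	return start;
}
```

Every directory operation that walks blocks would need this check, so
lookups, creates and readdir all pay for it. That cost is one of the
things to include in any baseline measurement.
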
In any case, if you do pursue this, you should establish performance
baselines for the operations which will be affected by the additional
overhead, and one or more common scenarios where your changes will
actually be beneficial instead of detrimental.

Regards,
Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.