Date: Thu, 3 Sep 1998 02:47:02 +0000 (GMT)
From: Terry Lambert <tlambert@primenet.com>
To: mike@smith.net.au (Mike Smith)
Cc: tlambert@primenet.com, cmascott@world.std.com, hackers@FreeBSD.ORG
Subject: Re: Reading/writing /usr/ports is VERY slow
Message-ID: <199809030247.TAA08033@usr07.primenet.com>
In-Reply-To: <199809021848.SAA01527@dingo.cdrom.com> from "Mike Smith" at Sep 2, 98 06:48:21 pm

> Could you explain how the proposed change would do this?  It's no worse
> than the current case if there is one cg with substantially fewer
> directories than any other.  In fact, you could simplify the code
> somewhat just by lying about the number of directories in the cg
> referenced by the rotor.

[ ... ]

> > The original "free reserve" values were set to 10% for a reason; it
> > was a compromise between people who wanted to use every byte, and
> > the actual 15% value required for a "perfect hash".
>
> The change I proposed honours the reserve.  Now I *know* you didn't
> look at it. 8)

You aren't proposing to damage the free reserve, you are proposing
to damage the distribution of the block allocation hash.

A disk becomes "fragmented" when you have hash collisions in the
block allocation hash.

> > In effect, when the FS picks a block (or, more correctly, a cluster),
> > it is hashing the filespace onto the disk.
>
> Unfortunately, in the case of directories this results in them being
> scattered all over the disk.  If you're creating a pile of them all at
> once in a hierarchy, the net result is very poor locality of reference.

Yes.  Agreed.  But the locality is *intentionally* horizontal, by
design.

> Associating locality and time of creation is not a wonderful algorithm,
> I freely admit.  However, I think that it has some merit over the
> current approach (forcibly minimise locality under some circumstances).

I think changing the ports to populate directory hierarchies
breadth-first would result in better performance for this particular
use.

The problem with perturbing the hash is that you will pessimize the
case where you are creating files in a single directory over time.
In the general case, you tend to use all files in a given directory
at a time, regardless of their creation date.

Maybe we should note that the way directories do block I/O is not
the same as the way files do block I/O.
According to the comments, this behaviour is an intentional "play it
safe" move on the part of the author.

> If you have the stuff set up to test, I'd really be interested to see
> if the clustering I proposed demonstrated any effects.

I don't have the equipment at this point to be able to do a reasonable
job of testing.  Certainly nothing I'd hang the outcome of an "is this
a good idea?" question on... 8-(.

Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
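
For reference, the breadth-first population suggested in the message above could look roughly like the following. This is a minimal sketch, not anything taken from the ports infrastructure: the function names `breadth_first_order` and `populate` and the example paths are hypothetical, and the code only illustrates the creation order (all directories at one depth before any at the next). It makes no claim about the measured effect on any filesystem.

```python
import os
import tempfile
from collections import deque


def breadth_first_order(paths):
    """Return every directory implied by `paths`, shallowest level first."""
    # Build a parent -> children map, including intermediate directories.
    children = {}
    for path in paths:
        parent = ""
        for part in path.split("/"):
            if not part:
                continue  # ignore empty components such as a leading "/"
            node = f"{parent}/{part}" if parent else part
            kids = children.setdefault(parent, [])
            if node not in kids:
                kids.append(node)
            parent = node

    # Plain queue-based breadth-first walk from the implicit root "".
    order = []
    queue = deque(children.get("", []))
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(children.get(node, []))
    return order


def populate(root, paths):
    """Create the directories under `root`, one level at a time."""
    order = breadth_first_order(paths)
    for rel in order:
        # Parents always precede children in `order`, so mkdir suffices.
        os.mkdir(os.path.join(root, rel))
    return order


if __name__ == "__main__":
    # Hypothetical ports-style layout, used only for illustration.
    example = ["archivers/zip/files", "editors/vim", "archivers/unzip"]
    with tempfile.TemporaryDirectory() as root:
        for rel in populate(root, example):
            print(rel)
```

The contrast is with a depth-first unpack (what a typical archive extraction does), which would create `archivers`, `archivers/zip`, and `archivers/zip/files` before ever touching `editors`.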