From owner-freebsd-hackers Wed Sep 2 19:48:27 1998
From: Terry Lambert
Message-Id: <199809030247.TAA08033@usr07.primenet.com>
Subject: Re: Reading/writing /usr/ports is VERY slow
To: mike@smith.net.au (Mike Smith)
Date: Thu, 3 Sep 1998 02:47:02 +0000 (GMT)
Cc: tlambert@primenet.com, cmascott@world.std.com, hackers@FreeBSD.ORG
In-Reply-To: <199809021848.SAA01527@dingo.cdrom.com> from "Mike Smith" at Sep 2, 98 06:48:21 pm

> Could you explain how the proposed change would do this?  It's no worse
> than the current case if there is one cg with substantially fewer
> directories than any other.  In fact, you could simplify the code
> somewhat just by lying about the number of directories in the cg
> referenced by the rotor.

[ ... ]

> > The original "free reserve" values were set to 10% for a reason; it
> > was a compromise between people who wanted to use every byte, and
> > the actual 15% value required for a "perfect hash".
>
> The change I proposed honours the reserve.  Now I *know* you didn't
> look at it.  8)

You aren't proposing to damage the free reserve; you are proposing
to damage the distribution of the block allocation hash.  A disk
becomes "fragmented" when you have hash collisions in the block
allocation hash.

> > In effect, when the FS picks a block (or, more correctly, a cluster),
> > it is hashing the filespace onto the disk.
>
> Unfortunately, in the case of directories this results in them being
> scattered all over the disk.  If you're creating a pile of them all at
> once in a hierarchy, the net result is very poor locality of reference.

Yes.  Agreed.  But the locality is *intentionally* horizontal, by
design.

> Associating locality and time of creation is not a wonderful algorithm,
> I freely admit.  However, I think that it has some merit over the
> current approach (forcibly minimise locality under some circumstances).

I think changing the ports to populate directory hierarchies
breadth-first would result in better performance for this
particular use.

The problem with perturbing the hash is that you will pessimize
the case where you are creating files in a single directory over
time.  In the general case, you tend to use all files in a given
directory at a time, regardless of their creation date.

Maybe we should note that the way directories do block I/O is not
the same as the way files do block I/O.  According to the comments,
this behaviour is an intentional "play it safe" move on the part
of the author.

> If you have the stuff set up to test, I'd really be interested to see
> if the clustering I proposed demonstrated any effects.

I don't have the equipment at this point to be able to do a reasonable
job of testing.  Certainly nothing I'd hang the outcome of an "is this
a good idea?" question on... 8-(.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
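
For readers who want to experiment with the breadth-first population suggested in the message, the following is a minimal sketch of what "breadth-first" means for directory creation: every directory at one depth is created before any directory at the next depth. The function name `create_breadth_first`, the nested-dict layout format, and the example category and port names are all hypothetical illustrations, not part of the ports infrastructure. Whether this ordering actually improves on-disk locality depends on the filesystem's allocator and would need to be measured.

```python
import os
import tempfile
from collections import deque


def create_breadth_first(root, tree):
    """Create the directories described by `tree` under `root`, level by level.

    `tree` maps a directory name to a nested dict describing its
    subdirectories (an empty dict means "no subdirectories").
    Returns the created paths in the order they were created.
    """
    created = []
    # Each queue entry is (existing parent path, dict of children to create).
    queue = deque([(root, tree)])
    while queue:
        parent, children = queue.popleft()
        for name in sorted(children):
            path = os.path.join(parent, name)
            os.mkdir(path)
            created.append(path)
            # Defer the grandchildren until the rest of this level is done.
            queue.append((path, children[name]))
    return created


if __name__ == "__main__":
    # Hypothetical ports-like layout, used only for illustration.
    layout = {
        "archivers": {"unzip": {"files": {}}, "zip": {"files": {}}},
        "editors": {"vim": {"files": {}}},
    }
    with tempfile.TemporaryDirectory() as top:
        for p in create_breadth_first(top, layout):
            print(os.path.relpath(p, top))
```

A depth-first extraction would instead finish one port's whole subtree before starting the next, interleaving directories from different levels in creation order; the queue is what separates the levels here.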