Date:      Thu, 3 Sep 1998 02:47:02 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        mike@smith.net.au (Mike Smith)
Cc:        tlambert@primenet.com, cmascott@world.std.com, hackers@FreeBSD.ORG
Subject:   Re: Reading/writing /usr/ports is VERY slow
Message-ID:  <199809030247.TAA08033@usr07.primenet.com>
In-Reply-To: <199809021848.SAA01527@dingo.cdrom.com> from "Mike Smith" at Sep 2, 98 06:48:21 pm

> Could you explain how the proposed change would do this?  It's no worse
> than the current case if there is one cg with substantially fewer
> directories than any other.  In fact, you could simplify the code
> somewhat just by lying about the number of directories in the cg
> referenced by the rotor.

[ ... ]

> > The original "free reserve" values were set to 10% for a reason; it
> > was a compromise between people who wanted to use every byte, and
> > the actual 15% value required for a "perfect hash".
> 
> The change I proposed honours the reserve.  Now I *know* you didn't 
> look at it. 8)

You aren't proposing to damage the free reserve, you are proposing to
damage the distribution of the block allocation hash.

A disk becomes "fragmented" when you have hash collisions in the
block allocation hash.
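
To put a rough number on the collision argument, here is a minimal
sketch. It assumes block placement can be idealised as uniform hashing
into a table of disk blocks, which is an analogy and not what the FFS
allocator literally does. Under that assumption the expected number of
probes needed to find a free slot is about 1 / (1 - fill). The function
name is made up for illustration.

```python
def expected_probes(fill):
    """Expected probes to find a free slot under idealised uniform hashing.

    `fill` is the fraction of the table (disk) already in use, 0 <= fill < 1.
    This is the textbook open-addressing estimate, used here only as an
    analogy for allocation collisions; it is not taken from the FFS code.
    """
    if not 0.0 <= fill < 1.0:
        raise ValueError("fill must be in [0, 1)")
    return 1.0 / (1.0 - fill)


if __name__ == "__main__":
    # Compare a few fill levels around the reserve values discussed above.
    for fill in (0.50, 0.85, 0.90, 0.99):
        print(f"{fill:.0%} full: about {expected_probes(fill):.1f} probes")
```

The point of the sketch is only the shape of the curve: the cost of a
collision-free placement climbs steeply as the last few percent of the
disk fill up.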


> > In effect, when the FS picks a block (or, more correctly, a cluster),
> > it is hashing the filespace onto the disk.
> 
> Unfortunately, in the case of directories this results in them being 
> scattered all over the disk.  If you're creating a pile of them all at 
> once in a hierarchy, the net result is very poor locality of reference.

Yes.  Agreed.  But the locality is *intentionally* horizontal, by
design.
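
A toy sketch of that horizontal spread, assuming the classic placement
rule as I understand it: a new directory goes to the cylinder group with
the fewest directories among those with at least the average number of
free inodes. The class and field names are made up for illustration, and
this is a model of the policy, not the kernel code.

```python
from dataclasses import dataclass


@dataclass
class CylinderGroup:
    ndir: int    # directories already allocated in this group
    nifree: int  # free inodes remaining in this group


def pick_dir_cg(groups):
    """Return the index of the group a new directory would land in."""
    avg_ifree = sum(g.nifree for g in groups) // len(groups)
    best, min_ndir = 0, float("inf")
    for i, g in enumerate(groups):
        # Only groups with at least average free inodes are candidates;
        # among those, the one with the fewest directories wins.
        if g.nifree >= avg_ifree and g.ndir < min_ndir:
            best, min_ndir = i, g.ndir
    return best


def place_dirs(groups, count):
    """Create `count` directories in a row and record where each lands."""
    placed = []
    for _ in range(count):
        i = pick_dir_cg(groups)
        groups[i].ndir += 1
        groups[i].nifree -= 1
        placed.append(i)
    return placed


if __name__ == "__main__":
    cgs = [CylinderGroup(0, 100) for _ in range(3)]
    print(place_dirs(cgs, 6))
```

On an empty three-group layout, directories created back to back rotate
through the groups rather than staying together, which is the scattering
being complained about.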


> Associating locality and time of creation is not a wonderful algorithm, 
> I freely admit.  However, I think that it has some merit over the 
> current approach (forcibly minimise locality under some circumstances).

I think changing the ports to populate directory hierarchies breadth-first
would result in better performance for this particular use.
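
As a minimal sketch of what breadth-first population could look like:
create every directory at depth 1, then every directory at depth 2, and
so on, before any files are written. The function name and the example
paths are hypothetical, and whether this ordering actually helps on a
given filesystem is untested here.

```python
import os


def mkdirs_breadth_first(root, rel_dirs):
    """Create rel_dirs under root level by level; return the creation order."""
    wanted = set()
    for d in rel_dirs:
        parts = [p for p in d.split("/") if p]
        # Include every ancestor so each level is created explicitly.
        for i in range(1, len(parts) + 1):
            wanted.add("/".join(parts[:i]))
    # Shallowest paths first; ties broken by name for a stable order.
    order = sorted(wanted, key=lambda p: (p.count("/"), p))
    for p in order:
        os.makedirs(os.path.join(root, p), exist_ok=True)
    return order


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        for path in mkdirs_breadth_first(tmp, ["a/x", "b/y", "a/z"]):
            print(path)
```

A depth-first extraction would instead finish one subtree before starting
its sibling, so the order in which directories are created differs even
though the final tree is the same.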

The problem with perturbing the hash is that you will pessimize the
case where you are creating files in a single directory over time.

In the general case, you tend to use all files in a given directory
at a time, regardless of their creation date.


Maybe we should note that the way directories do block I/O is not
the same as the way files do block I/O.  According to the comments,
this behaviour is an intentional "play it safe" move on the part of
the author.


> If you have the stuff set up to test, I'd really be interested to see 
> if the clustering I proposed demonstrated any effects.

I don't have the equipment at this point to be able to do a reasonable
job of testing.  Certainly nothing I'd hang the outcome of an "is this
a good idea?" question on... 8-(.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
