Date: Wed, 02 Sep 1998 23:35:07 +0000
From: Mike Smith <mike@smith.net.au>
To: Terry Lambert <tlambert@primenet.com>
Cc: mike@smith.net.au (Mike Smith), cmascott@world.std.com, hackers@FreeBSD.ORG
Subject: Re: Reading/writing /usr/ports is VERY slow
Message-ID: <199809022335.XAA00706@word.smith.net.au>
In-Reply-To: Your message of "Thu, 03 Sep 1998 02:35:32 GMT." <199809030235.TAA07404@usr07.primenet.com>
> > This is only useful if the layout has clustered related directories on
> > the disk. The current code does a good job of keeping them a long way
> > apart.
> >
> > > In fact, directory locality is loosely assured.
> >
> > Almost the opposite is true, especially if you don't make a common
> > practice of deleting lots of directories.
>
> I think we are miscommunicating here.
>
> Directory *contents* are loosely assured locality.
>
> Directory *hierarchies* are loosely assured near-perfect random
> non-locality.
I seem to recall that's what I said last time.
> Per the "find" command argument, this issue is "breadth-first" vs.
> "depth-first".
You need three dimensions; breadth is distance across siblings, depth
is distance from parent through children, and width is individual
directory size.
Directories are associated width-first, not breadth-first or
depth-first. Both breadth and depth are pessimised.
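To make the three dimensions concrete, here is a rough sketch that measures them for an existing tree. The function name tree_dimensions and the exact definitions (maximum rather than average values) are assumptions made for illustration; they are not taken from any FreeBSD tool.

```python
import os

def tree_dimensions(root):
    """Return (depth, breadth, width) for the directory tree under root.

    depth   = longest parent-to-child chain of directories below root
    breadth = largest number of sibling subdirectories in any one directory
    width   = largest number of entries (files plus subdirectories)
              in any one directory
    """
    depth = breadth = width = 0
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # The root itself is level 0; each path component below it adds 1.
        level = 0 if rel == os.curdir else rel.count(os.sep) + 1
        depth = max(depth, level)
        breadth = max(breadth, len(dirnames))
        width = max(width, len(dirnames) + len(filenames))
    return depth, breadth, width
```

A ports-style tree would be expected to score high on breadth and low on width, which is the shape that gains least from width-first association.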
> Either reimplement the directory code as a btree, or, better, "avoid
> writing code that results in directory recursions".
The first has the effect of clustering directory data. Funnily enough,
that's what I was attempting to achieve.
> I think the ports problem in this respect is that the directory
> entry cache is too small for the traversal being done.
No, the ports problem is that all directories in the hierarchy will be
visited, and you have to seek a long way from any one entry to almost
any other entry.
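One crude way to see this on a live filesystem is to compare the inode number of each directory with that of its parent. This is only a sketch: it assumes an FFS-style layout where inode numbers are handed out per cylinder group, so a large gap suggests (but does not prove) a long seek. The name inode_gaps is made up for the example.

```python
import os

def inode_gaps(root):
    """List (child_path, gap) for every subdirectory entry under root.

    gap is the absolute difference between the child's inode number and
    its parent's. Assumption: on an FFS-style filesystem this is a rough
    proxy for on-disk distance; on other filesystems it may mean nothing.
    """
    gaps = []
    for dirpath, dirnames, _ in os.walk(root):
        parent_ino = os.lstat(dirpath).st_ino
        for name in dirnames:
            child = os.path.join(dirpath, name)
            # lstat so that a symlink to a directory is not followed.
            gaps.append((child, abs(os.lstat(child).st_ino - parent_ino)))
    return gaps
```

Sorting the result by gap and looking at the median gives a quick feel for how scattered a hierarchy is, without touching the filesystem code.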
> Basically, "some idiot is using the FS directory hierarchy as if it
> were a database index". 8-).
Feel free to suggest an alternative. Code complete, of course.
> The correct thing to do here would probably be to create a "fast find"
> index, and have find use this (SunOS 4.1 has this; you create the
> index at 3 am from cron -- 3am because that's when you break to go
> to Naugles for egg-and-bean burritos 8-)).
>
> Alternately, a "portfinder" port would help...
You're still only reading what you want to see. This sort of search is
already trivialised; we keep an index (ports/INDEX), and have both a
commandline ('make search key=') and graphical (sysutils/pib) search
tool.
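For illustration, an index lookup of this kind needs no directory traversal at all. The sketch below assumes the index is one line per port with '|'-separated fields, the first being the package name, the second the port path and the fourth a one-line comment; check that layout against a real INDEX before relying on it. search_index is a hypothetical helper, not how 'make search' is implemented.

```python
def search_index(lines, key):
    """Return (name, path, comment) for index lines matching key.

    Assumed layout per line: name|path|prefix|comment|...
    Matching is a case-insensitive substring test on name and comment.
    """
    key = key.lower()
    hits = []
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 4:
            continue  # skip lines that do not fit the assumed layout
        name, path, comment = fields[0], fields[1], fields[3]
        if key in name.lower() or key in comment.lower():
            hits.append((name, path, comment))
    return hits
```

It would be called with an open file, for example search_index(open("/usr/ports/INDEX"), "editor"), which reads one file sequentially instead of visiting every port directory.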
With the ports collection, the issue is creation and backups. The same
problem hits other applications though (anything involving the CVS
repository, for example).
--
\\ Sometimes you're ahead, \\ Mike Smith
\\ sometimes you're behind. \\ mike@smith.net.au
\\ The race is long, and in the \\ msmith@freebsd.org
\\ end it's only with yourself. \\ msmith@cdrom.com
