Date:      Wed, 02 Sep 1998 23:35:07 +0000
From:      Mike Smith <mike@smith.net.au>
To:        Terry Lambert <tlambert@primenet.com>
Cc:        mike@smith.net.au (Mike Smith), cmascott@world.std.com, hackers@FreeBSD.ORG
Subject:   Re: Reading/writing /usr/ports is VERY slow 
Message-ID:  <199809022335.XAA00706@word.smith.net.au>
In-Reply-To: Your message of "Thu, 03 Sep 1998 02:35:32 GMT." <199809030235.TAA07404@usr07.primenet.com> 

> > This is only useful if the layout has clustered related directories on 
> > the disk.  The current code does a good job of keeping them a long way 
> > apart.
> > 
> > > In fact, directory locality is loosely assured.
> > 
> > Almost the opposite is true, especially if you don't make a common 
> > practice of deleting lots of directories.
> 
> I think we are miscommunicating here.
> 
> Directory *contents* are loosely assured locality.
> 
> Directory *hierarchies* are loosely assured near-perfect random
> non-locality.

I seem to recall that's what I said last time.

> Per the "find" command argument, this issue is "breadth-first" vs.
> "depth-first".

You need three dimensions: breadth is distance across siblings, depth 
is distance from parent through children, and width is the size of an 
individual directory.  

Directories are associated width-first, not breadth-first or 
depth-first.  Both breadth and depth are pessimised.

> Either reimplement the directory code as a btree, or, better, "avoid
> writing code that results in directory recursions".

The first has the effect of clustering directory data.  Funnily enough, 
that's what I was attempting to achieve.

> I think the ports problem in this respect is that the directory
> entry cache is too small for the traversal being done.

No, the ports problem is that all directories in the hierarchy will be 
visited, and you have to seek a long way from any one entry to almost 
any other entry.
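
A toy model makes the seek cost concrete.  The sketch below is only an
illustration under stated assumptions: the tree shape, the number of
cylinder groups, and the names build_tree, total_seek and
compare_layouts are all hypothetical, and "seek distance" is just the
difference in group number between consecutively visited directories.
It is not a model of the real allocator.

```python
import random


def build_tree(depth, fanout):
    """Return every directory of a hypothetical tree in depth-first order."""
    paths = []

    def walk(prefix, level):
        paths.append(prefix)
        if level == depth:
            return
        for i in range(fanout):
            walk(prefix + (i,), level + 1)

    walk((), 0)
    return paths


def total_seek(order, position):
    """Sum the group-number distance between consecutively visited directories."""
    return sum(abs(position[a] - position[b]) for a, b in zip(order, order[1:]))


def compare_layouts(depth=3, fanout=8, groups=64, seed=0):
    """Return (scattered_cost, clustered_cost) for a full traversal."""
    order = build_tree(depth, fanout)
    rng = random.Random(seed)

    # Assumption: "scattered" places each directory in a random group,
    # standing in for a policy that spreads directories far apart.
    scattered = {d: rng.randrange(groups) for d in order}

    # Assumption: "clustered" packs directories into groups in traversal
    # order, so neighbours in the hierarchy are neighbours on disk.
    per_group = -(-len(order) // groups)  # ceiling division
    clustered = {d: i // per_group for i, d in enumerate(order)}

    return total_seek(order, scattered), total_seek(order, clustered)


if __name__ == "__main__":
    scattered_cost, clustered_cost = compare_layouts()
    print("scattered:", scattered_cost, "clustered:", clustered_cost)
```

In this model the clustered layout only ever moves forward through the
groups, while the scattered layout pays a long seek on almost every
step, which is the behaviour described above for a walk that visits
every directory.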

> Basically, "some idiot is using the FS directory hierarchy as if it
> were a database index".  8-).

Feel free to suggest an alternative.  Code complete, of course.

> The correct thing to do here would probably be to create a "fast find"
> index, and have find use this (SunOS 4.1 has this; you create the
> index at 3 am from cron -- 3am because that's when you break to go
> to Naugles for egg-and-bean burritos 8-)).
>
> Alternately, a "portfinder" port would help...

You're still only reading what you want to see.  This sort of search is 
already trivialised; we keep an index (ports/INDEX), and have both a 
command-line ('make search key=') and graphical (sysutils/pib) search 
tool.
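
For illustration, a keyword search over such an index can be sketched
as below.  This is a simplified model, not how 'make search' is
implemented: it assumes each index line is '|'-separated with the port
name in the first field, its path in the second and a one-line comment
in the fourth, and the function name search_index and the sample
entries are made up for the example.

```python
def search_index(lines, key):
    """Return (name, path, comment) for entries whose name or comment contains key."""
    key = key.lower()
    hits = []
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 4:
            continue  # skip lines that do not fit the assumed layout
        name, path, comment = fields[0], fields[1], fields[3]
        if key in name.lower() or key in comment.lower():
            hits.append((name, path, comment))
    return hits


if __name__ == "__main__":
    # Hypothetical entries in the assumed layout, not real index data.
    sample = [
        "foo-1.0|/usr/ports/misc/foo|/usr/local|A sample editor|...",
        "bar-2.1|/usr/ports/net/bar|/usr/local|Network frobnicator|...",
    ]
    for name, path, comment in search_index(sample, "editor"):
        print(name, path, comment)
```

The point is that a lookup of this kind reads one flat file
sequentially and never touches the directory hierarchy.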

With the ports collection, the issue is creation and backups.  The same
problem hits other applications though (anything involving the CVS
repository, for example). 

-- 
\\  Sometimes you're ahead,       \\  Mike Smith
\\  sometimes you're behind.      \\  mike@smith.net.au
\\  The race is long, and in the  \\  msmith@freebsd.org
\\  end it's only with yourself.  \\  msmith@cdrom.com
