Date: Wed, 02 Sep 1998 23:35:07 +0000
From: Mike Smith <mike@smith.net.au>
To: Terry Lambert <tlambert@primenet.com>
Cc: mike@smith.net.au (Mike Smith), cmascott@world.std.com, hackers@FreeBSD.ORG
Subject: Re: Reading/writing /usr/ports is VERY slow
Message-ID: <199809022335.XAA00706@word.smith.net.au>
In-Reply-To: Your message of "Thu, 03 Sep 1998 02:35:32 GMT." <199809030235.TAA07404@usr07.primenet.com>

> > This is only useful if the layout has clustered related directories on
> > the disk.  The current code does a good job of keeping them a long way
> > apart.
> >
> > > In fact, directory locality is loosely assured.
> >
> > Almost the opposite is true, especially if you don't make a common
> > practice of deleting lots of directories.
>
> I think we are miscommunicating here.
>
> Directory *contents* are loosely assured locality.
>
> Directory *hierarchies* are loosely assured near-perfect random
> non-locality.

I seem to recall that's what I said last time.

> Per the "find" command argument, this issue is "breadth-first" vs.
> "depth-first".

You need three dimensions; breadth is distance across siblings, depth
is distance from parent through children, and width is individual
directory size.

Directories are associated width-first, not breadth-first or
depth-first.  Both breadth and depth are pessimised.

> Either reimplement the directory code as a btree, or, better, "avoid
> writing code that results in directory recursions".

The first has the effect of clustering directory data.  Funnily enough,
that's what I was attempting to achieve.

> I think the ports problem in this respect is that the directory
> entry cache is too small for the traversal being done.

No, the ports problem is that all directories in the hierarchy will be
visited, and you have to seek a long way from any one entry to almost
any other entry.

> Basically, "some idiot is using the FS directory hierarchy as if it
> were a database index".  8-).

Feel free to suggest an alternative.  Code complete, of course.

> The correct thing to do here would probably be to create a "fast find"
> index, and have find use this (SunOS 4.1 has this; you create the
> index at 3 am from cron -- 3am because that's when you break to go
> to Naugles for egg-and-bean burritos 8-)).
>
> Alternately, a "portfinder" port would help...

You're still only reading what you want to see.
This sort of search is already trivialised; we keep an index
(ports/INDEX), and have both a commandline ('make search key=') and
graphical (sysutils/pib) search tool.

With the ports collection, the issue is creation and backups.  The same
problem hits other applications though (anything involving the CVS
repository, for example).

-- 
\\  Sometimes you're ahead,       \\  Mike Smith
\\  sometimes you're behind.      \\  mike@smith.net.au
\\  The race is long, and in the  \\  msmith@freebsd.org
\\  end it's only with yourself.  \\  msmith@cdrom.com

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
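
The index idea discussed in the message (walk the tree once, then search the
saved list instead of revisiting every directory) can be sketched roughly as
follows.  This is only an illustration under stated assumptions: it does not
reproduce the ports/INDEX format, 'make search', or the SunOS "fast find"
database, and the names build_index and search_index are hypothetical.

```python
import os


def build_index(root):
    """Walk the hierarchy once and return a sorted list of directory paths.

    Hypothetical helper: stands in for an index that would be rebuilt
    periodically (for example from cron) rather than on every search.
    Paths are stored relative to `root`.
    """
    index = []
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            full = os.path.join(dirpath, name)
            index.append(os.path.relpath(full, root))
    index.sort()
    return index


def search_index(index, key):
    """Return the indexed paths containing `key` (case-insensitive).

    The search touches only the in-memory list, so it involves no
    directory traversal and therefore no seeking around the disk.
    """
    needle = key.lower()
    return [path for path in index if needle in path.lower()]
```

The expensive traversal happens once in build_index; each later lookup
reads only the flat list.  As the message notes, this helps searching but
does nothing for creating or backing up the tree, where every directory
still has to be visited.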