Date:      Sat, 26 May 2001 15:42:29 -0700
From:      Peter Wemm <peter@wemm.org>
To:        areilly@bigpond.net.au (Andrew Reilly), hackers@FreeBSD.ORG
Subject:   Re: technical comparison 
Message-ID:  <20010526224229.B4B97380E@overcee.netplex.com.au>
In-Reply-To: <200105262214.CAA21056@aaz.links.ru> 

.@babolo.ru wrote:
> Andrew Reilly writes:
> ....
> > /usr/ports/distfiles on any of the mirrors probably contains
> > upwards of 5000 files too, and there is a strong likelihood that
> > these will be accessed out-of-order by ports-makefile-driven
> > fetch requests.
> Oh!
> You point a good example!
> 0cicuta~(13)>/bin/ls /usr/ports/distfiles/ | wc
>     9672    9672  198244

Those entries are almost entirely stored in the name cache, which is hashed.  Once
you scan the directory for the first time, the entries are pre-inserted
into the hash.  This cache is very long lived and is quite effective at
dealing with this sort of thing, especially if you have plenty of memory
and have vfs.vmiodirenable=1 turned on.  While it may not scale too well to
directories with millions of files, it certainly deals well with tens of
thousands of files.  We have recently made improvements to the hashing
algorithms to get better dispersion on short, sequential filenames, e.g.
00, 01, 02, ... FF.
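
To illustrate why dispersion matters for names like 00 ... FF, here is a small
sketch comparing a deliberately weak additive hash with FNV-1a.  Both functions
are illustrative assumptions only: the message does not say which algorithm the
kernel used before or after the change, and the 256-bucket table size is
arbitrary.

```python
from collections import Counter


def additive_hash(name: bytes) -> int:
    """Deliberately weak hash: sum of the bytes (illustration only)."""
    return sum(name)


def fnv1a_32(name: bytes) -> int:
    """32-bit FNV-1a, used here only as a well-known example of a mixing hash."""
    h = 0x811C9DC5                          # FNV offset basis
    for byte in name:
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF   # FNV prime, truncated to 32 bits
    return h


def bucket_stats(names, hash_fn, buckets=256):
    """Return (buckets in use, longest chain) for the given names."""
    counts = Counter(hash_fn(n) % buckets for n in names)
    return len(counts), max(counts.values())


# The iterative filenames from the message: 00, 01, 02, ... FF
names = [f"{i:02X}".encode() for i in range(256)]

if __name__ == "__main__":
    for fn in (additive_hash, fnv1a_32):
        used, longest = bucket_stats(names, fn)
        print(f"{fn.__name__}: {used} buckets used, longest chain {longest}")
```

The additive hash maps many of these names to the same few buckets, because
"19" and "91" sum to the same value; a hash that mixes position into the result
spreads them over far more of the table.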

It is not perfect, but it is a hell of a lot better than the false
assumption that the linear search method is the usual case.
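
As a rough sketch of the idea (a toy model, not the FreeBSD code; the class and
method names below are made up for illustration), a simple linear directory
with an in-memory hash in front of it might look like this:

```python
class LinearDirectory:
    """Toy stand-in for a simple on-disk directory: an unsorted entry list."""

    def __init__(self, entries):
        self.entries = list(entries)   # (name, inode) pairs
        self.comparisons = 0           # name comparisons done by linear lookups

    def scan(self):
        """Read every entry once, as a directory listing would."""
        yield from self.entries

    def lookup(self, name):
        """Linear search: cost grows with the number of entries."""
        for entry_name, inode in self.entries:
            self.comparisons += 1
            if entry_name == name:
                return inode
        return None


class NameCache:
    """Toy in-memory hash in front of a LinearDirectory (hypothetical API)."""

    def __init__(self, directory):
        self.directory = directory
        self.table = {}                # a Python dict is a hash table
        self.hits = 0
        self.misses = 0

    def readdir(self):
        """Scanning the directory pre-inserts every entry into the hash."""
        names = []
        for name, inode in self.directory.scan():
            self.table[name] = inode
            names.append(name)
        return names

    def lookup(self, name):
        """Try the hash first; fall back to the linear search on a miss."""
        if name in self.table:
            self.hits += 1
            return self.table[name]
        self.misses += 1
        inode = self.directory.lookup(name)
        if inode is not None:
            self.table[name] = inode
        return inode
```

After one readdir() over a 9672-entry directory, out-of-order lookups are
answered from the hash without touching the linear list at all.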

Which is more expensive?  Maintaining an on-disk hashed (or b+tree)
directory format for *everything* or maintaining a simple low-cost format
on disk with in-memory hashing for fast lookups?  For the small directory
case I suspect the FFS+namecache way is more cost effective.  For the
medium to large directory case (10,000 to 100,000 entries), I suspect the
FFS+namecache method isn't too shabby, providing you are not starved for
memory.  For the insanely large cases - I don't want to think about it :-).

Cheers,
-Peter
--
Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au
"All of this is for nothing if we don't go to the stars" - JMS/B5

