Date: Sat, 26 May 2001 15:42:29 -0700
From: Peter Wemm <peter@wemm.org>
To: areilly@bigpond.net.au (Andrew Reilly), hackers@FreeBSD.ORG
Subject: Re: technical comparison
Message-ID: <20010526224229.B4B97380E@overcee.netplex.com.au>
In-Reply-To: <200105262214.CAA21056@aaz.links.ru>

.@babolo.ru wrote:
> Andrew Reilly writes:
> ....
> > /usr/ports/distfiles on any of the mirrors probably contains
> > upwards of 5000 files too, and there is a strong likelihood that
> > these will be accessed out-of-order by ports-makefile-driven
> > fetch requests.
> Oh!
> You point a good example!
> 0cicuta~(13)>/bin/ls /usr/ports/distfiles/ | wc
>     9672    9672  198244
..

Which is almost entirely stored in the name cache, which is hashed.
Once you scan the directory for the first time, the entries are
pre-inserted into the hash. This cache is very long lived and is quite
effective at dealing with this sort of thing, especially if you have
plenty of memory and have vfs.vmiodirenable=1 turned on.

While it may not scale too well to directories with millions of files,
it certainly deals well with tens of thousands of files. We have
recently made improvements to the hashing algorithms to get better
dispersion on small and iterative filenames, e.g. 00, 01, 02 -> FF.

It is not perfect, but it is a hell of a lot better than the false
assumption that the linear search method is the usual case.

Which is more expensive? Maintaining an on-disk hashed (or b+tree)
directory format for *everything*, or maintaining a simple low-cost
format on disk with in-memory hashing for fast lookups?

For the small directory case I suspect the FFS+namecache way is more
cost effective. For the medium to large directory case (10,000 to
100,000 entries), I suspect the FFS+namecache method isn't too shabby,
provided you are not starved for memory. For the insanely large
cases - I don't want to think about it :-).

Cheers,
-Peter
--
Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au
"All of this is for nothing if we don't go to the stars" - JMS/B5

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
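
Illustrative sketch (not part of the original message): the trade-off
described above, a simple linear on-disk directory format combined with
an in-memory hash for lookups, can be modelled roughly as follows. This
is a toy under stated assumptions, not the FreeBSD name cache: the
class name NameCache and its scans counter are hypothetical, a Python
set stands in for the kernel's hash table, and invalidation and memory
pressure are ignored entirely.

```python
import os
import tempfile


class NameCache:
    """Toy model: one linear directory scan, then hashed lookups."""

    def __init__(self, path):
        self.path = path
        self.entries = None  # set of names, filled on the first scan
        self.scans = 0       # how many full linear scans were needed

    def _scan(self):
        # Single linear pass over the directory; every entry seen is
        # pre-inserted into the hash (a Python set in this sketch).
        self.scans += 1
        self.entries = set(os.listdir(self.path))

    def lookup(self, name):
        # The first lookup pays for the scan; later lookups are
        # average-case O(1) hash probes, whatever the access order.
        if self.entries is None:
            self._scan()
        return name in self.entries


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # Small, iterative filenames like those in the message: 00 .. FF.
        for i in range(256):
            open(os.path.join(d, "%02X" % i), "w").close()
        cache = NameCache(d)
        for name in ("7F", "00", "FF", "GG"):
            print(name, cache.lookup(name))
        print("linear scans:", cache.scans)
```

In this model, out-of-order lookups cost one directory scan in total
rather than one scan each, which is the behaviour the message argues
makes the linear on-disk format acceptable for tens of thousands of
entries when memory is plentiful.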