Date: Thu, 24 Apr 1997 05:56:01 -0700
From: Don Lewis <Don.Lewis@tsc.tdk.com>
To: dg@root.com, Poul-Henning Kamp <phk@dk.tfs.com>
Cc: Michael Hancock <michaelh@cet.co.jp>, fs@freebsd.org
Subject: Re: the namei cache...
Message-ID: <199704241256.FAA11674@salsa.gv.tsc.tdk.com>
In-Reply-To: David Greenman <dg@root.com> "Re: the namei cache..." (Apr 24, 5:08am)
On Apr 24, 5:08am, David Greenman wrote:
} Subject: Re: the namei cache...
} >>With hashing you can work on the hashing algorithm. Btw, what is the hash
} >>key now? vp+name[20]?
} >
} >directory vnode v_id + the regular namei hash.
}
} Calculating the hash is expensive since it involves doing a byte sum of
} all of the characters in the name, adding in the directory v_id, and dividing
} by a prime number.

It shouldn't be too bad because of the locality, so there aren't many
cache misses.

} There's got to be a better way. I haven't carefully read
} what Poul-Henning is proposing, but the gist I got was that he intends to
} hang the file name cache entries off of the directory vnode? It seems to
} me that this works well for small directories and poorly for large
} directories unless you construct a binary tree and keep the entries sorted.

Doing lots of pointer chasing is slow ...

} Then the problem becomes managing the tree (keeping it balanced, etc), which
} itself can be expensive...although lookups happen far more often than creates
} and deletes.

At the filesystem level there are far more lookups, but as uncached files
are looked up, other entries have to be kicked out of the cache, so you'll
end up with a much higher rate of addition and deletion of entries in the
cache than in the file system.

} It's interesting that the single largest CPU consumer on wcarchive appears
} to be the namei cache lookups.

It might be interesting to know how long the hash chains are and how well
balanced they are. Is the table size appropriate for the number of vnodes
that are in use?

} I think the hash algorithm needs to be
} re-visited at the very least, perhaps changing the divide by a prime into
} some sort of xor of the filename characters (xored in pairs as int16's).

Yes, most processors are really bad at integer divide.

You might look at the Perl hash. You multiply the current value by 33 and
add the next character, then mask off however many bits you want at the end
(the table size must be a power of 2). If your compiler is at all smart, it
will notice that the multiply is just a shift and an add. I suspect this
might be pretty easy to just drop in to the existing code for a quick trial.

			--- Truck
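[For concreteness, here is a minimal sketch of the two hashes being compared:
the existing sum-the-bytes-then-divide-by-a-prime scheme and the suggested
multiply-by-33-and-mask scheme. The function names, table sizes, and the
prime below are assumptions invented for illustration; the real namei cache
code in the kernel differs in detail.]

/*
 * Illustrative sketch only -- not the actual vfs_cache.c code.  The
 * function names, the table sizes, and the prime are made up.
 */
#define	NCHHASH_PRIME	251		/* assumed prime table size (old scheme) */
#define	NCHHASH_SIZE	256		/* assumed power-of-two size (new scheme) */
#define	NCHHASH_MASK	(NCHHASH_SIZE - 1)

/*
 * Old scheme: byte-sum the name, add the directory vnode's v_id,
 * and reduce with an integer divide by a prime-sized table.
 */
static unsigned int
hash_sum_mod(const char *name, int len, unsigned long v_id)
{
	unsigned int sum = 0;
	int i;

	for (i = 0; i < len; i++)
		sum += (unsigned char)name[i];
	return ((sum + v_id) % NCHHASH_PRIME);
}

/*
 * Suggested scheme: Perl-style "times 33 plus next character" hash,
 * reduced by masking against a power-of-two table size.  The multiply
 * by 33 is just (h << 5) + h, so there is no divide anywhere.
 */
static unsigned int
hash_33_mask(const char *name, int len, unsigned long v_id)
{
	unsigned int h = v_id;		/* fold the directory v_id in up front */
	int i;

	for (i = 0; i < len; i++)
		h = h * 33 + (unsigned char)name[i];
	return (h & NCHHASH_MASK);
}

[Either function would be called with the component name, its length, and the
directory vnode's v_id, and the result used to index the hash chain array;
with the power-of-two table the reduction is a single AND instead of a
divide.]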