From owner-freebsd-fs  Thu Apr 24 10:39:10 1997
Return-Path: <owner-fs>
Received: (from root@localhost)
          by hub.freebsd.org (8.8.5/8.8.5) id KAA19851
          for fs-outgoing; Thu, 24 Apr 1997 10:39:10 -0700 (PDT)
Received: from haldjas.folklore.ee (Haldjas.folklore.ee [193.40.6.121])
          by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id KAA19644
          for <fs@freebsd.org>; Thu, 24 Apr 1997 10:34:50 -0700 (PDT)
Received: from localhost (narvi@localhost)
          by haldjas.folklore.ee (8.8.4/8.8.4) with SMTP
	  id UAA18267; Thu, 24 Apr 1997 20:24:13 +0300 (EEST)
Date: Thu, 24 Apr 1997 20:24:11 +0300 (EEST)
From: Narvi <narvi@haldjas.folklore.ee>
To: David Greenman <dg@root.com>
cc: Poul-Henning Kamp <phk@dk.tfs.com>, Michael Hancock <michaelh@cet.co.jp>,
        fs@freebsd.org
Subject: Re: the namei cache... 
In-Reply-To: <199704241208.FAA09111@root.com>
Message-ID: <Pine.BSF.3.95.970424200621.17927A-100000@haldjas.folklore.ee>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-fs@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk



On Thu, 24 Apr 1997, David Greenman wrote:

> >>With hashing you can work on the hashing algorithm.  Btw, what is the hash
> >>key now?  vp+name[20]?
> >
> >directory vnode v_id + the regular namei hash.
> 
>    Calculating the hash is expensive since it involves doing a byte sum of
> all of the characters in the name, adding in the directory v_id, and dividing
> by a prime number. There's got to be a better way. I haven't carefully read
 
How about adding long values instead of byte values? We may have to keep
up to 3 bytes of additional space (to avoid overruns), but the sum gets
done in about 1/4 of the additions. Getting rid of the div wouldn't be
bad either, but might not be worth it.
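
A rough userland sketch of the two variants, just to make the idea
concrete. This is not the kernel code: the function names, the use of
uint32_t for "long", and the prime NCHASH_PRIME are all made up for
illustration. Instead of padding the name buffer by up to 3 bytes, the
sketch copies the tail into a zeroed word, which avoids the overrun in a
different way.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical prime; the real divisor is not given in this thread. */
#define NCHASH_PRIME 1021u

/* Byte-at-a-time sum, as described in the quoted text: sum the name
 * bytes, add the directory id, reduce modulo a prime. */
static uint32_t
hash_bytes(const char *name, size_t len, uint32_t dir_id)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < len; i++)
		sum += (unsigned char)name[i];
	return (sum + dir_id) % NCHASH_PRIME;
}

/* Word-at-a-time sum: roughly a quarter of the additions. */
static uint32_t
hash_words(const char *name, size_t len, uint32_t dir_id)
{
	uint32_t sum = 0, w;
	size_t i = 0;

	/* Full 4-byte words; memcpy sidesteps alignment problems. */
	for (; i + sizeof(w) <= len; i += sizeof(w)) {
		memcpy(&w, name + i, sizeof(w));
		sum += w;
	}
	/* 1 to 3 leftover bytes: zero-fill rather than read past the end. */
	if (i < len) {
		w = 0;
		memcpy(&w, name + i, len - i);
		sum += w;
	}
	return (sum + dir_id) % NCHASH_PRIME;
}
```

The two functions do not return the same values for the same name, and
hash_words() depends on the machine's byte order. That should not matter
for an in-memory cache, as long as inserts and lookups use the same
function.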

	Sander

> what Poul-Henning is proposing, but the gist I got was that he intends to
> hang the file name cache entries off of the directory vnode? It seems to
> me that this works well for small directories and poorly for large
> directories unless you construct a binary tree and keep the entries sorted.
> Then the problem becomes managing the tree (keeping it balanced, etc), which
> itself can be expensive...although lookups happen far more often than creates
> and deletes.
>    It's interesting that the single largest CPU consumer on wcarchive appears
> to be the namei cache lookups. I think the hash algorithm needs to be
> re-visited at the very least, perhaps changing the divide by a prime into
> some sort of xor of the filename characters (xored in pairs as int16's).
> 
> -DG
> 
> David Greenman
> Core-team/Principal Architect, The FreeBSD Project
>
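
For comparison, the xor-in-pairs idea from the quoted text could look
something like the sketch below. Again this is only an illustration under
assumptions: hash_xor16() and NCHASH_SIZE are invented names, and masking
with a power-of-two table size in place of the divide is my assumption,
not something proposed above in that form.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table size; must be a power of two for the mask to work. */
#define NCHASH_SIZE 1024u

/* Xor the name together two characters at a time as 16-bit values,
 * mix in the directory id, and mask instead of dividing by a prime. */
static uint32_t
hash_xor16(const char *name, size_t len, uint32_t dir_id)
{
	uint16_t acc = 0;
	size_t i = 0;

	/* Build each pair explicitly so the result does not depend on
	 * the machine's byte order. */
	for (; i + 1 < len; i += 2)
		acc ^= (uint16_t)((unsigned char)name[i] |
		    ((unsigned char)name[i + 1] << 8));
	/* Odd-length name: fold in the last character on its own. */
	if (i < len)
		acc ^= (unsigned char)name[i];
	return (acc ^ dir_id) & (NCHASH_SIZE - 1);
}
```

One thing the sketch makes visible: identical pairs cancel, so a name
like "abab" xors down to zero before the directory id is mixed in. A
plain xor is cheap but may cluster badly on repetitive names.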