Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 28 Apr 1997 01:37:44 -0400
From:      Bakul Shah <bakul@torrentnet.com>
To:        Poul-Henning Kamp <phk@dk.tfs.com>
Cc:        hackers@freebsd.org
Subject:   Re: the namei cache... 
Message-ID:  <199704280537.BAA27601@chai.plexuscom.com>
In-Reply-To: Your message of "Fri, 25 Apr 1997 08:09:47 %2B0200." <2038.861948587@critter> 

next in thread | previous in thread | raw e-mail | index | archive | help
> >This is analogous to one of the ways one implements a symbol table
> >in a lexically scoped language processing program.
> 
> But these programs don't work with a finite bounded number of
> entries, so reuse policies doesn't matter to them.

The bound is not so low as in the namei cache case but reuse
policies do matter.  The point of analogy was not to duplicate what
they do but to remind you to check out the range of techniques they
use.  May be some of them are useful?

> >Scaling.  Directories with 100+ entries are not uncommon.  Even
> >/usr/include and /usr/include/sys have over 100 entries each.

> You obviously don't know how the name cache operates.  Only names
> you lookup ends up in the cache, it's not the entire directory
> that gets put into the cache (unless you do a "ls -l" that is).

If anyone stats all the names in /usr/include suddenly you are doing
50+ compares on average.  Hashtables are competitive with linear
linked lists as search structures even for very small table sizes
and far superior for larger tables.  *Even* if you use a fixed size
hashtable (which *will* degenerate  to O(N) comparisons when the
number of entries is far greater than the number of buckets) you
come out ahead.  BTW, I believe AIX or some other commercial Unix
uses hashtables attached to directory entries.

As to the reuse policy, there are a number of possibilities.
[Again, I will point you to look at non OS literature: garbage
collection.]  Here is one example: you can keep an entry on two
lists: the hashtable entry for some directory and a doubly ended
list -- whenever an entry is _referenced_, it is moved to the *head*
of the second list.  Whenever a _free_ entry is needed, you take it
from the *tail* of the second list and move it to the head.  You can
also try something akin to `generational' garbage collectors (in
that a young entry is a candidate for removal.  It is `aged' only
after it is referenced a number of times).  Each solution has its
costs and benefits.  Without some profiling and *hard* staring at
the code I wouldn't know which one makes more sense.

My comment about scaling stands.

> >A dynamically growing hashtable is a must.
> 
> Hello Houston ?  We have lost gravity!  Of course we can't do that
> in the kernel.  Memory is way too expensive.

Sorry, wrong number!  No Angelica, John or Gary here.

Obviously, to grow one data structure you have to take memory from
somewhere else but you are assuming MALLOC/FREE.  If you use an
`arena', a separate memory area, you can recycle old entries and
avoid malloc.

> I wish all of these "instant-fs" specialists would read up on their
> subject matter before they jump in with their misunderstandings!

I wish all of these "instant-optimization" specialists would do some
profiling before they start hacking with their misunderstandings!

A friend of mine and I once implemented a stateless as well as a
stateful remote file system atop 4.2BSD unix for a client (this was
before Sun announced NFS).  That does not make me a FS specialist
and I haven't spent hours in the FS code since then but I am more
familiar with it than you might think.

Try not to assume that anything that does not make sense to you is
automatically meaningless.  Your comment above is especially irksome
because _you_ asked for comments.  I responded only because you are
implementing what (to me) is an obviously suboptimal solution.  Even
if you feel I am wrong *and* a clueless newbie, it behooves you
  a) to politely point out where I am wrong and why, and
  b) to thank me because I took the trouble to comment.
Ah well.

-- bakul



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199704280537.BAA27601>