Date:      Fri, 25 Apr 1997 10:11:21 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        phk@dk.tfs.com (Poul-Henning Kamp)
Cc:        bakul@torrentnet.com, phk@dk.tfs.com, hackers@freebsd.org
Subject:   Re: the namei cache...
Message-ID:  <199704251711.KAA03608@phaeton.artisoft.com>
In-Reply-To: <2038.861948587@critter> from "Poul-Henning Kamp" at Apr 25, 97 08:09:47 am

> >This is analogous to one of the ways one implements a symbol table
> >in a lexically scoped language processing program.
> 
> But these programs don't work with a finite, bounded number of
> entries, so reuse policies don't matter to them.

I don't necessarily agree with this... he's not suggesting B-trees
or something else that needs balancing.


> >Scaling.  Directories with 100+ entries are not uncommon.  Even
> >/usr/include and /usr/include/sys have over 100 entries each.
> 
> You obviously don't know how the name cache operates.  Only names
> you look up end up in the cache; it's not the entire directory
> that gets put into the cache (unless you do an "ls -l", that is).

This is true.  A cache entry is not made unless the inode is brought
into core, a vnode is associated with the inode, and the vnode and
name are associated in the cache.

One exception: negative cache entries.  They are created on name
lookup misses that do not result in vnode hits.  You *could* end up
with a lot of these.
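
To make that concrete, here's a user-space toy (not the real 4.4BSD
struct namecache; the field names and layout here are made up)
showing how an entry maps a (directory vnode, component name) pair
to a result vnode, with NULL standing in for a negative entry:

#include <string.h>

struct vnode { int v_id; };             /* stand-in for the real thing */

struct nc_entry {
        struct vnode *nc_dvp;           /* parent directory vnode */
        struct vnode *nc_vp;            /* result; NULL == negative entry */
        char          nc_name[64];      /* component name */
};

/*
 * Returns 1 on a hit (positive *or* negative), 0 on a miss.  A hit
 * with *vpp == NULL means "we already know this name doesn't exist",
 * so the caller can fail with ENOENT without reading the directory.
 */
int
nc_lookup(struct nc_entry *tab, int n, struct vnode *dvp,
    const char *name, struct vnode **vpp)
{
        int i;

        for (i = 0; i < n; i++) {
                if (tab[i].nc_dvp == dvp &&
                    strcmp(tab[i].nc_name, name) == 0) {
                        *vpp = tab[i].nc_vp;
                        return (1);
                }
        }
        return (0);
}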

In some locality-of-reference models, it's actually valuable to
fault all the related inodes into core.  This is less expensive
on systems that use device/offset rather than vnode/offset page
identification, since you end up with 4k/128, or 32, of the things
in core anyway.
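
The arithmetic, assuming the traditional 4K page and 128-byte
on-disk FFS dinode:

#define PAGE_SIZE       4096    /* bytes per VM page (assumed) */
#define DINODE_SIZE      128    /* bytes per on-disk FFS dinode */

/* Fault one inode in and its whole pageful comes along for free. */
int inodes_per_page = PAGE_SIZE / DINODE_SIZE;  /* 4096/128 == 32 */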

This is typically a win if your VFS consumer is going to return
inode data with the directory entry.  For high-latency links (NFS),
this is valuable.  It's also valuable for lookup/open interfaces,
such as those exported by SMB, ATP, and NCP servers.


> >once encountered a directory with 2,000,000+ entries!  One does not
> >optimize for such border cases but it is nice when they are handled
> >effortlessly as a byproduct of a design decision.
>
> For that case there will be no difference at all, even wcarchive only
> has 44000 entries in the cache.

In fairness, I think Bakul's claim doesn't hold water here, if you
define "abusing the FS directory structure as a database" as bad
practice.  Of course, until you get rid of terminfo, that may be
hard to do.  8-).


> >A dynamically growing hashtable is a must.
> 
> Hello Houston?  We have lost gravity!  Of course we can't do that
> in the kernel.  Memory is way too expensive.

???

The name cache already uses a hash table.  The difference is really
computational, isn't it: reordering the hash lists on growth.  The
hash *should* be proportional to the number of buckets, and if you
grow the buckets, it makes sense to grow the hash.  For something
like the name cache, you can live with it not being terribly sparse;
you ought to implement two types of checksum, however, and use one
to hash and one to differentiate the compares, if you aren't going
to bump up the hash table size when you bump up the number of
buckets.
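
Here's a quick user-space sketch of the two-checksum idea (my own
toy code, not anything from the tree): keep the full hash of the
name in each entry; the low bits pick the bucket, the full value
screens out most false matches before the byte-wise compare, and a
grow is just re-bucketing every entry by its stored value, with no
re-hashing of the strings themselves:

#include <stdlib.h>
#include <string.h>

struct entry {
        struct entry   *next;           /* hash chain linkage */
        unsigned long   fullhash;       /* full hash of the name */
        char           *name;
};

static unsigned long
namehash(const char *s)
{
        unsigned long h = 0;

        while (*s)
                h = h * 33 + (unsigned char)*s++;
        return (h);
}

/*
 * The stored full hash is the "second checksum": it rejects almost
 * every non-matching entry in a chain before we pay for strcmp().
 */
static struct entry *
lookup(struct entry **tab, unsigned long nbuckets, const char *name)
{
        unsigned long h = namehash(name);
        struct entry *e;

        for (e = tab[h % nbuckets]; e != NULL; e = e->next)
                if (e->fullhash == h && strcmp(e->name, name) == 0)
                        break;
        return (e);
}

/*
 * Growing doubles the bucket count and re-buckets each entry by its
 * stored full hash; the name strings are never touched.
 */
static int
grow(struct entry ***tabp, unsigned long *nbucketsp)
{
        unsigned long i, newn = *nbucketsp * 2;
        struct entry **newtab = calloc(newn, sizeof(*newtab));
        struct entry *e, *next;

        if (newtab == NULL)
                return (-1);            /* keep the old table on failure */
        for (i = 0; i < *nbucketsp; i++)
                for (e = (*tabp)[i]; e != NULL; e = next) {
                        next = e->next;
                        e->next = newtab[e->fullhash % newn];
                        newtab[e->fullhash % newn] = e;
                }
        free(*tabp);
        *tabp = newtab;
        *nbucketsp = newn;
        return (0);
}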


> <SOAPBOX>
> I wish all of these "instant-fs" specialists would read up on their
> subject matter before they jump in with their misunderstandings!
> </SOAPBOX>

Bakul has always struck me as someone with a firm knowledge of
commercial kernels.  He may have pressed one of your hot buttons,
but that doesn't mean you should treat him as a clueless newbie.
Better to enlighten him (assuming you're right and he's wrong;
otherwise, better to let him enlighten you) than to shout him down.


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


