Date:      Wed, 18 Oct 2006 16:40:43 +1000 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Chuck Lever <chucklever@gmail.com>
Cc:        mjacob@FreeBSD.org, fs@FreeBSD.org, Kris Kennaway <kris@obsecurity.org>
Subject:   negative cache hits for nfs (was: cvs commit: ...)
Message-ID:  <20061018153336.E72684@delplex.bde.org>
In-Reply-To: <20061017113943.C67620@delplex.bde.org>
References:  <200610140725.k9E7PC37008454@repoman.freebsd.org>  <20061014231502.GA38708@rink.nu> <20061015105809.M59123@delplex.bde.org>  <20061015051044.GA42764@xor.obsecurity.org> <20061014222221.H97880@ns1.feral.com> <20061014222437.N4701@ns1.feral.com> <20061015153454.G59979@delplex.bde.org> <76bd70e30610150837w61689cf6ya2499d100a15c3e8@mail.gmail.com>  <20061016164122.S63585@delplex.bde.org> <76bd70e30610160620x67e5d3a5j938c26744d0b9759@mail.gmail.com> <20061017113943.C67620@delplex.bde.org>

[I changed the Cc from cvs* to fs]

On Tue, 17 Oct 2006, Bruce Evans wrote:

> On Mon, 16 Oct 2006, Chuck Lever wrote:
>
>> On 10/16/06, Bruce Evans <bde@zeta.org.au> wrote:
>>> On Sun, 15 Oct 2006, Chuck Lever wrote:
>>>> [An independent timeout for the access cache isn't useful.]
>>> 
>>> I'll try removing the special support for the access cache timeout in
>>> rc.conf first.
>> 
>> OK.  I can review patches if you think that would help, but I can't
>> contribute code at the moment because of IP issues at my current
>> employer.  Hopefully that will change soon.
>
> Thanks.  Removing it in rc.conf won't require review :-).

>> ...
>> Another thing to consider is that a LOOKUP is usually more expensive
>> for servers than a GETATTR.  If your client has already cached lookup
>> results for the file to be opened, you can get away with a GETATTR on
>> the parent directory to verify that it has not changed, and that will
>> almost always be faster than doing a full LOOKUP.
>
> FreeBSD's client isn't handling Lookup very well either.  It is
> missing caching of negative lookups.  make(1) likes to do a lot of
> negative lookups...  NetBSD fixed this in 1997, sigh.

Here is a merge of some bits from NetBSD for review.  It is mostly the
1997 version, updated to use timespecs instead of time_t's.  It omits
later changes that don't seem to be related to correctness, and any
changes less than 18 months old (if there are any).

% Index: nfs_vnops.c
% ===================================================================
% RCS file: /home/ncvs/src/sys/nfsclient/nfs_vnops.c,v
% retrieving revision 1.270
% diff -u -2 -r1.270 nfs_vnops.c
% --- nfs_vnops.c	14 Oct 2006 07:25:11 -0000	1.270
% +++ nfs_vnops.c	18 Oct 2006 01:41:14 -0000
% @@ -852,7 +869,17 @@
%  		return (error);
%  	}
% -	if ((error = cache_lookup(dvp, vpp, cnp)) && error != ENOENT) {
% +	if ((error = cache_lookup(dvp, vpp, cnp)) != 0) {
%  		struct vattr vattr;
% 
% +		if (error == ENOENT) {
% +			/* Negative cache hit.  Use it unless stale. */
% +			if (VOP_GETATTR(dvp, &vattr, cnp->cn_cred, td) == 0 &&
% +			    timespeccmp(&vattr.va_mtime, &np->n_nctime, ==))
% +				return (ENOENT);
% +
% +			cache_purge(dvp);
% +			timespecclear(&np->n_nctime);
% +			goto dorpc;
% +		}
%  		newvp = *vpp;
%  		if (!VOP_GETATTR(newvp, &vattr, cnp->cn_cred, td)
% @@ -871,4 +898,5 @@
%  		*vpp = NULLVP;
%  	}
% +dorpc:
%  	error = 0;
%  	newvp = NULLVP;
% @@ -951,4 +979,11 @@
%  nfsmout:
%  	if (error) {
% +		if (error == ENOENT && (cnp->cn_flags & MAKEENTRY) &&
% +		    cnp->cn_nameiop != CREATE) {
% +			/* Negative cache entry. */
% +			if (!timespecisset(&np->n_nctime))
% +				np->n_nctime = np->n_vattr.va_mtime;
% +			cache_enter(dvp, NULL, cnp);
% +		}
%  		if (newvp != NULLVP) {
%  			vput(newvp);
% @@ -1931,6 +1966,9 @@
%  		if (newvp)
%  			vput(newvp);
% -	} else
% +	} else {
% +		if (cnp->cn_flags & MAKEENTRY)
% +			cache_enter(dvp, newvp, cnp);
%  		*ap->a_vpp = newvp;
% +	}
%  	return (error);
%  }
% Index: nfsnode.h
% ===================================================================
% RCS file: /home/ncvs/src/sys/nfsclient/nfsnode.h,v
% retrieving revision 1.59
% diff -u -2 -r1.59 nfsnode.h
% --- nfsnode.h	13 Sep 2006 18:39:09 -0000	1.59
% +++ nfsnode.h	18 Oct 2006 00:48:44 -0000
% @@ -100,4 +100,5 @@
%  	struct timespec		n_mtime;	/* Prev modify time. */
%  	time_t			n_ctime;	/* Prev create time. */
% +	struct timespec		n_nctime;	/* Last neg cache entry (dir) */
%  	time_t			n_expiry;	/* Lease expiry time */
%  	nfsfh_t			*n_fhp;		/* NFS File Handle */

For building kernels, this gives a larger speedup than everything that
I tried short of completely dropping cto consistency, provided dotdot
caching in vfs_cache.c isn't lost.  The following benchmarks are also
with zapping of the attribute cache turned off in nfs_close() to avoid
doubled Getattr's in open() without breaking cto consistency.
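
For illustration, the staleness check in the first nfs_vnops.c hunk can be
modelled in user space roughly as follows.  This is only a sketch under
stated assumptions, not kernel code: struct dir_state, neg_enter() and
neg_lookup() are made-up names, a single cached name stands in for the
name cache, and d->mtime stands in for what a Getattr on the directory
would return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for the per-directory state kept in the nfsnode. */
struct dir_state {
	struct timespec mtime;	/* directory mtime from the latest Getattr */
	struct timespec nctime;	/* mtime when negative caching began; 0 = none */
	const char *neg_name;	/* the one cached negative name, or NULL */
};

bool
ts_isset(const struct timespec *ts)
{
	return (ts->tv_sec != 0 || ts->tv_nsec != 0);
}

bool
ts_equal(const struct timespec *a, const struct timespec *b)
{
	return (a->tv_sec == b->tv_sec && a->tv_nsec == b->tv_nsec);
}

/* Record that a Lookup for 'name' failed with ENOENT. */
void
neg_enter(struct dir_state *d, const char *name)
{
	assert(d != NULL && name != NULL);
	/* Remember the directory mtime only for the first negative entry. */
	if (!ts_isset(&d->nctime))
		d->nctime = d->mtime;
	d->neg_name = name;
}

/*
 * Return true if a cached negative entry for 'name' is still usable,
 * i.e. the directory has not changed since negative caching began.
 * On a stale hit, purge the entry and clear nctime so that the caller
 * falls through to a real Lookup.
 */
bool
neg_lookup(struct dir_state *d, const char *name)
{
	assert(d != NULL && name != NULL);
	if (d->neg_name == NULL || strcmp(d->neg_name, name) != 0)
		return (false);		/* not a negative cache hit at all */
	if (ts_equal(&d->mtime, &d->nctime))
		return (true);		/* hit and not stale: ENOENT, no RPC */
	d->neg_name = NULL;		/* stale: purge and force the RPC */
	d->nctime.tv_sec = 0;
	d->nctime.tv_nsec = 0;
	return (false);
}
```

A caller would refresh d->mtime before calling neg_lookup().  A true
result corresponds to returning ENOENT without a Lookup RPC, and a false
result to going on to do the RPC.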

Times and nfsstats are for the second run of "make depend; sync" and
"make; sync" after "make clean cleandepend; sync; sleep 1" in each run,
with RELENG_4 kernel sources, ~RELENG_5 userland and a -current+ kernel,
sources and obj and /usr on nfs, unloaded network latency 100 us, ...

Before:
        12.75 real         5.19 user         1.58 sys
  Lookup Read Write Access Getattr Other   Total
   14203  548   599  21561     454    97   37462
        78.80 real        62.01 user         4.45 sys
  Lookup Read Write Create Access Fsstat Other   Total
   19543 2410  5353    442  24241   1742    14   53745

After:
        10.68 real         5.20 user         1.42 sys
  Lookup Read Write Access Getattr Other   Total
    1268  548   599  21575     454   112   24556
        76.38 real        62.00 user         4.28 sys
  Lookup Read Write Create Access Fsstat Other   Total
    4122 2410  5353    442  24222   1750    14   38313

The number of Lookups has been reduced by a factor of 11+ for make depend
and just under 5 for make.
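
The factors follow directly from the Lookup columns in the Before and
After tables.  A trivial helper for redoing the arithmetic might look like
this (rpc_reduction() is a made-up name, used here only for illustration):

```c
#include <assert.h>

/* Ratio of an RPC count before a change to the count after it. */
double
rpc_reduction(unsigned long before, unsigned long after)
{
	assert(after != 0);	/* a zero "after" count has no finite ratio */
	return ((double)before / (double)after);
}
```

With the counts above, rpc_reduction(14203, 1268) is a little over 11 and
rpc_reduction(19543, 4122) is a little under 5.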

With lost dotdot caching, After:
        11.02 real         5.25 user         1.40 sys
  Lookup Read Write Access Getattr Other   Total
    3031  548   599  21574     453   112   26317
        84.19 real        62.20 user         4.71 sys
  Lookup Read Write Create Access Fsstat Other   Total
   45063 2410  5353    442  24290   1750    14   79322

This does another 40000+ Lookups, much the same as Before, but ending
up with 45000+ instead of 59000+.

Of course things are slower with cold caches, but they aren't much
slower (less than what losing dotdot caching costs).  According to
vfs.cache.numcache, there are fewer than 2000 files to look up, so
even 4122 Lookups is a lot.  (vfs.cache.numcache was 1482 and
vfs.cache.numneg was 103 after the run that produced the above
statistics.  This includes a few other files looked up since
rebooting a few minutes earlier.  I wasn't careful about starting
from scratch without dotdot caching.)

With cto consistency completely turned off, without lost dotdot caching,
After:
         7.54 real         5.17 user         1.13 sys
  Lookup Read Write Access Getattr Other   Total
    1284  548   599   2161     454   112    5158
        72.67 real        61.95 user         3.82 sys
  Lookup Read Write Create Access Fsstat Other   Total
    4799 2410  5353    442   1529   1750    14   16297

This reduces the Access count by a factor of about 10 for make depend
and almost 16 for make.

Now another pessimization is more obvious -- why should building kernels
be doing all those Fsstat's?  I haven't located them for sure, but I
think that opendir always calls statfs(2) for the silly purpose of
determining whether it needs to do extra work to support unionfs.

For comparison:

From now on, kernel source and obj are on a local file system.  nfs
is still on /usr, so there are a few RPCs for execing things.

Before (without lost dotdot caching, with cto consistency):
         6.39 real         5.04 user         1.08 sys
  Lookup Access Other   Total
     865    651     3    1519
        66.86 real        61.42 user         4.45 sys
  Lookup Access Other   Total
    3115   1350     1    4466

Before (without lost dotdot caching, and _without_ cto consistency):
         6.17 real         5.25 user         0.88 sys
  Other   Total
     86      86
        66.61 real        61.54 user         4.74 sys
  Other   Total
     94      94

The differences that the nfs changes make with just /usr on nfs are
measurable but tiny, at least in my configuration.  All my executables
are statically linked, else the extra opens for cto consistency of the
shared libraries would give many more than 1350 Access RPCs.

Bruce


