From owner-cvs-src@FreeBSD.ORG  Sun Oct 15 07:09:45 2006
Return-Path: <owner-cvs-src@FreeBSD.ORG>
X-Original-To: cvs-src@FreeBSD.org
Delivered-To: cvs-src@FreeBSD.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id AD85816A407;
	Sun, 15 Oct 2006 07:09:45 +0000 (UTC) (envelope-from bde@zeta.org.au)
Received: from mailout1.pacific.net.au (mailout1-3.pacific.net.au [61.8.2.210])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 117DC43D68;
	Sun, 15 Oct 2006 07:09:44 +0000 (GMT) (envelope-from bde@zeta.org.au)
Received: from mailproxy2.pacific.net.au (mailproxy2.pacific.net.au
	[61.8.2.163])
	by mailout1.pacific.net.au (Postfix) with ESMTP id AA17E5A7E24;
	Sun, 15 Oct 2006 17:09:42 +1000 (EST)
Received: from katana.zip.com.au (katana.zip.com.au [61.8.7.246])
	by mailproxy2.pacific.net.au (Postfix) with ESMTP id E997C27411;
	Sun, 15 Oct 2006 17:09:40 +1000 (EST)
Date: Sun, 15 Oct 2006 17:09:34 +1000 (EST)
From: Bruce Evans <bde@zeta.org.au>
X-X-Sender: bde@delplex.bde.org
To: mjacob@FreeBSD.org
In-Reply-To: <20061014222437.N4701@ns1.feral.com>
Message-ID: <20061015153454.G59979@delplex.bde.org>
References: <200610140725.k9E7PC37008454@repoman.freebsd.org>
	<20061014231502.GA38708@rink.nu>
	<20061015105809.M59123@delplex.bde.org>
	<20061015051044.GA42764@xor.obsecurity.org>
	<20061014222221.H97880@ns1.feral.com>
	<20061014222437.N4701@ns1.feral.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: cvs-src@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org,
	Kris Kennaway <kris@obsecurity.org>
Subject: Re: cvs commit: src/sys/nfsclient nfs_vnops.c
X-BeenThere: cvs-src@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: CVS commit messages for the src tree <cvs-src.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/cvs-src>,
	<mailto:cvs-src-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/cvs-src>
List-Post: <mailto:cvs-src@freebsd.org>
List-Help: <mailto:cvs-src-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/cvs-src>,
	<mailto:cvs-src-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 15 Oct 2006 07:09:45 -0000

On Sat, 14 Oct 2006 mjacob@FreeBSD.org wrote:

> On Sat, 14 Oct 2006, mjacob@FreeBSD.org wrote:
>
>> So, inquiring minds want to know why defaults/rc.conf still sets this to 2 
>> then.
>
> *smack*- I meant in RE_6.

Seems to be just because no one got around to it in RELENG_6.

Recovering some context:

On a date too hard for me to recover, Kris wrote:

>> On Sun, Oct 15, 2006 at 12:37:28PM +1000, Bruce Evans wrote:
>> 
>> > ISTR a discussion of fixing this in rc.conf, but nothing seems to have been
>> > committed.  The setting is confusing in the kernel too:
>> 
>> ----------------------------
>> revision 1.285
>> date: 2006/05/24 00:06:14;  author: kris;  state: Exp;  lines: +1 -1
>> Increase the nfs access cache timeout from 2 to 60.  The latter is a
>> more appropriate value and is also the default set by the kernel.  I
>> could not find a justification of why rc.conf began overriding it back
>> in 1998.
>> 
>> This dramatically cuts NFS traffic on e.g. a busy system with NFS root.
>> 
>> Reviewed by:    mohans
>> MFC After:      2 weeks
>> ----------------------------

Thanks.  I must have been confused about which machine I was on when I
grepped for this.  In the FreeBSD cluster, the change is on a -current
machine but not on a 6.1 machine.

My previous mail more or less explained why rc.conf began setting it to
2 in 1998:  The access cache timeout didn't exist before then, so it was
initially set to a conservative default of 2.  Only the mount options
for the _attribute_ cache existed before then.  rc.conf and fstab never
had any special support for these, so I think rc.conf shouldn't have any
special support for the _access_ cache timeout (rc.conf now defaults to
setting the timeout to its kernel default value).

I just noticed even sillier configuration bogusness for the default
attribute cache timeouts:
- in 1994, the default timeouts were ifdefed.  I think there was no
   other (easy) way to change the timeouts.
- in May 1998, mount_nfs started supporting setting the timeouts per-mount.
- in June 1998, 6 weeks after the 1994 hack became unnecessary, the
   timeouts were turned into first class kernel options (put in an
   options header).

I just got around to looking at some nfs RFCs.  nfs4 has a lot to say
about the problem of too many RPCs.  nfs3 has less to say.

I think I now know how to fix the second largest source of extra RPCs
properly:

We used to clear the _attribute_ cache (and the access cache as a side
effect?) at the _end_ of nfs_open(), but we are supposed to clear it
at the _start_ of every open().  We can't do the latter properly because
a fresh set of attributes is needed or at least preferred when
nfs_lookup() is called before nfs_open().  Clearing the attribute cache
at the end of nfs_open() seems to be just a hack which gives a good
chance of the clearing living until the next open().  It doesn't always
work.  Clearing the cache in nfs_close() is further from always working.
It fails if the file is re-open()ed before the first open() instance
is close()d, and it wasn't done anyway.

Now we clear the attribute cache at the start of nfs_open() and clear
it in nfs_close().  For simple open-close sequences, this gives 2
clearings and 2 refreshes.  First, nfs_lookup() refreshes; then
nfs_open() clears and refreshes; finally, nfs_close() clears.
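
The counting above can be checked with a toy model.  This is only a
sketch: struct toy_node and the toy_*() functions are made up for
illustration and are not the real nfsnode or vnode operations.  It
assumes that the cache starts out clear (e.g., from a previous close)
and that every refresh costs one Getattr RPC.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for one NFS node's attribute cache (hypothetical). */
struct toy_node {
	bool attrs_valid;	/* true while cached attributes may be used */
	int clears;		/* forced invalidations of the cache */
	int refreshes;		/* simulated Getattr RPCs */
};

/* Force the cached attributes to be refetched on next use. */
static void
toy_clear(struct toy_node *np)
{
	np->attrs_valid = false;
	np->clears++;
}

/* Refetch attributes only if the cache is currently clear. */
static void
toy_refresh_if_needed(struct toy_node *np)
{
	if (!np->attrs_valid) {
		np->attrs_valid = true;
		np->refreshes++;
	}
}

/* Current behaviour as described: lookup only refreshes. */
static void
toy_lookup(struct toy_node *np)
{
	toy_refresh_if_needed(np);
}

/* Current behaviour as described: open clears at its start, then refreshes. */
static void
toy_open(struct toy_node *np)
{
	toy_clear(np);
	toy_refresh_if_needed(np);
}

/* Current behaviour as described: close clears. */
static void
toy_close(struct toy_node *np)
{
	toy_clear(np);
}

/* Run one lookup/open/close sequence and return the resulting counters. */
struct toy_node
toy_open_close_once(void)
{
	struct toy_node n = { false, 0, 0 };

	toy_lookup(&n);		/* refresh #1 */
	toy_open(&n);		/* clear #1, refresh #2 */
	toy_close(&n);		/* clear #2 */
	assert(!n.attrs_valid);	/* the next lookup will have to refresh again */
	return (n);
}
```

Calling toy_open_close_once() and reading the clears and refreshes
fields of the result gives the 2 clearings and 2 refreshes counted
above.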

I think the correct refreshing is:
- force a clear in nfs_lookup() only if we are sure that the lookup is for
   open().  Then refresh normally (if we just cleared or the cache was
   already clear).  Implement this using a namei() flag?
- force a clear in nfs_open() only if we aren't sure that nfs_lookup()
   already did it.  This is for safety in cases like core dumps
   where namei() isn't called by open(2) and we forget to tell namei()
   that the lookup is for open().  Then refresh normally.  Implement this
   using a generation count or another flag?
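
A rough sketch of the flag variant, again as a toy model and not as
kernel code.  SK_LOOKUP_FOR_OPEN, struct sk_node and the sk_*()
functions are hypothetical names; no such namei() flag is assumed to
exist.  The per-node boolean could equally be a generation count that
nfs_open() compares.  What close() should do is left out of the model.

```c
#include <assert.h>
#include <stdbool.h>

#define	SK_LOOKUP_FOR_OPEN	0x1	/* hypothetical "lookup is for open()" flag */

/* Toy stand-in for one NFS node's attribute cache (hypothetical). */
struct sk_node {
	bool attrs_valid;	/* true while cached attributes may be used */
	bool cleared_for_open;	/* set by lookup, consumed by open */
	int clears;		/* forced invalidations of the cache */
	int refreshes;		/* simulated Getattr RPCs */
};

/* Force the cached attributes to be refetched on next use. */
static void
sk_clear(struct sk_node *np)
{
	np->attrs_valid = false;
	np->clears++;
}

/* Refetch attributes only if the cache is currently clear. */
static void
sk_refresh_if_needed(struct sk_node *np)
{
	if (!np->attrs_valid) {
		np->attrs_valid = true;
		np->refreshes++;
	}
}

/* Clear only when the caller says the lookup is for open(), then refresh. */
void
sk_lookup(struct sk_node *np, int flags)
{
	if (flags & SK_LOOKUP_FOR_OPEN) {
		sk_clear(np);
		np->cleared_for_open = true;
	}
	sk_refresh_if_needed(np);
}

/* Clear only if lookup did not already do it for this open, then refresh. */
void
sk_open(struct sk_node *np)
{
	if (!np->cleared_for_open)
		sk_clear(np);		/* safety path, e.g. core dumps */
	np->cleared_for_open = false;	/* the mark is good for one open only */
	sk_refresh_if_needed(np);
	assert(np->attrs_valid);	/* open always ends with fresh attributes */
}
```

In this model, a lookup with the flag followed by an open costs 1 clear
and 1 refresh, and so does a lookup without the flag followed by an
open on a node whose cache was valid.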

Bruce