From owner-freebsd-fs@FreeBSD.ORG Fri May 21 15:51:10 2010
From: John Baldwin <jhb@freebsd.org>
To: Rick Macklem
Cc: Rick Macklem, Robert Watson, fs@freebsd.org
Date: Fri, 21 May 2010 10:39:04 -0400
Subject: Re: [PATCH] Better handling of stale filehandles in open() in the NFS client
Message-Id: <201005211039.04108.jhb@freebsd.org>

On Thursday 20 May 2010 8:45:13 pm Rick Macklem
wrote:
> On Thu, 20 May 2010, John Baldwin wrote:
>
> > It doesn't change the RPC count because of changes that Mohan added to
> > the NFS client a while ago so that nfs_open() doesn't invalidate the
> > attribute cache if it was already updated via nfs_lookup() during the
> > same system call. With Mohan's changes in place, all this change does
> > is move the GETATTR/ACCESS RPC earlier in the case of a namecache hit.
>
> Well, it sounds like a good theory. Something like:
> - VOP_LOOKUP() locks the vnode, which is then passed to VOP_OPEN() and
>   since it is locked, other threads can't perform VOPs on the vp.

No, it's not the lock, it's this thing Mohan added here in nfs_open():

	struct thread *td = curthread;

	if (np->n_ac_ts_syscalls != td->td_syscalls ||
	    np->n_ac_ts_tid != td->td_tid ||
	    td->td_proc == NULL ||
	    np->n_ac_ts_pid != td->td_proc->p_pid) {
		np->n_attrstamp = 0;
		KDTRACE_NFS_ATTRCACHE_FLUSH_DONE(vp);
	}

This used to be an unconditional clearing of n_attrstamp, so that the
VOP_GETATTR() in nfs_open() would always go over the wire. Now it does
not clear the cached attributes if the attribute cache was last updated
by the same thread during the same system call that is invoking
nfs_open().

Hmm, however, concurrent lookups of the same pathname may cause this
test to fail and still clear n_attrstamp, since we now use shared vnode
locks for pathname lookups. That was true before my change as well. In
fact, using shared vnode locks for read-only opens (the
MNTK_EXTENDED_SHARED flag) probably does cause Mohan's optimization to
fail in many cases, which probably accounts for the small increase in
RPC counts. Try setting the vfs.lookup_shared sysctl to 0 to see if
that removes all the extra RPCs. Another test would be to re-add the
change that flushes the attribute cache on close and see if that also
brings back the extra attribute/access RPCs without my patch (but with
lookup_shared left enabled), to verify that shared lookups are the root
cause of the extra RPCs.
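The lookup_shared experiment suggested above might look something like
the following (a sketch, not from the original mail; it assumes the
kernel-build workload from Rick's test and root privileges for changing
the sysctl and zeroing the counters):

```shell
# Disable shared vnode locks for pathname lookups (re-enable with =1).
sysctl vfs.lookup_shared=0

# Zero the NFS statistics, run the workload, then dump the client-side
# per-RPC counts to compare Getattr/Access against the earlier runs.
nfsstat -z
make cleandepend && make depend && make
nfsstat -c
```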
> I ran a single pass of a kernel "make cleandepend; make depend; make"
> here (one without the patch and one with the patch). Now, it could be
> random variation, but since the other RPC counts changed by < 1%, I
> suspect not. (I'll run another pass of each to see how much variation
> I see w.r.t. Getattr.)
>
> Here are the counts for the 5 RPCs I think might be interesting:
>
> RPC       Count without patch   Count with patch
> Getattr   590936                625987  (+5.9%)
> Lookup    157194                157528
> Access    59040                 59690   (+1.1%)
> Read      70585                 70586
> Write     112531                112530
>
> I'll let you know what another pass of each gives, but it looks like
> the patch has caused an increase in RPC counts. I don't know if the
> increase is enough to deter adding the patch, but it might be worth
> exploring more?
>
> rick

-- 
John Baldwin
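For reference, the percentage deltas quoted in Rick's table can be
reproduced from the raw counts with a one-liner (a quick arithmetic
check only; the counts themselves are taken from the table above):

```shell
# Percent change in the two RPC counts that moved:
# (with_patch - without_patch) / without_patch * 100.
awk 'BEGIN {
	printf "Getattr: %+.1f%%\n", (625987 - 590936) / 590936 * 100
	printf "Access:  %+.1f%%\n", (59690 - 59040) / 59040 * 100
}'
```

This prints +5.9% for Getattr and +1.1% for Access, matching the
figures in the table.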