Date: Mon, 24 May 2010 12:06:16 -0400 (EDT)
From: Rick Macklem
To: John Baldwin
Cc: Rick Macklem, Robert Watson, fs@freebsd.org
Subject: Re: [PATCH] Better handling of stale filehandles in open() in the NFS client
In-Reply-To: <201005211039.04108.jhb@freebsd.org>
References: <201005191144.00382.jhb@freebsd.org> <201005200922.17245.jhb@freebsd.org> <201005211039.04108.jhb@freebsd.org>

On Fri, 21 May 2010, John Baldwin wrote:

> No, it's not the lock, it's this thing Mohan added here in nfs_open():
>
> 	struct thread *td = curthread;
>
> 	if (np->n_ac_ts_syscalls != td->td_syscalls ||
> 	    np->n_ac_ts_tid != td->td_tid ||
> 	    td->td_proc == NULL ||
> 	    np->n_ac_ts_pid != td->td_proc->p_pid) {
> 		np->n_attrstamp = 0;
> 		KDTRACE_NFS_ATTRCACHE_FLUSH_DONE(vp);
> 	}
>
> This used to be an unconditional clearing of n_attrstamp, so that the
> VOP_GETATTR() in nfs_open() would always go over the wire.  Now it doesn't
> clear the attributes if the attribute cache was last updated by the same
> thread during the same system call that is invoking nfs_open().
>
> Hmm, however, concurrent lookups of the same pathname may cause this test to
> fail and still clear n_attrstamp, since we now use shared vnode locks for
> pathname lookups.  That was true before my change as well.  In fact, using
> shared vnode locks for read-only opens (the MNTK_EXTENDED_SHARED flag)
> probably does cause Mohan's patch to not work in many cases, which probably
> accounts for the small increase in RPC counts.
>
> Try setting the vfs.lookup_shared sysctl to 0 to see if that removes all the
> extra RPCs.  Another change would be to readd the change to flush the
> attribute cache on close and see if that also brings back extra
> attribute/access RPCs without my patch (but with lookup_shared left enabled)
> to verify that shared lookups are the root cause of extra RPCs.
>
Ok, I've played with this a bit and here's what I've seen:

- Disabling vfs.lookup_shared had no effect. Since I'm running a single
  thread (no "-j" on the make) and nothing else on the machine during the
  tests, that seems to make sense.

- When I added some printfs, here's what seems to be happening.

  Snippet of nfs_lookup():

	if ((cnp->cn_flags & (ISLASTCN | ISOPEN)) == (ISLASTCN | ISOPEN)) {
		newnp = VTONFS(newvp);
		mtx_lock(&newnp->n_mtx);
		newnp->n_attrstamp = 0;
		mtx_unlock(&newnp->n_mtx);
	}
	if (!VOP_GETATTR(newvp, &vattr, cnp->cn_cred) &&
	    vattr.va_ctime.tv_sec == newnp->n_ctime) {
		nfsstats.lookupcache_hits++;
		if (cnp->cn_nameiop != LOOKUP && (flags & ISLASTCN))
			cnp->cn_flags |= SAVENAME;
		return (0);
	}

  --> gets here about 9,000 times.
  --> As far as I can see, these Getattrs are part of the price you pay for
      your patch (cache misses that aren't caused by VOP_GETATTR() failing).

  Snippet of nfs_open():

	if (np->n_flag & NMODIFIED) {
		...
  --> gets here about 6,000 times. The extra Getattrs for this case can be
      avoided by adding
		if ((newnp->n_flag & NMODIFIED) == 0)
      as a condition for setting newnp->n_attrstamp = 0; in nfs_lookup()
      above.
	} else {
		struct thread *td = curthread;

		if (np->n_ac_ts_syscalls != td->td_syscalls ||
		    np->n_ac_ts_tid != td->td_tid ||
		    td->td_proc == NULL ||
		    np->n_ac_ts_pid != td->td_proc->p_pid) {
			np->n_attrstamp = 0;
			KDTRACE_NFS_ATTRCACHE_FLUSH_DONE(vp);
  --> never gets here, which makes sense since your patch forces the Getattr
      to be done in nfs_lookup(). I wouldn't say that this can "never" happen
      with the patch applied, because of concurrency with shared vnode locks,
      but it isn't happening for this test.
		}
		mtx_unlock(&np->n_mtx);
		error = VOP_GETATTR(vp, &vattr, ap->a_cred);

So, I can see where about 15,000 of the extra Getattrs occur, but I haven't
nailed down the other 20,000. I do see a lot of cases where the namei() call
in vn_open_cred() at line #189 of vfs_vnops.c returns an error, but I haven't
figured out whether those account for the other 20,000 (i.e. are there cases
where namei()/nfs_lookup() fails after having forced the Getattr?). Since
namei() fails in those cases, VOP_OPEN() doesn't get called.

In summary, I haven't nailed down where 20,000 of the extra Getattr RPCs come
from, but if it still achieves what you want, adding the

	if ((newnp->n_flag & NMODIFIED) == 0)

check before newnp->n_attrstamp = 0; gets rid of some of them.

I'm going to run some tests with Readdirplus enabled, to see what effect the
patch has for that case, rick.
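
P.S. To be concrete about where that check would sit, the nfs_lookup()
snippet above would end up looking roughly like this (untested, just a
sketch to show the placement of the guard; everything else is exactly as
quoted above):

	if ((cnp->cn_flags & (ISLASTCN | ISOPEN)) == (ISLASTCN | ISOPEN)) {
		newnp = VTONFS(newvp);
		mtx_lock(&newnp->n_mtx);
		/*
		 * Only force the over-the-wire Getattr for nodes without
		 * modified data. When NMODIFIED is set, nfs_open() takes
		 * its own flush path anyway, so clearing n_attrstamp here
		 * would just cost an extra Getattr RPC.
		 */
		if ((newnp->n_flag & NMODIFIED) == 0)
			newnp->n_attrstamp = 0;
		mtx_unlock(&newnp->n_mtx);
	}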