Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 24 May 2010 12:06:16 -0400 (EDT)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        John Baldwin <jhb@freebsd.org>
Cc:        Rick Macklem <rmacklem@freebsd.org>, Robert Watson <rwatson@freebsd.org>, fs@freebsd.org
Subject:   Re: [PATCH] Better handling of stale filehandles in open() in the NFS client
Message-ID:  <Pine.GSO.4.63.1005241143160.29428@muncher.cs.uoguelph.ca>
In-Reply-To: <201005211039.04108.jhb@freebsd.org>
References:  <201005191144.00382.jhb@freebsd.org> <201005200922.17245.jhb@freebsd.org> <Pine.GSO.4.63.1005202036570.15248@muncher.cs.uoguelph.ca> <201005211039.04108.jhb@freebsd.org>

index | next in thread | previous in thread | raw e-mail



On Fri, 21 May 2010, John Baldwin wrote:


> No, it's not the lock, it's this thing Mohan added here in nfs_open():
>
> 		struct thread *td = curthread;
>
> 		if (np->n_ac_ts_syscalls != td->td_syscalls ||
> 		    np->n_ac_ts_tid != td->td_tid ||
> 		    td->td_proc == NULL ||
> 		    np->n_ac_ts_pid != td->td_proc->p_pid) {
> 			np->n_attrstamp = 0;
> 			KDTRACE_NFS_ATTRCACHE_FLUSH_DONE(vp);
> 		}
>
> Which used to be an unconditional clearing of n_attrstamp so that the
> VOP_GETATTR() in nfs_open() would always go over the wire.  Now it doesn't
> clear the attributes if the attribute cache was last updated during the same
> system call that is invoking nfs_open() by the same thread.
>
> Hmm, however, concurrent lookups of the same pathname may cause this test to
> fail and still clear n_attrstamp since we now use shared vnode locks for
> pathname lookups.  That was true before my change as well.  In fact, using
> shared vnode locks for read-only opens (the MNTK_EXTENDED_SHARED flag)
> probably does cause Mohan's patch to not work in many cases which probably
> accounts for the small increase in RPC counts.
>
> Try setting the vfs.lookup_shared sysctl to 0 to see if that removes all the
> extra RPCs.  Another change would be to readd the change to flush the
> attribute cache on close and see if that also brings back extra
> attribute/access RPCs without my patch (but with lookup_shared left enabled)
> to verify that shared lookups are the root cause of extra RPCs.
>
Ok, I've played with this a bit and here's what I've seen:
- Disabling vfs.lookup_shared had no effect. Since I'm running a single
   thread (no "-j" on the make) ad nothing else on the machine during the
   tests, that seems to make sense.

- When I added some printfs, here's what seems to be happening.
   snippet of nfs_lookup():
 		if ((cnp->cn_flags & (ISLASTCN | ISOPEN)) ==
 		    (ISLASTCN | ISOPEN)) {
 			newnp = VTONFS(newvp);
 			mtx_lock(&newnp->n_mtx);
 			newnp->n_attrstamp = 0;
 			mtx_unlock(&newnp->n_mtx);
 		}
 		if (!VOP_GETATTR(newvp, &vattr, cnp->cn_cred)
 		    && vattr.va_ctime.tv_sec == newnp->n_ctime) {
 			nfsstats.lookupcache_hits++;
 			if (cnp->cn_nameiop != LOOKUP &&
 			    (flags & ISLASTCN))
 				cnp->cn_flags |= SAVENAME;
 			return (0);
 		}
--> gets here about 9,000 times
     --> As far as I can see, these getattrs are part of the price
         you pay for your patch. (Cache misses that aren't caused by
         VOP_GETATTR() failing.)

snippet of nfs_open():
 	if (np->n_flag & NMODIFIED) {
 		...
--> gets here about 6,000 times. The extra Getattrs for this case can
         be avoied by adding if ((newnp->n_flag & NMODIFIED) == 0)
 	for setting newnp->n_attrstamp = 0; above.
 	} else {
 		struct thread *td = curthread;

 		if (np->n_ac_ts_syscalls != td->td_syscalls ||
 		    np->n_ac_ts_tid != td->td_tid ||
 		    td->td_proc == NULL ||
 		    np->n_ac_ts_pid != td->td_proc->p_pid) {
 			np->n_attrstamp = 0;
 			KDTRACE_NFS_ATTRCACHE_FLUSH_DONE(vp);
--> never gets here, which makes sense since your patch forces the Getattr
     to be done in nfs_lookup(). I wouldn't say that this can "never"
     happen with the patch applied, because of concurrency with shared
     vnode locks, but it isn't happening for this test.
 		}
 		mtx_unlock(&np->n_mtx);
 		error = VOP_GETATTR(vp, &vattr, ap->a_cred);

So, I can see where about 15,000 of the extra Getattrs occur, but haven't
nailed down the other 20,000. I do see a lot of cases where the namei()
call in vn_open_cred() at line#189 of vfs_vnops.c returns an error. I
haven't figured out if those account for the other 20,000. (ie. are there
cases where namei()/nfs_lookup() fails after having forced the Getattr)
Since namei() fails, VOP_OPEN() doesn't get called.

In summary, I haven't nailed down where 20,000 of the extra Getattr RPCs
happen, but if it still achieves what you want, adding the
 	if ((newnp->n_flag & NMODIFIED) == 0)
before 		newnp->n_attrstamp = 0;
gets rid of some of them.

I'm going to run some tests with Readdirplus enabled, to see what effect
the patch has for that case, rick.

























home | help

Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?Pine.GSO.4.63.1005241143160.29428>