Date: Thu, 05 Oct 2006 17:28:26 -0600
From: Scott Long <scottl@samsco.org>
To: Bruce Evans <bde@zeta.org.au>
Cc: fs@freebsd.org
Subject: Re: lost dotdot caching pessimizes nfs especially
Message-ID: <4525951A.1020901@samsco.org>
In-Reply-To: <20061006050913.Y5250@epsplex.bde.org>
References: <20061006050913.Y5250@epsplex.bde.org>

Bruce Evans wrote:
> This change:
>
> % Index: vfs_cache.c
> % ===================================================================
> % RCS file: /home/ncvs/src/sys/kern/vfs_cache.c,v
> % retrieving revision 1.102
> % retrieving revision 1.103
> % diff -u -2 -r1.102 -r1.103
> % --- vfs_cache.c	13 Jun 2005 05:59:59 -0000	1.102
> % +++ vfs_cache.c	17 Jun 2005 01:05:13 -0000	1.103
> % @@ -494,6 +494,16 @@
> %  		return;
> %  	}
> % +	/*
> % +	 * For dotdot lookups only cache the v_dd pointer if the
> % +	 * directory has a link back to its parent via v_cache_dst.
> % +	 * Without this an unlinked directory would keep a soft
> % +	 * reference to its parent which could not be NULLd at
> % +	 * cache_purge() time.
> % +	 */
> %  	if (cnp->cn_namelen == 2 && cnp->cn_nameptr[1] == '.') {
> % -		dvp->v_dd = vp;
> % +		CACHE_LOCK();
> % +		if (!TAILQ_EMPTY(&dvp->v_cache_dst))
> % +			dvp->v_dd = vp;
> % +		CACHE_UNLOCK();
> %  		return;
> %  	}
>
> is responsible for about half of the performance loss since RELENG_4
> for building kernels over nfs (/usr and sys trees on nfs).  The kernel
> build uses "../../" a lot, and the above change apparently results in
> lots of network activity for things that should be cached locally.
>
> Some times for building a RELENG_4 kernel under conditions invariant
> except for the host kernel (after "make clean; sleep 2; make depend;
> make; make clean; sleep 2; make depend" to warm up caches):
>
> kernel:
> RELENG_4              77.51 real  60.62 user  4.36 sys
> current.2004.07.01    ~78.5 (lost details)
> current.2005.01.01    ~79   (lost details)
> current.2005.06.17    82.42 real  62.50 user  4.71 sys
> current.2005.06.19    89.53 real  62.18 user  5.44 sys
> current.2005.06.17+   ~89.5 (lost details)
>     .17+  = .17 plus above change
> current.2005.06.17+*  86.08 real  62.43 user  5.13 sys
>     .17+* = .17+ with ../.. in Makefile avoided using a symlink
>             @ -> <path to sys not using ..>
> RELENG_6              91.14 real  62.04 user  5.71 sys
> current               similar to RELENG_6 (lost details)
>
> The total performance loss is about 18%.
>
> The total performance loss for a local sys tree (/usr still on nfs) is much
> smaller (about 4%):
>
> RELENG_4              65.19 real  60.50 user  3.95 sys
> current.2005.06.17    67.49 real  62.13 user  4.27 sys
> RELENG_6              67.83 real  61.84 user  4.71 sys
> current               similar to RELENG_6 (lost details)
>
> The nfs performance for building of things that should be entirely
> cached locally is very dependent on network latency.  Not caching
> things very well causes lots of unnecessary network traffic for Getattr
> and Lookup.  The packets are small, so throughput is unimportant and
> latency dominates.  For building over nfs without -j, the dead time
> (real - user - sys) is almost directly proportional to the latency.
> My usual local network has fairly low latency (~100uS unloaded) and
> the ~14 seconds dead time in the above is for it.  Switching to a 1
> Gbps network with lower quality NICs gives an unloaded latency of ~160uS
> and a dead time of ~21 seconds.  Building with -j helps even for UP,
> at the cost of extra CPU, by letting some processes advance using cached
> stuff while others are waiting for the network.  Building with -j helps
> even more on FreeBSD cluster machines, more because they have a much
> higher network latency than because they are SMP.
>
> Bruce

I was starting to look at this a while ago, but had to move onto other
things.  Do you have any suggestions for a fix?

Scott
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4525951A.1020901>