Date:      Wed, 3 Oct 2012 00:03:15 -0400
From:      Garrett Wollman <wollman@bimajority.org>
To:        Rick Macklem <rmacklem@uoguelph.ca>
Cc:        freebsd-fs@freebsd.org, rmacklem@freebsd.org, hackers@freebsd.org
Subject:   Re: NFS server bottlenecks
Message-ID:  <20587.47363.504969.926603@hergotha.csail.mit.edu>
In-Reply-To: <499414315.1544891.1349180909058.JavaMail.root@erie.cs.uoguelph.ca>
References:  <20586.27582.478147.838896@hergotha.csail.mit.edu> <499414315.1544891.1349180909058.JavaMail.root@erie.cs.uoguelph.ca>

[Adding freebsd-fs@ to the Cc list, which I neglected the first time
around...]

<<On Tue, 2 Oct 2012 08:28:29 -0400 (EDT), Rick Macklem <rmacklem@uoguelph.ca> said:

> I can't remember (I am early retired now;-) if I mentioned this patch before:
>   http://people.freebsd.org/~rmacklem/drc.patch
> It adds tunables vfs.nfsd.tcphighwater and vfs.nfsd.udphighwater that can
> be twiddled so that the drc is trimmed less frequently. By making these
> values larger, the trim will only happen once/sec until the high water
> mark is reached, instead of on every RPC. The tradeoff is that the DRC will
> become larger, but given memory sizes these days, that may be fine for you.

It will be a while before I have another server that isn't in
production (it's on my deployment plan, but getting the production
servers going is taking first priority).

The approaches that I was going to look at:

Simplest: only do the cache trim once every N requests (for some
reasonable value of N, e.g., 1000).  Maybe keep track of the number of
entries in each hash bucket and ignore those buckets that only have
one entry even if it is stale.
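
A minimal userland sketch of the every-N-requests idea, using C11
atomics rather than anything from the actual nfsd code.  The names
DRC_TRIM_INTERVAL and drc_should_trim() are hypothetical, invented for
illustration; the real server would call its existing trim routine
where the caller sees "true".

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical tunable: trim once per this many requests (the "N" above). */
#define DRC_TRIM_INTERVAL 1000u

/* Requests seen so far; atomic so no lock is needed just to count. */
static atomic_uint drc_request_count;

/*
 * Call once per RPC.  Returns true on every DRC_TRIM_INTERVAL-th call,
 * which is when the caller would take the cache lock and trim.
 */
static bool
drc_should_trim(void)
{
	unsigned int n = atomic_fetch_add(&drc_request_count, 1) + 1;

	return (n % DRC_TRIM_INTERVAL == 0);
}
```

The point of the counter is that the common path touches one atomic
word and never takes the contended cache mutex.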

Simple: just use a separate mutex for each list that a cache entry
is on, rather than a global lock for everything.  This would reduce
the mutex contention, but I'm not sure how significantly since I
don't have the means to measure it yet.
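
A rough sketch of what per-list locking could look like, written
against POSIX threads in userland rather than the kernel mutex API.
Everything here (struct drc_bucket, DRC_HASH_SIZE, hashing on the xid
alone) is an assumption made for illustration and is not taken from
the nfsd source.  It covers only the hash chains; any other list an
entry sits on, such as an LRU list, would need its own lock and a
consistent lock order.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DRC_HASH_SIZE 64	/* hypothetical table size */

struct drc_entry {
	uint32_t xid;			/* RPC transaction id */
	struct drc_entry *next;
};

/* One lock per hash chain instead of one lock for the whole cache. */
struct drc_bucket {
	pthread_mutex_t lock;
	struct drc_entry *head;
	unsigned int count;		/* entries on this chain */
};

static struct drc_bucket drc_table[DRC_HASH_SIZE];

static void
drc_init(void)
{
	for (size_t i = 0; i < DRC_HASH_SIZE; i++) {
		pthread_mutex_init(&drc_table[i].lock, NULL);
		drc_table[i].head = NULL;
		drc_table[i].count = 0;
	}
}

static struct drc_bucket *
drc_bucket_for(uint32_t xid)
{
	return (&drc_table[xid % DRC_HASH_SIZE]);
}

/* Insert at the head of the entry's chain, holding only that chain's lock. */
static void
drc_insert(struct drc_entry *e)
{
	struct drc_bucket *b = drc_bucket_for(e->xid);

	pthread_mutex_lock(&b->lock);
	e->next = b->head;
	b->head = e;
	b->count++;
	pthread_mutex_unlock(&b->lock);
}

/* Report whether an xid is cached; other chains stay unlocked. */
static bool
drc_contains(uint32_t xid)
{
	struct drc_bucket *b = drc_bucket_for(xid);
	bool found = false;

	pthread_mutex_lock(&b->lock);
	for (struct drc_entry *e = b->head; e != NULL; e = e->next) {
		if (e->xid == xid) {
			found = true;
			break;
		}
	}
	pthread_mutex_unlock(&b->lock);
	return (found);
}
```

The per-bucket count is also what the "ignore buckets with only one
entry" idea would consult before bothering to trim a chain.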

Moderately complicated: figure out if a different synchronization type
can safely be used (e.g., rmlock instead of mutex) and do so.

More complicated: move all cache trimming to a separate thread and
just have the rest of the code wake it up when the cache is getting
too big (or just once a second since that's easy to implement).  Maybe
just move all cache processing to a separate thread.
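
The separate-thread approach might be sketched as follows, again in
userland with POSIX threads and a condition variable standing in for
the kernel's sleep/wakeup primitives.  struct drc_cache, its fields,
and the function names are hypothetical, and "trimming" is reduced to
resetting a counter so that only the wakeup logic is shown.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct drc_cache {
	pthread_mutex_t lock;
	pthread_cond_t wake;		/* signalled when a trim is needed */
	unsigned int entries;		/* current cache size */
	unsigned int highwater;		/* trim when entries exceeds this */
	unsigned int trims;		/* number of trim passes run */
	bool shutdown;
};

static void
drc_cache_init(struct drc_cache *c, unsigned int highwater)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->wake, NULL);
	c->entries = 0;
	c->highwater = highwater;
	c->trims = 0;
	c->shutdown = false;
}

/* Trimmer thread: sleeps until the cache is too big or shutdown is set. */
static void *
drc_trimmer(void *arg)
{
	struct drc_cache *c = arg;

	pthread_mutex_lock(&c->lock);
	for (;;) {
		while (!c->shutdown && c->entries <= c->highwater)
			pthread_cond_wait(&c->wake, &c->lock);
		if (c->entries > c->highwater) {
			c->entries = c->highwater;	/* stand-in for a real trim */
			c->trims++;
		}
		if (c->shutdown)
			break;
	}
	pthread_mutex_unlock(&c->lock);
	return (NULL);
}

/* Request path: add an entry and wake the trimmer only when needed. */
static void
drc_add(struct drc_cache *c)
{
	pthread_mutex_lock(&c->lock);
	c->entries++;
	if (c->entries > c->highwater)
		pthread_cond_signal(&c->wake);
	pthread_mutex_unlock(&c->lock);
}

/* Ask the trimmer to exit and wait for it. */
static void
drc_stop(struct drc_cache *c, pthread_t trimmer)
{
	pthread_mutex_lock(&c->lock);
	c->shutdown = true;
	pthread_cond_signal(&c->wake);
	pthread_mutex_unlock(&c->lock);
	pthread_join(trimmer, NULL);
}
```

The once-a-second variant would replace pthread_cond_wait() with
pthread_cond_timedwait() and drop the signal from the request path.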

It's pretty clear from the profile that the cache mutex is heavily
contended, so anything that reduces the length of time it's held is
probably a win.

That URL again, for the benefit of people on freebsd-fs who didn't see
it on hackers, is:

>> <http://people.csail.mit.edu/wollman/nfs-server.unhalted-core-cycles.png>.

(This graph is slightly modified from my previous post as I removed
some spurious edges to make the formatting look better.  Still looking
for a way to get a profile that includes all kernel modules with the
kernel.)

-GAWollman


