From owner-freebsd-fs@FreeBSD.ORG Wed Aug 28 23:48:37 2013
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Garrett Wollman
Cc: freebsd-fs@freebsd.org
Date: Wed, 28 Aug 2013 19:48:29 -0400 (EDT)
Subject: Re: NFS on ZFS pure SSD pool
Message-ID: <461209820.15034260.1377733709648.JavaMail.root@uoguelph.ca>
In-Reply-To: <21022.29128.557471.157078@hergotha.csail.mit.edu>

Garrett Wollman wrote:
> <rmacklem@uoguelph.ca> said:
>
> > Eric Browning wrote:
> >> Sam and I applied the patch (kernel now at r254983M) and set
> >> vfs.nfsd.tcphighwater=5000 in sysctl.conf and my CPU is still
> >> slammed. Should I up it to 10000?
> >>
> > You can try. I have no insight into where this goes, since I can't
> > produce the kind of server/load where it makes any difference. (I have
> > a single core i386 (P4 or similar) to test with and I don't use ZFS
> > at all.)
> > I've cc'd Garrett Wollman, since he runs rather large servers and may
> > have some insight into appropriate tuning, etc.
>
> 10,000 is probably way too small. We run high-performance servers with
> vfs.nfsd.tcphighwater set between 100k and 150k, and we crank
> vfs.nfsd.tcpcachetimeo down to five minutes or less.
>
> Just to give you an idea of how rarely this cache is actually hit: my
> two main production file servers have both been up for about three
> months now and have answered billions of requests (enough for the
> 32-bit signed statistics counters to wrap). One server shows 63 hits,
> with a peak TCP cache size of 150k, and the other shows zero, with a
> peak cache size of 64k. Another server, which serves scratch space,
> has been up for a little more than a month and in nearly two billion
> accesses has yet to see a single cache hit (peak cache size 131k,
> which was actually hitting the configured limit, which I've since
> raised).
>
> -GAWollman
>
Yes. The cache is only hit if a client is network partitioned for long
enough that it does an RPC retry over TCP.
Most clients only do this now (this behaviour is required for NFSv4) when
the client establishes a new TCP connection after giving up on the old one.
(How quickly this occurs will depend on the client, but I am not surprised
it is rare in a well-maintained LAN environment.) You should get your users
to do their mounts over flaky WiFi links and such, in order to make better
use of the cache ;-)

By the way, Garrett, what do you have kern.ipc.nmbclusters set to? Cache
entries will normally use mbuf clusters.

And Garrett, thanks for your input,
rick
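
For reference, a minimal sketch of the tuning being discussed in this
thread, assuming a server new enough to have the DRC sysctls mentioned
above; the values are illustrative only, not recommendations, and should be
sized to your own client count and memory:

    # /etc/sysctl.conf -- illustrative values only
    vfs.nfsd.tcphighwater=100000    # upper bound on cached TCP DRC entries
    vfs.nfsd.tcpcachetimeo=300      # seconds before an idle entry can be trimmed
    kern.ipc.nmbclusters=262144     # DRC entries are held in mbuf clusters
                                    # (often set in /boot/loader.conf instead)

    # Check how often the duplicate request cache is hit and how large it
    # has grown (new NFS server statistics, see nfsstat(1)):
    nfsstat -e -s

The nmbclusters line is the point of Rick's question: each cached entry
normally pins an mbuf cluster, so a DRC that is allowed to grow to 100k+
entries needs a cluster pool sized well above that.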