From owner-freebsd-fs@FreeBSD.ORG Mon Jul 25 21:58:36 2011
Date: Mon, 25 Jul 2011 17:58:35 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Zack Kirsch
Cc: freebsd-fs@freebsd.org
Subject: Re: nfsd server cache flooded, try to increase nfsrc_floodlevel
Message-ID: <957583241.989932.1311631115955.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <476FC2247D6C7843A4814ED64344560C04443EAA@seaxch10.desktop.isilon.com>

Zack Kirsch wrote:
> Just wanted to add a bit of Isilon color. We've hit this limit before,
> but I believe it was mostly due to strange client behavior of 1) using
> a new lockowner for each lock and 2) using a new TCP connection for
> each 'test run'.

When I saw this before, I remarked that this shouldn't be relevant. I
realize now that you were referring to a test environment (not a real
NFS client) that keeps creating new TCP connections, even when the
previous connection wasn't broken by a network partition or similar.
Sorry about that.

> As far as I know, we haven't hit this in the field.

It appears that this case was the result of an old Linux NFSv4 client
and was resolved via a kernel upgrade. (I.e., I suspect there are
others out there that will run into the same thing sooner or later.)

> We've done a few things to combat this problem:
> 1) We increased the floodlevel to 65536.
> 2) We made the floodlevel configurable via sysctl.
> 3) We made significant changes to the replay cache itself. Specific
>    gains were drastic performance improvements and freeing of cache
>    entries from stale TCP connections.
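[As an aside, for anyone wanting to try something like 2) on a stock
kernel: a tunable like that would normally be exposed through the
kernel's SYSCTL macros. The fragment below is only a minimal sketch of
the idea, not Isilon's actual change; the variable name, default value,
and the vfs.nfsd node it hangs off are my assumptions.]

#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/sysctl.h>

/* Hypothetical tunable replacing a compile-time flood level. */
static u_int nfsrc_floodlevel = 16384;

/*
 * Hang the knob off an assumed vfs.nfsd node, so it could be bumped at
 * runtime with something like "sysctl vfs.nfsd.nfsrc_floodlevel=65536".
 */
SYSCTL_DECL(_vfs_nfsd);
SYSCTL_UINT(_vfs_nfsd, OID_AUTO, nfsrc_floodlevel, CTLFLAG_RW,
    &nfsrc_floodlevel, 0, "Upper limit on NFSv4 server request cache size");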
It is important to note that the request cache holds onto replies for
inactive TCP connections, because it assumes that the client might be
network partitioned for long enough that it is forced to reconnect
using a fresh TCP connection and will then retry all outstanding RPCs.
This could take a looonnngggg time to happen, so these replies can't
be freed quickly, or the whole purpose of the cache (avoiding the
redoing of non-idempotent operations when an RPC is retried) is
defeated. The fact that some artificial test program (pynfs maybe?)
chooses to open fresh TCP connections isn't relevant, imho, since it
isn't a real client and, as far as I know, real clients only reconnect
when the old TCP connection no longer works.

I thought I'd try to clarify this for anyone interested, rick
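ps: To make the above concrete, here is a tiny userspace sketch of the
idea behind the request (replay) cache: the server keeps the reply for
each non-idempotent RPC, keyed by transaction id (xid), so a retry
arriving later (possibly over a fresh TCP connection) gets the stored
reply back instead of re-executing the operation. All names here are
made up for illustration; this is not the kernel implementation.

#include <stdio.h>

#define CACHE_SIZE 64

/* One cache slot: the saved reply for a given RPC transaction id. */
struct drc_entry {
	unsigned int xid;
	int valid;
	char reply[128];
};

static struct drc_entry cache[CACHE_SIZE];

/* Return the cached reply for xid, or NULL if it hasn't been seen. */
static const char *
drc_lookup(unsigned int xid)
{
	struct drc_entry *e = &cache[xid % CACHE_SIZE];

	return ((e->valid && e->xid == xid) ? e->reply : NULL);
}

/* Record the reply after executing a non-idempotent operation. */
static void
drc_store(unsigned int xid, const char *reply)
{
	struct drc_entry *e = &cache[xid % CACHE_SIZE];

	e->xid = xid;
	e->valid = 1;
	snprintf(e->reply, sizeof(e->reply), "%s", reply);
}

int
main(void)
{
	const char *r;

	/* First arrival of xid 7: execute the operation, cache the reply. */
	if (drc_lookup(7) == NULL)
		drc_store(7, "REMOVE ok");

	/*
	 * Retry of xid 7 after a reconnect: replay the cached reply
	 * rather than redoing the (non-idempotent) REMOVE.
	 */
	r = drc_lookup(7);
	printf("retry of xid 7 -> %s\n", r != NULL ? r : "(miss)");
	return (0);
}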