Date: Wed, 11 Sep 2013 07:54:10 -0400 (EDT)
From: Rick Macklem
To: Lars Eggert
Message-ID: <995078453.21651811.1378900450650.JavaMail.root@uoguelph.ca>
In-Reply-To: <404C967C-4BFB-424C-A5E0-6ACE6576249C@netapp.com>
Subject: Re: nfsd CPU usage?
Cc: stable@freebsd.org

Lars Eggert wrote:
> Hi,
>
> I'm seeing extremely high CPU usage with the new nfsd:
>
>   PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
>  2280 root     102    0  9932K  1376K *nfs_c  0 320:11 100.00% nfsd{nfsd: service}
>  2280 root     102    0  9932K  1376K CPU7    7 319:47 100.00% nfsd{nfsd: service}
>  2280 root     102    0  9932K  1376K CPU5    5 318:25 100.00% nfsd{nfsd: service}
>  2280 root     102    0  9932K  1376K CPU6    6 318:20 100.00% nfsd{nfsd: service}
>  2280 root      52    0  9932K  1376K CPU0    0 317:32 100.00% nfsd{nfsd: service}
>  2280 root     102    0  9932K  1376K *nfs_c  1 315:41  99.17% nfsd{nfsd: service}
>  2280 root      52    0  9932K  1376K *nfs_c  4 320:22  98.78% nfsd{nfsd: master}
>  2280 root     102    0  9932K  1376K *nfs_c  1 317:10  98.10% nfsd{nfsd: service}
>
> And this is at a few hundred KB/s with only a few clients:
>
> ifstat -i igb1 10
>        igb1
>  KB/s in  KB/s out
>   796.56    208.66
>   431.19    232.36
>   316.11    280.31
>  1005.96    523.42
>  1077.74    342.25
>   340.63    217.73
>  1067.96    330.56
>   487.91    235.61
>
> Any ideas?
>
> FreeBSD stanley.muccbc.hq.netapp.com 9.2-PRERELEASE FreeBSD
> 9.2-PRERELEASE #7: Wed Sep 4 11:06:31 CEST 2013
> root@stanley.muccbc.hq.netapp.com:/usr/obj/usr/src/sys/STANLEY
> amd64
>
> Thanks,
> Lars
>
There is a patch in head (r254337) that I believe handles this. It will be
MFC'd to stable/9 in about a week, unless someone finds problems with it
before then.

If you want a semantically equivalent (but uglier code) patch, you can
find it here:
  http://people.freebsd.org/~rmacklem/drc4-stable9.patch

After applying the patch, you need to set sysctl variable(s) to avoid
the aggressive trimming of stale DRC entries. Garrett Wollman suggests
the following for a large server:
  vfs.nfsd.tcphighwater=100000
  vfs.nfsd.tcpcachetimeout=300
(5 minutes instead of the default of several hours)

You can also use the sysctl
  vfs.nfsd.cachetcp=0
to disable use of the DRC for TCP. The old nfs server did not use the
DRC for TCP, the assumption being that TCP layer retransmits are good
enough to maintain reliable RPC transport. Unfortunately, you can get
file corruption when the server reboots or there is a network partition,
if the client chooses to redo the RPC over TCP (clients always do this
after having to create a new TCP connection). In other words,
vfs.nfsd.cachetcp=0 is roughly what the old nfsd did.

If you don't want to patch the 9.2 code, you can edit the sources
(sys/fs/nfsserver/nfs_nfsdcache.c) and change the line:
  static int nfsrc_tcpnonidempotent = 1;
to
  static int nfsrc_tcpnonidempotent = 0;
to do the same thing as vfs.nfsd.cachetcp=0.

rick
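
A rough sketch of how the two suggested tunables might be recorded so they survive a reboot, assuming the patch is applied so that both sysctls are available. The `set_tunable` helper and the scratch file name `./sysctl.conf.example` are made up for illustration; on a real FreeBSD server the file read at boot is /etc/sysctl.conf. The script only edits a file and does not touch the running kernel.

```shell
#!/bin/sh
# Sketch only: record the DRC tunables from this message in a
# sysctl.conf-style file. CONF defaults to a scratch file (hypothetical
# name); on a real server the boot-time file is /etc/sysctl.conf.
CONF=${CONF:-./sysctl.conf.example}

# set_tunable NAME VALUE (hypothetical helper):
# drop any earlier "NAME=" line, then append the new setting.
set_tunable() {
    name=$1
    value=$2
    tmp="${CONF}.tmp.$$"
    if [ -f "$CONF" ]; then
        # index() == 1 means the line starts with "NAME="; keep all other lines.
        awk -v key="${name}=" 'index($0, key) != 1' "$CONF" > "$tmp"
    else
        : > "$tmp"
    fi
    printf '%s=%s\n' "$name" "$value" >> "$tmp"
    mv "$tmp" "$CONF"
}

# Values Garrett Wollman suggests for a large server.
set_tunable vfs.nfsd.tcphighwater 100000
set_tunable vfs.nfsd.tcpcachetimeout 300   # seconds, i.e. 5 minutes

# To apply a value to the running (patched) server without rebooting:
#   sysctl vfs.nfsd.tcphighwater=100000
```

Running the script a second time leaves a single line per tunable, since the earlier setting is filtered out before the new one is appended.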