From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 13:49:39 2013
Date: Thu, 29 Aug 2013 09:49:37 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: "Sam Fourman Jr."
Cc: FreeBSD FS <freebsd-fs@freebsd.org>
Subject: Re: NFS on ZFS pure SSD pool
Message-ID: <1079189088.15197172.1377784177805.JavaMail.root@uoguelph.ca>

Sam Fourman Jr. wrote:
> On Wed, Aug 28, 2013 at 2:27 PM, Eric Browning
> <ericbrowning@skaggscatholiccenter.org> wrote:
>
> > Rick,
> >
> > Sam and I applied the patch (kernel now at r254983M) and set
> > vfs.nfsd.tcphighwater=5000 in sysctl.conf, and my CPU is still
> > slammed. Should I up it to 10000?
>
> Hello, list,
>
> I am helping Eric debug and test this situation as much as I can.
>
> To clarify and recap, here is the situation:
>
> This is a production setting, in a school, with 200+ students using
> a mix of systems. The primary client is OS X 10.8, and the primary
> function is NFS.

I haven't touched a Mac in several years, but I think Finder probes at
regular intervals to see if directories (oops, I meant folders;-) have
changed. I think there is a way to increase the interval between
probes.

Also, I think there are tunables for the metadata cache in ZFS, which
might be useful for increasing the metadata cache size, since the
probes will be checking metadata (attributes). If this is contributing
to the heavy load, I'd expect "nfsstat -e -s" to show a large count
for Getattr. (I vaguely remember that NFSv4 mounts were in use. The
counts of everything are larger for NFSv4, since they are counts of
operations and not RPCs; each NFSv4 RPC is a compound made up of N
operations.)
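Something like the following would show whether that is happening (a
sketch; vfs.zfs.arc_meta_used and vfs.zfs.arc_meta_limit are the stock
ZFS sysctl names on 9-STABLE, and the 4 GB value is only illustrative):

    # Server-side NFS operation counts; a Getattr count that grows
    # quickly under load would point at Finder's attribute probing.
    nfsstat -e -s

    # How much of the ARC metadata limit is in use right now:
    sysctl vfs.zfs.arc_meta_used vfs.zfs.arc_meta_limit

    # To raise the limit, set the tunable in /boot/loader.conf
    # (illustrative 4 GB value, in bytes) and reboot:
    vfs.zfs.arc_meta_limit="4294967296"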
Just something that might be worth looking at, rick

> From what I can see, there should be plenty of disk I/O;
> these are Intel SSD disks.
>
> The server is running FreeBSD 9-STABLE r254983 (we patched it last
> night) with this patch:
> http://people.freebsd.org/~rmacklem/drc4-stable9.patch
>
> Here is a full dmesg for reference (it states FreeBSD 9.1, but we
> have since upgraded and applied the above patch):
> https://gist.github.com/sfourman/6373059
>
> The main problem is that we need better performance from NFS, but it
> would appear the server is starved for CPU cycles. With only a few
> clients the server is lightning fast, but with 25 users logging in
> this morning (students in class) the server went right to 1200% CPU
> load, with about 300% more going to "intr", and it pretty much
> stayed there all day until they logged out between classes.
>
> So that works out to be somewhere between 2 and 4 users per core.

I'm not the guy to be able to help with how to do it, but profiling
the running kernel, to try and see where the CPU is being used, could
help. (At this point, I suspect it isn't in the NFS code, since the
DRC seems to be the only CPU hog and I think the patch you are already
using fixes that.)

Good luck with it, rick

> During today's classes, different settings for vfs.nfsd.tcphighwater
> were tested; values ranging from 5,000 up to 50,000 were tried while
> a load was present, but the processor load didn't change.
>
> Garrett stated that he tried values upwards of 100,000... this can
> be tested tomorrow.
>
> It would be helpful if we could get some direction on other things
> we might try tomorrow.
>
> One idea: the server has several igb Ethernet interfaces with 8
> queues per interface. Is it worth forcing the interfaces down to one
> queue? Is NFS even set up to understand multi-queue network devices,
> or doesn't it matter?
>
> Any thoughts are appreciated.
>
> --
> Sam Fourman Jr.
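On the igb question: the per-interface queue count can be capped with
a loader tunable, so the single-queue idea is easy to test (a sketch;
hw.igb.num_queues is the igb(4) boot-time tunable, and a reboot is
needed for it to take effect):

    # In /boot/loader.conf -- cap every igb interface at one queue
    # (0 means "use the driver default"):
    hw.igb.num_queues=1

    # After rebooting, the per-queue interrupt lines in "vmstat -i"
    # (irqNNN: igb0:que 0, que 1, ...) should drop to a single queue:
    vmstat -i | grep igb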