From: Thomas Johnson
To: Rick Macklem
Cc: freebsd-fs@freebsd.org
Date: Fri, 26 Oct 2012 17:47:47 -0500
Subject: Re: Poor throughput using new NFS client (9.0) vs. old (8.2/9.0)
In-Reply-To: <86699361.2739800.1351035439228.JavaMail.root@erie.cs.uoguelph.ca>

You are exactly correct. I went back to the logs; apparently, when we
tried changing the newnfs wsize/rsize parameters, we _changed_ them to
64k (derp). Further benchmarking indicates that with newnfs, we see the
best performance at 16k and 32k; 8k also performs quite well. 9.0 vs.
9.1 seems very close as well, though it is difficult to draw
conclusions from a busy production system.

Good to know that this is a case of PEBKAC, rather than an actual
problem. Thanks to everyone for the assistance!

On Tue, Oct 23, 2012 at 6:37 PM, Rick Macklem wrote:
> Thomas Johnson wrote:
> > I built a test image based on 9.1-rc2, per your suggestion Rick. The
> > results are below. I was not able to exactly reproduce the workload in
> > my original message, so I have also included results for the new (very
> > similar) workload on my 9.0 client image as well.
> >
> > To summarize, 9.1-rc2 using newnfs seems to perform better than
> > 9.0-p4, but oldnfs appears to still be significantly faster in both
> > cases.
> >
> > I will get packet traces to Rick, but I want to get new results to the
> > list.
> >
> > -Tom
> >
> > root@test:/test-> uname -a
> > FreeBSD test.claimlynx.com 9.1-RC2 FreeBSD 9.1-RC2 #1: Fri Oct 19
> > 08:27:12 CDT 2012
> > root@builder.claimlynx.com:/usr/obj/usr/src/sys/GENERIC amd64
> >
> > root@test:/-> mount | grep test
> > server:/array/test on /test (nfs)
> > root@test:/test-> zip BIGGER_PILE.zip BIG_PILE_53*
> > adding: BIG_PILE_5306.zip (stored 0%)
> > adding: BIG_PILE_5378.zip (stored 0%)
> > adding: BIG_PILE_5386.zip (stored 0%)
> > root@test:/test-> ll -h BIGGER_PILE.zip
> > -rw-rw-r-- 1 root claimlynx 5.5M Oct 23 14:05 BIGGER_PILE.zip
> > root@test:/test-> time zip BIGGER_PILE.zip 53*.zip > /dev/null
> > 0.664u 1.693s 0:30.21 7.7% 296+3084k 0+2926io 0pf+0w
> > 0.726u 0.989s 0:08.04 21.1% 230+2667k 0+2956io 0pf+0w
> > 0.829u 1.268s 0:11.89 17.4% 304+3037k 0+2961io 0pf+0w
> > 0.807u 0.902s 0:08.02 21.1% 233+2676k 0+2947io 0pf+0w
> > 0.753u 1.354s 0:12.73 16.4% 279+2879k 0+2947io 0pf+0w
> > root@test:/test-> ll -h BIGGER_PILE.zip
> > -rw-rw-r-- 1 root claimlynx 89M Oct 23 14:03 BIGGER_PILE.zip
> >
> Although the runs take much longer (I have no idea why and hopefully
> I can spot something in the packet traces), it shows about half the
> I/O ops. This suggests that it is running at the 64K rsize, wsize
> instead of the 32K used by the old client.
>
> Just to confirm. Did you run a test using the new nfs client
> with rsize=32768,wsize=32768 mount options, so the I/O size is
> the same as with the old client?
>
> rick
>
> >
> > root@test:/test-> mount | grep test
> > server:/array/test on /test (oldnfs)
> > root@test:/test-> time zip BIGGER_PILE.zip 53*.zip > /dev/null
> > 0.645u 1.435s 0:08.05 25.7% 295+3044k 0+5299io 0pf+0w
> > 0.783u 0.993s 0:06.48 27.3% 225+2499k 0+5320io 0pf+0w
> > 0.787u 1.000s 0:06.28 28.3% 246+2884k 0+5317io 0pf+0w
> > 0.707u 1.392s 0:07.94 26.3% 266+2743k 0+5313io 0pf+0w
> > 0.709u 1.056s 0:06.08 28.7% 246+2814k 0+5318io 0pf+0w
> >
> >
> > root@test:/home/tom-> uname -a
> > FreeBSD test.claimlynx.com 9.0-RELEASE-p4 FreeBSD 9.0-RELEASE-p4 #0:
> > Tue Sep 18 11:51:11 CDT 2012
> > root@builder.claimlynx.com:/usr/obj/usr/src/sys/GENERIC amd64
> >
> > root@test:/test-> mount | grep test
> > server:/array/test on /test (nfs)
> > root@test:/test-> time zip BIGGER_PILE.zip 53*.zip > /dev/null
> > 0.721u 1.819s 0:31.13 8.1% 284+2886k 0+2932io 0pf+0w
> > 0.725u 1.386s 0:12.84 16.3% 247+2631k 0+2957io 0pf+0w
> > 0.675u 1.392s 0:13.94 14.7% 300+3005k 0+2928io 0pf+0w
> > 0.705u 1.206s 0:10.72 17.7% 278+2874k 0+2973io 0pf+0w
> > 0.727u 1.200s 0:18.28 10.5% 274+2872k 0+2947io 0pf+0w
> >
> > root@test:/-> umount /test
> > root@test:/-> mount -t oldnfs server:/array/test /test
> > root@test:/-> mount | grep test
> > server:/array/test on /test (oldnfs)
> > root@test:/test-> time zip BIGGER_PILE.zip 53*.zip > /dev/null
> > 0.694u 1.820s 0:10.82 23.1% 271+2964k 0+5320io 0pf+0w
> > 0.726u 1.293s 0:06.37 31.5% 303+2998k 0+5322io 0pf+0w
> > 0.717u 1.248s 0:06.08 32.0% 246+2607k 0+5354io 0pf+0w
> > 0.733u 1.230s 0:06.17 31.7% 256+2536k 0+5311io 0pf+0w
> > 0.549u 1.581s 0:08.02 26.4% 302+3116k 0+5321io 0pf+0w
> >
> >
> > On Thu, Oct 18, 2012 at 5:11 PM, Rick Macklem <rmacklem@uoguelph.ca>
> > wrote:
> >
> > Ronald Klop wrote:
> > > On Thu, 18 Oct 2012 18:16:16 +0200, Thomas Johnson
> > > <tom@claimlynx.com> wrote:
> > >
> > > > We recently upgraded a number of hosts from FreeBSD 8.2 to 9.0.
> > > > Almost immediately, we received reports from users of poor
> > > > performance. The upgraded hosts are PXE-booted, with an
> > > > NFS-mounted root. Additionally, they mount a number of other NFS
> > > > shares, which is where our users work from. After a week of
> > > > tweaking rsize/wsize/readahead parameters (per guidance), it
> > > > finally occurred to me that 9.0 defaults to the new NFS client
> > > > and server. I remounted the user shares using the oldnfs file
> > > > type, and users reported that performance returned to its
> > > > expected level.
> > > >
> > > > This is obviously a workaround, rather than a solution. We would
> > > > prefer to get our hosts using the newnfs client, since presumably
> > > > oldnfs will be deprecated at some point in the future. Is there
> > > > some change that we should have made to our NFS configuration
> > > > with the upgrade to 9.0, or is it possible that our workload is
> > > > exposing some deficiency with newnfs? We tend to deal with a huge
> > > > number of tiny files (several KB in size). The NFS server has
> > > > been running 9.0 for some time (prior to the client upgrade)
> > > > without any issue. NFS is served from a zpool, backed by a Dell
> > > > MD3000, populated with 15k SAS disks. Clients and server are
> > > > connected with Gig-E links. The general hardware configuration
> > > > has not changed in nearly 3 years.
> > > >
> > > > As an example of the performance difference, here is some of the
> > > > testing I did while troubleshooting. Given a directory containing
> > > > 5671 zip files, with an average size of 15KB, I append all files
> > > > to an existing zip file.
> > > > Using the newnfs mount, I found that this operation generally
> > > > takes ~30 seconds (wall time). Switching the mount to oldnfs
> > > > resulted in the same operation taking ~10 seconds.
> > > >
> > > > tom@test-1:/test-> ls 53*zip | wc -l
> > > > 5671
> > > > tom@test-1:/test-> ll -h BIG*
> > > > -rw-rw-r-- 1 tom claimlynx 8.9M Oct 17 14:06 BIGGER_PILE_1.zip
> > > > tom@test-1:/test-> time zip BIGGER_PILE_1.zip 53*.zip
> > > > 0.646u 0.826s 0:51.01 2.8% 199+2227k 0+2769io 0pf+0w
> > > > ...reset and repeat...
> > > > 0.501u 0.629s 0:30.49 3.6% 208+2319k 0+2772io 0pf+0w
> > > > ...reset and repeat...
> > > > 0.601u 0.522s 0:32.37 3.4% 220+2406k 0+2771io 0pf+0w
> > > >
> > > > tom@test-1:/-> cd /
> > > > tom@test-1:/-> sudo umount /test
> > > > tom@test-1:/-> sudo mount -t oldnfs -o rw server:/array/test /test
> > > > tom@test-1:/-> mount | grep test
> > > > server:/array/test on /test (oldnfs)
> > > > tom@test-1:/-> cd /test
> > > > ...reset and repeat...
> > > > 0.470u 0.903s 0:13.09 10.4% 203+2229k 0+5107io 0pf+0w
> > > > ...reset and repeat...
> > > > 0.547u 0.640s 0:08.65 13.6% 231+2493k 0+5086io 0pf+0w
> > > > tom@test-1:/test-> ll -h BIG*
> > > > -rw-rw-r-- 1 tom claimlynx 92M Oct 17 14:14 BIGGER_PILE_1.zip
> > > >
> > > > Thanks!
> > > >
> > >
> > > You might find this thread from today interesting.
> > > http://lists.freebsd.org/pipermail/freebsd-fs/2012-October/015441.html
> > >
> > Yes, although I can't explain why Alexey's problem went away
> > when he went from 9.0->9.1 for his NFS server, it would be
> > interesting if Thomas could try the same thing?
> >
> > About the only thing different between the old and new NFS
> > clients is the default rsize/wsize. However, if Thomas tried
> > rsize=32768,wsize=32768 for the default (new) NFS client, then
> > that would be ruled out. To be honest, the new client uses code
> > cloned from the old one for all the caching etc (which is where
> > the clients are "smart").
> > They use different RPC parsing code,
> > since the new one does NFSv4 as well, but that code is pretty
> > straightforward, so I can't think why it would result in a
> > factor of 3 in performance.
> >
> > If Thomas were to capture a packet trace of the above test
> > for two clients and emailed them to me, I could take a look
> > and see if I can see what is going on. (For Alexey's case,
> > it was a whole bunch of Read RPCs without replies, but that
> > was a Linux client, of course. It also had a significant # of
> > TCP layer retransmits and out of order TCP segments in it.)
> >
> > It would be nice to figure this out, since I was thinking
> > that the old client might go away for 10.0 (can't if these
> > issues still exist).
>
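
Rick's observation that the new client shows "about half the I/O ops" can be checked directly from the 9.1-RC2 `time` lines quoted above. The following is a minimal sketch, not something posted in the thread: the helper names (`parse_time_line`, `summarize`) are hypothetical, and it assumes the default csh/tcsh `time` layout with elapsed time in m:ss form. It reads the second number in the `N+Mio` field as the output operation count, following Rick's interpretation.

```python
import re
from statistics import mean

# Default csh/tcsh `time` layout, as seen in the transcripts:
#   user sys elapsed cpu% text+data k  in+out io  pf+w
# Assumption: elapsed is always m:ss (no hours field).
TIME_RE = re.compile(
    r"[\d.]+u\s+[\d.]+s\s+(?P<min>\d+):(?P<sec>[\d.]+)\s+[\d.]+%\s+"
    r"\d+\+\d+k\s+(?P<inops>\d+)\+(?P<outops>\d+)io"
)


def parse_time_line(line):
    """Return (elapsed_seconds, output_ops) for one `time` line."""
    m = TIME_RE.search(line)
    if m is None:
        raise ValueError(f"unrecognized time line: {line!r}")
    return int(m["min"]) * 60 + float(m["sec"]), int(m["outops"])


def summarize(lines):
    """Mean elapsed seconds and mean output ops over a set of runs."""
    runs = [parse_time_line(line) for line in lines]
    return mean(r[0] for r in runs), mean(r[1] for r in runs)


# 9.1-RC2 runs copied from the thread: default (new) client vs. oldnfs.
NEW_CLIENT = [
    "0.664u 1.693s 0:30.21 7.7% 296+3084k 0+2926io 0pf+0w",
    "0.726u 0.989s 0:08.04 21.1% 230+2667k 0+2956io 0pf+0w",
    "0.829u 1.268s 0:11.89 17.4% 304+3037k 0+2961io 0pf+0w",
    "0.807u 0.902s 0:08.02 21.1% 233+2676k 0+2947io 0pf+0w",
    "0.753u 1.354s 0:12.73 16.4% 279+2879k 0+2947io 0pf+0w",
]
OLD_CLIENT = [
    "0.645u 1.435s 0:08.05 25.7% 295+3044k 0+5299io 0pf+0w",
    "0.783u 0.993s 0:06.48 27.3% 225+2499k 0+5320io 0pf+0w",
    "0.787u 1.000s 0:06.28 28.3% 246+2884k 0+5317io 0pf+0w",
    "0.707u 1.392s 0:07.94 26.3% 266+2743k 0+5313io 0pf+0w",
    "0.709u 1.056s 0:06.08 28.7% 246+2814k 0+5318io 0pf+0w",
]

new_secs, new_ops = summarize(NEW_CLIENT)
old_secs, old_ops = summarize(OLD_CLIENT)
ops_ratio = new_ops / old_ops

print(f"new client: {new_secs:.2f}s mean wall time, {new_ops:.0f} output ops")
print(f"old client: {old_secs:.2f}s mean wall time, {old_ops:.0f} output ops")
print(f"output-op ratio (new/old): {ops_ratio:.2f}")
```

The ratio comes out at roughly 0.55, which is consistent with the new client writing in 64K units against the old client's 32K. It does not by itself explain the longer wall times.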