From owner-freebsd-net@FreeBSD.ORG Sun Jan 19 23:36:19 2014
Date: Sun, 19 Jan 2014 18:36:17 -0500 (EST)
From: Rick Macklem
To: Adam McDougall
Cc: freebsd-net@freebsd.org
Subject: Re: Terrible NFS performance under 9.2-RELEASE?
Message-ID: <1349281953.12559529.1390174577569.JavaMail.root@uoguelph.ca>
In-Reply-To: <52DC1241.7010004@egr.msu.edu>

Adam McDougall wrote:
> Also try rsize=32768,wsize=32768 in your mount options; it made a huge
> difference for me. I've noticed slow file transfers over NFS on 9 and
> finally did some searching a couple of months ago; someone suggested
> this and they were on to something.
>
Yes, it shouldn't make a big difference, but it sometimes does. When it
does, I believe that indicates a problem with your network fabric. The
problem might be TSO on segments near 64K in size, or some network
device that can't handle the larger bursts of received packets.
(Slight differences could also come from VM fragmentation caused by the
larger mapped buffer-cache blocks, but I'm pretty sure that wouldn't
cause a massive difference.)

rick

> On 01/19/2014 09:32, Alfred Perlstein wrote:
> > 9.x has pretty poor mbuf tuning by default.
> >
> > I hit nearly the same problem and raising the mbufs worked for me.
> >
> > I'd suggest raising that and retrying.
> >
> > -Alfred
> >
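For anyone who wants to try the two suggestions above, a rough sketch of
the commands involved (the vtnet0 interface name is only a guess for
these KVM guests, and the nmbclusters value is just an example; adjust
both for the real setup):

# See whether mbuf/cluster requests are actually being denied or delayed:
$ netstat -m

# If they are, raise the cluster limit (example value only):
$ sudo sysctl kern.ipc.nmbclusters=262144

# Remount with the smaller I/O sizes Adam mentioned:
$ sudo umount /mnt
$ sudo mount -o nfsv3,tcp,rsize=32768,wsize=32768 f12.phxi:/mnt /mnt

# Or rule TSO in or out by disabling it on the client and server NICs:
$ sudo ifconfig vtnet0 -tso
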
> > On 1/19/14 12:47 AM, J David wrote:
> >> While setting up a test for other purposes, I noticed some really
> >> horrible NFS performance issues.
> >>
> >> To explore this, I set up a test environment with two FreeBSD
> >> 9.2-RELEASE-p3 virtual machines running under KVM. The NFS server is
> >> configured to serve a 2 gig mfs on /mnt.
> >>
> >> The performance of the virtual network is outstanding:
> >>
> >> Server:
> >>
> >> $ iperf -c 172.20.20.169
> >> ------------------------------------------------------------
> >> Client connecting to 172.20.20.169, TCP port 5001
> >> TCP window size: 1.00 MByte (default)
> >> ------------------------------------------------------------
> >> [  3] local 172.20.20.162 port 59717 connected with 172.20.20.169 port 5001
> >> [ ID] Interval       Transfer     Bandwidth
> >> [  3]  0.0-10.0 sec  16.1 GBytes  13.8 Gbits/sec
> >>
> >> $ iperf -s
> >> ------------------------------------------------------------
> >> Server listening on TCP port 5001
> >> TCP window size: 1.00 MByte (default)
> >> ------------------------------------------------------------
> >> [  4] local 172.20.20.162 port 5001 connected with 172.20.20.169 port 45655
> >> [ ID] Interval       Transfer     Bandwidth
> >> [  4]  0.0-10.0 sec  15.8 GBytes  13.6 Gbits/sec
> >>
> >> Client:
> >>
> >> $ iperf -s
> >> ------------------------------------------------------------
> >> Server listening on TCP port 5001
> >> TCP window size: 1.00 MByte (default)
> >> ------------------------------------------------------------
> >> [  4] local 172.20.20.169 port 5001 connected with 172.20.20.162 port 59717
> >> [ ID] Interval       Transfer     Bandwidth
> >> [  4]  0.0-10.0 sec  16.1 GBytes  13.8 Gbits/sec
> >>
> >> ^C$ iperf -c 172.20.20.162
> >> ------------------------------------------------------------
> >> Client connecting to 172.20.20.162, TCP port 5001
> >> TCP window size: 1.00 MByte (default)
> >> ------------------------------------------------------------
> >> [  3] local 172.20.20.169 port 45655 connected with 172.20.20.162 port 5001
> >> [ ID] Interval       Transfer     Bandwidth
> >> [  3]  0.0-10.0 sec  15.8 GBytes  13.6 Gbits/sec
> >>
> >> The performance of the mfs filesystem on the server is also good.
> >>
> >> Server:
> >>
> >> $ sudo mdconfig -a -t swap -s 2g
> >> md0
> >> $ sudo newfs -U -b 4k -f 4k /dev/md0
> >> /dev/md0: 2048.0MB (4194304 sectors) block size 4096, fragment size 4096
> >>         using 43 cylinder groups of 48.12MB, 12320 blks, 6160 inodes.
> >>         with soft updates
> >> super-block backups (for fsck_ffs -b #) at:
> >>  144, 98704, 197264, 295824, 394384, 492944, 591504, 690064, 788624, 887184,
> >>  985744, 1084304, 1182864, 1281424, 1379984, 1478544, 1577104, 1675664,
> >>  1774224, 1872784, 1971344, 2069904, 2168464, 2267024, 2365584, 2464144,
> >>  2562704, 2661264, 2759824, 2858384, 2956944, 3055504, 3154064, 3252624,
> >>  3351184, 3449744, 3548304, 3646864, 3745424, 3843984, 3942544, 4041104,
> >>  4139664
> >> $ sudo mount /dev/md0 /mnt
> >> $ cd /mnt
> >> $ sudo iozone -e -I -s 512m -r 4k -i 0 -i 1 -i 2
> >>     Iozone: Performance Test of File I/O
> >>             Version $Revision: 3.420 $
> >> [...]
> >>                                                    random    random
> >>        KB  reclen    write  rewrite     read   reread      read     write
> >>    524288       4   560145  1114593   933699   831902     56347    158904
> >>
> >> iozone test complete.
> >>
> >> But introduce NFS into the mix and everything falls apart.
> >>
> >> Client:
> >>
> >> $ sudo mount -o tcp,nfsv3 f12.phxi:/mnt /mnt
> >> $ cd /mnt
> >> $ sudo iozone -e -I -s 512m -r 4k -i 0 -i 1 -i 2
> >>     Iozone: Performance Test of File I/O
> >>             Version $Revision: 3.420 $
> >> [...]
> >>                                                    random    random
> >>        KB  reclen    write  rewrite     read   reread      read     write
> >>    524288       4    67246     2923   103295  1272407    172475       196
> >>
> >> And the above took 48 minutes to run, compared to 14 seconds for the
> >> local version. So it's 200x slower over NFS. The random write test
> >> is over 800x slower. Of course NFS is slower, that's expected, but it
> >> definitely wasn't this exaggerated in previous releases.
> >>
> >> To emphasize that iozone reflects real workloads here, I tried doing
> >> an svn co of the 9-STABLE source tree over NFS, but after two hours
> >> it was still in llvm, so I gave up.
> >>
> >> While all this not-much-of-anything NFS traffic is going on, both
> >> systems are essentially idle. The process on the client sits in
> >> "newnfs" wait state with nearly no CPU. The server is completely
> >> idle except for the occasional 0.10% in an nfsd thread, which
> >> otherwise spends its life in rpcsvc wait state.
> >>
> >> Server iostat:
> >>
> >> $ iostat -x -w 10 md0
> >>                         extended device statistics
> >> device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
> >> [...]
> >> md0        0.0  36.0     0.0     0.0    0   1.2   0
> >> md0        0.0  38.8     0.0     0.0    0   1.5   0
> >> md0        0.0  73.6     0.0     0.0    0   1.0   0
> >> md0        0.0  53.3     0.0     0.0    0   2.5   0
> >> md0        0.0  33.7     0.0     0.0    0   1.1   0
> >> md0        0.0  45.5     0.0     0.0    0   1.8   0
> >>
> >> Server nfsstat:
> >>
> >> $ nfsstat -s -w 10
> >>  GtAttr Lookup Rdlink   Read  Write Rename Access  Rddir
> >> [...]
> >>       0      0      0    471    816      0      0      0
> >>       0      0      0    480    751      0      0      0
> >>       0      0      0    481     36      0      0      0
> >>       0      0      0    469    550      0      0      0
> >>       0      0      0    485    814      0      0      0
> >>       0      0      0    467    503      0      0      0
> >>       0      0      0    473    345      0      0      0
> >>
> >> Client nfsstat:
> >>
> >> $ nfsstat -c -w 10
> >>  GtAttr Lookup Rdlink   Read  Write Rename Access  Rddir
> >> [...]
> >>       0      0      0      0    518      0      0      0
> >>       0      0      0      0    498      0      0      0
> >>       0      0      0      0    503      0      0      0
> >>       0      0      0      0    474      0      0      0
> >>       0      0      0      0    525      0      0      0
> >>       0      0      0      0    497      0      0      0
> >>
> >> Server vmstat:
> >>
> >> $ vmstat -w 10
> >>  procs      memory      page                    disks     faults      cpu
> >>  r b w    avm    fre   flt  re  pi  po    fr  sr vt0 vt1   in   sy   cs us sy id
> >> [...]
> >>  0 4 0   634M  6043M    37   0   0   0     1   0   0   0 1561   46 3431  0  2 98
> >>  0 4 0   640M  6042M    62   0   0   0    28   0   0   0 1598   94 3552  0  2 98
> >>  0 4 0   648M  6042M    38   0   0   0     0   0   0   0 1609   47 3485  0  1 99
> >>  0 4 0   648M  6042M    37   0   0   0     0   0   0   0 1615   46 3667  0  2 98
> >>  0 4 0   648M  6042M    37   0   0   0     0   0   0   0 1606   45 3678  0  2 98
> >>  0 4 0   648M  6042M    37   0   0   0     0   0   1   0 1561   45 3377  0  2 98
> >>
> >> Client vmstat:
> >>
> >> $ vmstat -w 10
> >>  procs      memory      page                    disks     faults      cpu
> >>  r b w    avm    fre   flt  re  pi  po    fr  sr md0 da0   in   sy   cs us sy id
> >> [...]
> >>  0 0 0   639M   593M    33   0   0   0  1237   0   0   0  281 5575 1043  0  3 97
> >>  0 0 0   639M   591M     0   0   0   0   712   0   0   0  235  122  889  0  2 98
> >>  0 0 0   639M   583M     0   0   0   0   571   0   0   1  227  120  851  0  2 98
> >>  0 0 0   639M   592M   198   0   0   0  1212   0   0   0  251 2497  950  0  3 97
> >>  0 0 0   639M   586M     0   0   0   0   614   0   0   0  250  121  924  0  2 98
> >>  0 0 0   639M   586M     0   0   0   0   765   0   0   0  250  120  918  0  3 97
> >>
> >> Top on the KVM host says it is 93-95% idle and that each VM sits
> >> around 7-10% CPU. So basically nobody is doing anything. There's no
> >> visible bottleneck, and I've no idea where to go from here to figure
> >> out what's going on.
> >>
> >> Does anyone have any suggestions for debugging this?
> >>
> >> Thanks!
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
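As a follow-up to the debugging question and the TSO theory above, a
couple of standard tools should show whether the transfers are stalling
on retransmissions; this is only a sketch, and the vtnet0 interface name
is again an assumption for these KVM guests:

# Watch TCP retransmission and out-of-order counters while iozone is stalled:
$ netstat -s -p tcp | egrep 'retransmitted|out-of-order'

# Capture the NFS traffic on either side and look for long gaps and
# retransmissions in wireshark afterwards:
$ sudo tcpdump -i vtnet0 -s 0 -w /tmp/nfs.pcap port 2049

# TSO can also be disabled globally rather than per interface:
$ sudo sysctl net.inet.tcp.tso=0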