From owner-freebsd-net@FreeBSD.ORG Sun Jan 19 08:47:27 2014
Date: Sun, 19 Jan 2014 03:47:25 -0500
Subject: Terrible NFS performance under 9.2-RELEASE?
From: J David
To: freebsd-net@freebsd.org, freebsd-stable, freebsd-virtualization@freebsd.org

While setting up a test for other purposes, I noticed some really
horrible NFS performance issues.  To explore this, I set up a test
environment with two FreeBSD 9.2-RELEASE-p3 virtual machines running
under KVM.  The NFS server is configured to serve a 2 gig mfs on /mnt.

The performance of the virtual network is outstanding:

Server:

$ iperf -c 172.20.20.169
------------------------------------------------------------
Client connecting to 172.20.20.169, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  3] local 172.20.20.162 port 59717 connected with 172.20.20.169 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  16.1 GBytes  13.8 Gbits/sec
$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  4] local 172.20.20.162 port 5001 connected with 172.20.20.169 port 45655
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  15.8 GBytes  13.6 Gbits/sec

Client:

$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  4] local 172.20.20.169 port 5001 connected with 172.20.20.162 port 59717
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  16.1 GBytes  13.8 Gbits/sec
^C$ iperf -c 172.20.20.162
------------------------------------------------------------
Client connecting to 172.20.20.162, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  3] local 172.20.20.169 port 45655 connected with 172.20.20.162 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  15.8 GBytes  13.6 Gbits/sec

The performance of the mfs filesystem on the server is also good.

Server:

$ sudo mdconfig -a -t swap -s 2g
md0
$ sudo newfs -U -b 4k -f 4k /dev/md0
/dev/md0: 2048.0MB (4194304 sectors) block size 4096, fragment size 4096
        using 43 cylinder groups of 48.12MB, 12320 blks, 6160 inodes.
        with soft updates
super-block backups (for fsck_ffs -b #) at:
 144, 98704, 197264, 295824, 394384, 492944, 591504, 690064, 788624,
 887184, 985744, 1084304, 1182864, 1281424, 1379984, 1478544, 1577104,
 1675664, 1774224, 1872784, 1971344, 2069904, 2168464, 2267024, 2365584,
 2464144, 2562704, 2661264, 2759824, 2858384, 2956944, 3055504, 3154064,
 3252624, 3351184, 3449744, 3548304, 3646864, 3745424, 3843984, 3942544,
 4041104, 4139664
$ sudo mount /dev/md0 /mnt
$ cd /mnt
$ sudo iozone -e -I -s 512m -r 4k -i 0 -i 1 -i 2
        Iozone: Performance Test of File I/O
                Version $Revision: 3.420 $
[...]
                                                    random  random
      KB  reclen   write rewrite    read    reread    read   write
  524288       4  560145 1114593  933699   831902   56347  158904

iozone test complete.

But introduce NFS into the mix and everything falls apart.

Client:

$ sudo mount -o tcp,nfsv3 f12.phxi:/mnt /mnt
$ cd /mnt
$ sudo iozone -e -I -s 512m -r 4k -i 0 -i 1 -i 2
        Iozone: Performance Test of File I/O
                Version $Revision: 3.420 $
[...]
                                                    random  random
      KB  reclen   write rewrite    read    reread    read   write
  524288       4   67246    2923  103295  1272407  172475     196

And the above took 48 minutes to run, compared to 14 seconds for the
local version.  So it's 200x slower over NFS.  The random write test is
over 800x slower.  Of course NFS is slower, that's expected, but it
definitely wasn't this exaggerated in previous releases.
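For reference, the slowdown factors quoted above fall straight out of
the two iozone result rows (this is just my arithmetic on the numbers
already shown, all figures in KB/s):

```python
# Back-of-the-envelope check of the slowdown factors, using the
# iozone throughput numbers (KB/s) from the local and NFS runs.
local_rand_write = 158904   # local mfs, random write
nfs_rand_write = 196        # same test over NFS

# Wall-clock: ~48 minutes over NFS vs ~14 seconds locally.
overall = 48 * 60 / 14
rand_write = local_rand_write / nfs_rand_write

print(f"overall run: ~{overall:.0f}x slower")
print(f"random write: ~{rand_write:.0f}x slower")
```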
To emphasize that iozone reflects real workloads here, I tried doing an
svn co of the 9-STABLE source tree over NFS, but after two hours it was
still in llvm, so I gave up.

While all this not-much-of-anything NFS traffic is going on, both
systems are essentially idle.  The process on the client sits in
"newnfs" wait state with nearly no CPU.  The server is completely idle
except for the occasional 0.10% in an nfsd thread; the nfsd threads
otherwise spend their lives in rpcsvc wait state.

Server iostat:

$ iostat -x -w 10 md0
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
[...]
md0        0.0  36.0     0.0     0.0    0   1.2   0
md0        0.0  38.8     0.0     0.0    0   1.5   0
md0        0.0  73.6     0.0     0.0    0   1.0   0
md0        0.0  53.3     0.0     0.0    0   2.5   0
md0        0.0  33.7     0.0     0.0    0   1.1   0
md0        0.0  45.5     0.0     0.0    0   1.8   0

Server nfsstat:

$ nfsstat -s -w 10
 GtAttr Lookup Rdlink   Read  Write Rename Access  Rddir
[...]
      0      0      0    471    816      0      0      0
      0      0      0    480    751      0      0      0
      0      0      0    481     36      0      0      0
      0      0      0    469    550      0      0      0
      0      0      0    485    814      0      0      0
      0      0      0    467    503      0      0      0
      0      0      0    473    345      0      0      0

Client nfsstat:

$ nfsstat -c -w 10
 GtAttr Lookup Rdlink   Read  Write Rename Access  Rddir
[...]
      0      0      0      0    518      0      0      0
      0      0      0      0    498      0      0      0
      0      0      0      0    503      0      0      0
      0      0      0      0    474      0      0      0
      0      0      0      0    525      0      0      0
      0      0      0      0    497      0      0      0

Server vmstat:

$ vmstat -w 10
 procs      memory      page                    disks     faults         cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr vt0 vt1   in   sy   cs us sy id
[...]
 0 4 0    634M  6043M    37   0   0   0     1   0   0   0 1561   46 3431  0  2 98
 0 4 0    640M  6042M    62   0   0   0    28   0   0   0 1598   94 3552  0  2 98
 0 4 0    648M  6042M    38   0   0   0     0   0   0   0 1609   47 3485  0  1 99
 0 4 0    648M  6042M    37   0   0   0     0   0   0   0 1615   46 3667  0  2 98
 0 4 0    648M  6042M    37   0   0   0     0   0   0   0 1606   45 3678  0  2 98
 0 4 0    648M  6042M    37   0   0   0     0   0   1   0 1561   45 3377  0  2 98

Client vmstat:

$ vmstat -w 10
 procs      memory      page                    disks     faults         cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr md0 da0   in   sy   cs us sy id
[...]
 0 0 0    639M   593M    33   0   0   0  1237   0   0   0  281 5575 1043  0  3 97
 0 0 0    639M   591M     0   0   0   0   712   0   0   0  235  122  889  0  2 98
 0 0 0    639M   583M     0   0   0   0   571   0   0   1  227  120  851  0  2 98
 0 0 0    639M   592M   198   0   0   0  1212   0   0   0  251 2497  950  0  3 97
 0 0 0    639M   586M     0   0   0   0   614   0   0   0  250  121  924  0  2 98
 0 0 0    639M   586M     0   0   0   0   765   0   0   0  250  120  918  0  3 97

Top on the KVM host says it is 93-95% idle and that each VM sits around
7-10% CPU.  So basically nobody is doing anything.  There's no visible
bottleneck, and I've no idea where to go from here to figure out what's
going on.

Does anyone have any suggestions for debugging this?

Thanks!
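One sanity check on the numbers above (my arithmetic, under the
assumption that each nfsstat -w 10 row is a count of RPCs seen during
the 10-second interval rather than a per-second rate): the server-side
write counts at the 4 KB record size work out to roughly 200 KB/s, the
same ballpark as the 196 KB/s iozone random-write figure.

```python
# Sanity-check the server-side nfsstat write counts against iozone.
# Assumption: "nfsstat -s -w 10" prints counts per 10-second interval.
interval_s = 10
record_kb = 4  # iozone -r 4k

# Write counts per interval from the server nfsstat output above.
writes_per_interval = [816, 751, 36, 550, 814, 503, 345]
avg_writes = sum(writes_per_interval) / len(writes_per_interval)

throughput_kb_s = avg_writes * record_kb / interval_s
print(f"~{avg_writes:.0f} write RPCs / 10 s -> ~{throughput_kb_s:.0f} KB/s")
```

In other words, the server is committing only a couple hundred KB/s of
4 KB writes while otherwise sitting idle, which matches the iozone
result rather than any hardware or network limit.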