Date: Sun, 19 Jan 2014 23:11:09 -0500
From: J David <j.david.lists@gmail.com>
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: freebsd-net@freebsd.org, Adam McDougall <mcdouga9@egr.msu.edu>
Subject: Re: Terrible NFS performance under 9.2-RELEASE?
Message-ID: <CABXB=RTJi9cLFD3U3sVVOdAasfTwJKMcxvvr8mi+CmLrnu_FnQ@mail.gmail.com>
In-Reply-To: <1349281953.12559529.1390174577569.JavaMail.root@uoguelph.ca>
References: <52DC1241.7010004@egr.msu.edu>
 <1349281953.12559529.1390174577569.JavaMail.root@uoguelph.ca>
On Sun, Jan 19, 2014 at 9:32 AM, Alfred Perlstein <alfred@freebsd.org> wrote:
> I hit nearly the same problem and raising the mbufs worked for me.
>
> I'd suggest raising that and retrying.

That doesn't seem to be an issue here; mbufs are well below max on both
client and server, and all the "delayed"/"denied" lines are 0/0/0.

On Sun, Jan 19, 2014 at 12:58 PM, Adam McDougall <mcdouga9@egr.msu.edu> wrote:
> Also try rsize=32768,wsize=32768 in your mount options, made a huge
> difference for me.

This does make a difference, but inconsistently.  In order to test this
further, I created a Debian guest on the same host as these two FreeBSD
hosts and re-ran the tests with it acting as both client and server, for
both 32k and 64k.  Findings:

                                                        random   random
                       write  rewrite    read   reread    read    write
S:FBSD,C:FBSD,Z:64k    67246     2923  103295  1272407  172475      196
S:FBSD,C:FBSD,Z:32k    11951    99896  223787  1051948  223276    13686
S:FBSD,C:DEB,Z:64k     11414    14445   31554    30156   30368    13799
S:FBSD,C:DEB,Z:32k     11215    14442   31439    31026   29608    13769
S:DEB,C:FBSD,Z:64k     36844   173312  313919  1169426  188432    14273
S:DEB,C:FBSD,Z:32k     66928   120660  257830  1048309  225807    18103

So the rsize/wsize makes a difference between two FreeBSD nodes, but with
a Debian node as either client or server it no longer seems to matter
much.  And /proc/mounts on the Debian box confirms that it negotiates and
honors the 64k size as a client.
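To be concrete about what those mount options mean here, this is the shape
of the mounts being compared, plus the check on the Debian side (the
export path, mount point, and nfsv3 option are illustrative placeholders,
not copied from this setup):

# FreeBSD client, forcing 32k transfers:
# mount -t nfs -o tcp,nfsv3,rsize=32768,wsize=32768 172.20.20.162:/export /mnt

# Debian client, showing what was actually negotiated:
$ grep ' /mnt ' /proc/mounts | tr ',' '\n' | egrep 'rsize|wsize'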
On Sun, Jan 19, 2014 at 6:36 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
> Yes, it shouldn't make a big difference but it sometimes does. When it
> does, I believe that indicates there is a problem with your network
> fabric.

Given that this is an entirely virtual environment, if your belief is
correct, where would supporting evidence be found?  As far as I can tell,
there are no interface errors reported on the host (checking both taps
and the bridge) or any of the guests, nothing in sysctl dev.vtnet of
concern, etc.  Also, the improvement from using Debian on either side,
even with 64k sizes, seems counterintuitive.

To try to help vindicate the network stack, I did iperf -d between the
two FreeBSD nodes while the iozone was running:

Server:

$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  4] local 172.20.20.162 port 5001 connected with 172.20.20.169 port 37449
------------------------------------------------------------
Client connecting to 172.20.20.169, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  6] local 172.20.20.162 port 28634 connected with 172.20.20.169 port 5001
Waiting for server threads to complete. Interrupt again to force quit.
[ ID] Interval       Transfer     Bandwidth
[  6]  0.0-10.0 sec  15.8 GBytes  13.6 Gbits/sec
[  4]  0.0-10.0 sec  15.6 GBytes  13.4 Gbits/sec

Client:

$ iperf -c 172.20.20.162 -d
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.20.20.162, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  5] local 172.20.20.169 port 32533 connected with 172.20.20.162 port 5001
[  4] local 172.20.20.169 port 5001 connected with 172.20.20.162 port 36617
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-10.0 sec  15.6 GBytes  13.4 Gbits/sec
[  4]  0.0-10.0 sec  15.5 GBytes  13.3 Gbits/sec

mbuf usage is pretty low.

Server:

$ netstat -m
545/4075/4620 mbufs in use (current/cache/total)
535/1819/2354/131072 mbuf clusters in use (current/cache/total/max)
535/1641 mbuf+clusters out of packet secondary zone in use (current/cache)
0/2034/2034/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
1206K/12792K/13999K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

Client:

$ netstat -m
1841/3544/5385 mbufs in use (current/cache/total)
1172/1198/2370/32768 mbuf clusters in use (current/cache/total/max)
512/896 mbuf+clusters out of packet secondary zone in use (current/cache)
0/2314/2314/16384 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/8192 9k jumbo clusters in use (current/cache/total/max)
0/0/0/4096 16k jumbo clusters in use (current/cache/total/max)
2804K/12538K/15342K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines
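For anyone reproducing this: "well below max" means the current column of
netstat -m is nowhere near the cluster ceiling.  A quick way to see both,
assuming the stock 9.2 sysctl name:

$ sysctl kern.ipc.nmbclusters        # the ceiling netstat -m reports against
$ netstat -m | grep 'mbuf clusters'  # current/cache/total/max in one line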
Here's 60 seconds of netstat -ss for ip and tcp from the server with the
64k mount running iozone:

ip:
        4776 total packets received
        4758 packets for this host
        18 packets for unknown/unsupported protocol
        2238 packets sent from this host
tcp:
        2244 packets sent
                1427 data packets (238332 bytes)
                5 data packets (820 bytes) retransmitted
                812 ack-only packets (587 delayed)
        2235 packets received
                1428 acks (for 238368 bytes)
                2007 packets (91952792 bytes) received in-sequence
                225 out-of-order packets (325800 bytes)
        1428 segments updated rtt (of 1426 attempts)
        5 retransmit timeouts
        587 correct data packet header predictions
        225 SACK options (SACK blocks) sent

And with the 32k mount:

ip:
        24172 total packets received
        24167 packets for this host
        5 packets for unknown/unsupported protocol
        26130 packets sent from this host
tcp:
        26130 packets sent
                23506 data packets (5362120 bytes)
                2624 ack-only packets (454 delayed)
        21671 packets received
                18143 acks (for 5362192 bytes)
                20278 packets (756617316 bytes) received in-sequence
                96 out-of-order packets (145964 bytes)
        18143 segments updated rtt (of 17469 attempts)
        1093 correct ACK header predictions
        3449 correct data packet header predictions
        111 SACK options (SACK blocks) sent

So the 32k mount sends about 6x the packet volume.  (This is on iozone's
linear write test.)

One thing I've noticed is that when the 64k connection bogs down, it
seems to "poison" things for a while.  For example, iperf will start
doing this afterward:

From the client to the server:

$ iperf -c 172.20.20.162
------------------------------------------------------------
Client connecting to 172.20.20.162, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  3] local 172.20.20.169 port 14337 connected with 172.20.20.162 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.1 sec  4.88 MBytes  4.05 Mbits/sec

Ouch!  That's quite a drop from 13 Gbits/sec.  Weirdly, iperf to the
Debian node is not affected:

From the client to the Debian node:

$ iperf -c 172.20.20.166
------------------------------------------------------------
Client connecting to 172.20.20.166, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  3] local 172.20.20.169 port 24376 connected with 172.20.20.166 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  20.4 GBytes  17.5 Gbits/sec

From the Debian node to the server:

$ iperf -c 172.20.20.162
------------------------------------------------------------
Client connecting to 172.20.20.162, TCP port 5001
TCP window size: 23.5 KByte (default)
------------------------------------------------------------
[  3] local 172.20.20.166 port 43166 connected with 172.20.20.162 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  12.9 GBytes  11.1 Gbits/sec

But if I let it run for longer, it will apparently figure things out and
creep back up to normal speed, and stay there until NFS strikes again.
It's like the kernel is caching some sort of hint that connectivity to
that other host sucks, and it has to either expire or be slowly overcome.
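If that guess is right, the TCP hostcache is the obvious suspect; it keeps
per-destination hints (rtt, ssthresh, etc.) across connections.  Assuming
the stock hostcache sysctls on 9.2, something like this should show and
clear it:

$ sysctl net.inet.tcp.hostcache.list | grep 172.20.20.162  # cached hints for that peer
# sysctl net.inet.tcp.hostcache.purge=1   # flush at the next prune, then re-test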
Client:

$ iperf -c 172.20.20.162 -t 60
------------------------------------------------------------
Client connecting to 172.20.20.162, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  3] local 172.20.20.169 port 59367 connected with 172.20.20.162 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-60.0 sec  56.2 GBytes  8.04 Gbits/sec

Server:

$ netstat -I vtnet1 -ihw 1
            input        (vtnet1)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
         7     0     0        420          0     0          0     0
         7     0     0        420          0     0          0     0
         8     0     0        480          0     0          0     0
         8     0     0        480          0     0          0     0
         7     0     0        420          0     0          0     0
         6     0     0        360          0     0          0     0
         6     0     0        360          0     0          0     0
         6     0     0        360          0     0          0     0
         6     0     0        360          0     0          0     0
         6     0     0        360          0     0          0     0
         6     0     0        360          0     0          0     0
         6     0     0        360          0     0          0     0
         6     0     0        360          0     0          0     0
        11     0     0        12k          3     0        206     0  <--- starts here
        17     0     0       227k         10     0        660     0
        17     0     0       408k         10     0        660     0
        17     0     0       417k         10     0        660     0
        17     0     0       425k         10     0        660     0
        17     0     0       438k         10     0        660     0
        17     0     0       444k         10     0        660     0
        16     0     0       453k         10     0        660     0
            input        (vtnet1)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
        16     0     0       463k         10     0        660     0
        16     0     0       469k         10     0        660     0
        16     0     0       482k         10     0        660     0
        16     0     0       487k         10     0        660     0
        16     0     0       496k         10     0        660     0
        16     0     0       504k         10     0        660     0
        18     0     0       510k         10     0        660     0
        16     0     0       521k         10     0        660     0
        17     0     0       524k         10     0        660     0
        17     0     0       538k         10     0        660     0
        17     0     0       540k         10     0        660     0
        17     0     0       552k         10     0        660     0
        17     0     0       554k         10     0        660     0
        17     0     0       567k         10     0        660     0
        16     0     0       568k         10     0        660     0
        16     0     0       581k         10     0        660     0
        16     0     0       582k         10     0        660     0
        16     0     0       595k         10     0        660     0
        16     0     0       595k         10     0        660     0
        16     0     0       609k         10     0        660     0
        16     0     0       609k         10     0        660     0
            input        (vtnet1)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
        16     0     0       620k         10     0        660     0
        16     0     0       623k         10     0        660     0
        17     0     0       632k         10     0        660     0
        17     0     0       637k         10     0        660     0
      8.7k     0     0       389M       4.4k     0       288k     0
       42k     0     0       2.1G        21k     0       1.4M     0
       41k     0     0       2.1G        20k     0       1.4M     0
       38k     0     0       1.9G        19k     0       1.2M     0
       40k     0     0       2.0G        20k     0       1.3M     0
       40k     0     0       2.0G        20k     0       1.3M     0
       40k     0     0         2G        20k     0       1.3M     0
       39k     0     0         2G        20k     0       1.3M     0
       43k     0     0       2.2G        22k     0       1.4M     0
       42k     0     0       2.2G        21k     0       1.4M     0
       39k     0     0         2G        19k     0       1.3M     0
       38k     0     0       1.9G        19k     0       1.2M     0
       42k     0     0       2.1G        21k     0       1.4M     0
       44k     0     0       2.2G        22k     0       1.4M     0
       41k     0     0       2.1G        20k     0       1.3M     0
       41k     0     0       2.1G        21k     0       1.4M     0
       40k     0     0       2.0G        20k     0       1.3M     0
            input        (vtnet1)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
       43k     0     0       2.2G        22k     0       1.4M     0
       41k     0     0       2.1G        20k     0       1.3M     0
       40k     0     0       2.0G        20k     0       1.3M     0
       42k     0     0       2.2G        21k     0       1.4M     0
       39k     0     0         2G        19k     0       1.3M     0
       42k     0     0       2.1G        21k     0       1.4M     0
       40k     0     0       2.0G        20k     0       1.3M     0
       42k     0     0       2.1G        21k     0       1.4M     0
       38k     0     0         2G        19k     0       1.3M     0
       39k     0     0         2G        20k     0       1.3M     0
       45k     0     0       2.3G        23k     0       1.5M     0
         6     0     0        360          0     0          0     0
         6     0     0        360          0     0          0     0
         6     0     0        360          0     0          0     0
         6     0     0        360          0     0          0     0

It almost looks like something is limiting it to 10 packets per second.
So confusing!  TCP super slow start?
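Before blaming slow start itself, it may be worth confirming which
congestion control algorithm is actually in play (newreno should be the
9.2 default):

$ sysctl net.inet.tcp.cc.available   # algorithms available
$ sysctl net.inet.tcp.cc.algorithm   # the one in use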
Thanks!

(Sorry Rick, forgot to reply all, so you got an extra! :( )

Also, here's the netstat from the client side showing the 10 packets per
second limit and the eventual recovery:

$ netstat -I net1 -ihw 1
            input         (net1)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
         6     0     0        360          0     0          0     0
         6     0     0        360          0     0          0     0
         6     0     0        360          0     0          0     0
         6     0     0        360          0     0          0     0
        15     0     0        962         11     0       114k     0
        17     0     0       1.1k         10     0       368k     0
        17     0     0       1.1k         10     0       411k     0
        17     0     0       1.1k         10     0       425k     0
        17     0     0       1.1k         10     0       432k     0
        17     0     0       1.1k         10     0       439k     0
        17     0     0       1.1k         10     0       452k     0
        16     0     0         1k         10     0       457k     0
        16     0     0         1k         10     0       467k     0
        16     0     0         1k         10     0       477k     0
        16     0     0         1k         10     0       481k     0
        16     0     0         1k         10     0       495k     0
        16     0     0         1k         10     0       498k     0
        16     0     0         1k         10     0       510k     0
        16     0     0         1k         10     0       515k     0
        16     0     0         1k         10     0       524k     0
        17     0     0       1.1k         10     0       532k     0
            input         (net1)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
        17     0     0       1.1k         10     0       538k     0
        17     0     0       1.1k         10     0       548k     0
        17     0     0       1.1k         10     0       552k     0
        17     0     0       1.1k         10     0       562k     0
        17     0     0       1.1k         10     0       566k     0
        16     0     0         1k         10     0       576k     0
        16     0     0         1k         10     0       580k     0
        16     0     0         1k         10     0       590k     0
        17     0     0       1.1k         10     0       594k     0
        16     0     0         1k         10     0       603k     0
        16     0     0         1k         10     0       609k     0
        16     0     0         1k         10     0       614k     0
        16     0     0         1k         10     0       623k     0
        16     0     0         1k         10     0       626k     0
        17     0     0       1.1k         10     0       637k     0
        18     0     0       1.1k         10     0       637k     0
       17k     0     0       1.1M        34k     0       1.7G     0
       21k     0     0       1.4M        42k     0       2.1G     0
       20k     0     0       1.3M        39k     0         2G     0
       19k     0     0       1.2M        38k     0       1.9G     0
       20k     0     0       1.3M        41k     0       2.0G     0
            input         (net1)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
       20k     0     0       1.3M        40k     0       2.0G     0
       19k     0     0       1.2M        38k     0       1.9G     0
       22k     0     0       1.5M        45k     0       2.3G     0
       20k     0     0       1.3M        40k     0       2.1G     0
       20k     0     0       1.3M        40k     0       2.1G     0
       18k     0     0       1.2M        36k     0       1.9G     0
       21k     0     0       1.4M        41k     0       2.1G     0
       22k     0     0       1.4M        44k     0       2.2G     0
       21k     0     0       1.4M        43k     0       2.2G     0
       20k     0     0       1.3M        41k     0       2.1G     0
       20k     0     0       1.3M        40k     0       2.0G     0
       21k     0     0       1.4M        43k     0       2.2G     0
       21k     0     0       1.4M        43k     0       2.2G     0
       20k     0     0       1.3M        40k     0       2.0G     0
       21k     0     0       1.4M        43k     0       2.2G     0
       19k     0     0       1.2M        38k     0       1.9G     0
       21k     0     0       1.4M        42k     0       2.1G     0
       20k     0     0       1.3M        40k     0       2.0G     0
       21k     0     0       1.4M        42k     0       2.1G     0
       20k     0     0       1.3M        40k     0       2.0G     0
       20k     0     0       1.3M        40k     0       2.0G     0
            input         (net1)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
       24k     0     0       1.6M        48k     0       2.5G     0
      6.3k     0     0       417k        12k     0       647M     0
         6     0     0        360          0     0          0     0
         6     0     0        360          0     0          0     0
         6     0     0        360          0     0          0     0
         6     0     0        360          0     0          0     0
         6     0     0        360          0     0          0     0
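A back-of-envelope on the stalled phase, using the rows above (my
arithmetic; the interpretation is a guess): 660 bytes across 10 output
packets on the server side is 66 bytes per packet, i.e. bare ACKs
(14 Ethernet + 20 IP + 20 TCP + 12 bytes of timestamp options), and
~500k of write data per second at 64 KB per write RPC is roughly 8 RPCs
per second, one every ~125 ms.  That looks less like congestion-window
growth than like each RPC waiting out a timer before the next one goes.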