Date:      Wed, 08 Apr 2015 10:18:33 -0700
From:      Lawrence Stewart <lstewart@freebsd.org>
To:        Marek Salwerowicz <marek_sal@wp.pl>,  "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Subject:   Re: RTT and TCP Window size doubts, bandwidth issues
Message-ID:  <552562E9.9060103@freebsd.org>
In-Reply-To: <552455EB.7080609@wp.pl>
References:  <552455EB.7080609@wp.pl>

On 04/07/15 15:10, Marek Salwerowicz wrote:
> Hi list,
> 
> I am trying to find the correct sysctl setup for the following machines
> (VMs under VMware Workstation 8) to test large TCP window sizes:
> 
> 
> There are 2 boxes, each of them with the following setup:
> - % uname -a
> FreeBSD freeA 10.1-RELEASE-p6 FreeBSD 10.1-RELEASE-p6 #0: Tue Feb 24
> 19:00:21 UTC 2015
> root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
> 
> - 4GB of RAM
> - 1 NIC
> 
> Without any modifications, iperf reports a bandwidth of ~1Gbit/s
> between the hosts:
> 
> server-side:
> # iperf -s
> ------------------------------------------------------------
> Server listening on TCP port 5001
> TCP window size: 64.0 KByte (default)
> ------------------------------------------------------------
> 
> client-side:
> 
> # iperf -c 192.168.108.140
> ------------------------------------------------------------
> Client connecting to 192.168.108.140, TCP port 5001
> TCP window size: 32.5 KByte (default)
> ------------------------------------------------------------
> [  3] local 192.168.108.141 port 35282 connected with 192.168.108.140
> port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  3]  0.0-10.0 sec  1.46 GBytes  1.25 Gbits/sec
> 
> 
> 
> I want to simulate (using dummynet) a link between the hosts with a
> bandwidth of 400Mbit/s and ~20ms of latency.
> In order to do that, I create the ipfw pipe on one box:
> IPFW="ipfw -q"
> 
> 
> $IPFW pipe 1 config bw 400Mbit/s delay 10ms
> 
> $IPFW add 1500 pipe 1 ip from any to any

You should set up 2 pipes, one for each direction of traffic, and you
might also want to explicitly set the queue size in bytes in addition to
the bw and delay parameters.
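
Something along these lines should do it (untested; the 1000KBytes queue
is just a ballpark of roughly one BDP, and the exact queue size syntax
and limits are in ipfw(8)):

$IPFW pipe 1 config bw 400Mbit/s delay 10ms queue 1000KBytes
$IPFW pipe 2 config bw 400Mbit/s delay 10ms queue 1000KBytes
$IPFW add 1500 pipe 1 ip from any to any out
$IPFW add 1510 pipe 2 ip from any to any in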

> After running ipfw, the bandwidth with the default kernel sysctls
> becomes much lower:
> 
> client-side:
> ------------------------------------------------------------
> Client connecting to 192.168.108.140, TCP port 5001
> TCP window size: 32.5 KByte (default)
> ------------------------------------------------------------
> [  3] local 192.168.108.141 port 35340 connected with 192.168.108.140
> port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  3]  0.0-10.1 sec  12.5 MBytes  10.4 Mbits/sec
> 
> 
> 
> I'd like to achieve a bandwidth of ~400Mbit/s.

Better calculate the BDP then: 400E6 bit/s x 20E-3 s = 8E6 bits, i.e.
~1E6 bytes (~977KB), so you'll want to make sure your socket buffers and
window can accommodate ~1MB of data.

> I've modified the following sysctls (both on the client and server side):
> 
> kern.ipc.maxsockbuf=33554432  # (default 2097152)
> 
> net.inet.tcp.sendbuf_max=33554432  # (default 2097152)
> net.inet.tcp.recvbuf_max=33554432  # (default 2097152)
> 
> net.inet.tcp.cc.algorithm=htcp  # (default newreno) #enabled in
> /boot/loader.conf also
> net.inet.tcp.cc.htcp.adaptive_backoff=1 # (default 0 ; disabled)
> 
> net.inet.tcp.cc.htcp.rtt_scaling=1 # (default 0 ; disabled)

You shouldn't need to tune any of the above, and NewReno should be
capable of filling a 1MB pipe.

> net.inet.tcp.mssdflt=1460  # (default 536)
> 
> net.inet.tcp.minmss=1300   # (default 216)

You should not change either of these.

> net.inet.tcp.rfc1323=1  # (default 1)
> net.inet.tcp.rfc3390=1  # (default 1)

3390 has no effect if you have net.inet.tcp.experimental.initcwnd10=1,
which is the default on 10.X (unfortunately, IMO).
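
If you want the RFC 3390 initial window back for your tests, flipping it
off should do it, e.g.:

sysctl net.inet.tcp.experimental.initcwnd10=0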

> net.inet.tcp.sendspace=8388608  # (default 32768)
> net.inet.tcp.recvspace=8388608  # (default 65536)

You shouldn't need to tune these if sockbuf autotuning is enabled;
autotuning will grow the buffers from the default up to sendbuf_max and
recvbuf_max, which default to 2MB (larger than the ~1MB BDP), so you
should be fine.
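
To confirm autotuning is actually on, check that both of these report 1
(stock oids on 10.X, as far as I remember):

sysctl net.inet.tcp.sendbuf_auto net.inet.tcp.recvbuf_auto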

> net.inet.tcp.sendbuf_inc=32768  # (default 8192 )
> net.inet.tcp.recvbuf_inc=65536  # (default 16384)

Tuning these is OK, though also probably unnecessary.

> But the results are not as good as I expected:
> 
> server-side:
> # iperf -s
> ------------------------------------------------------------
> Server listening on TCP port 5001
> TCP window size: 8.00 MByte (default)
> ------------------------------------------------------------
> 
> client-side:
> # iperf -c 192.168.108.140
> ------------------------------------------------------------
> Client connecting to 192.168.108.140, TCP port 5001
> TCP window size: 8.00 MByte (default)
> ------------------------------------------------------------
> [  3] local 192.168.108.141 port 21894 connected with 192.168.108.140
> port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  3]  0.0-10.1 sec  24.2 MBytes  20.2 Mbits/sec
> 
> 
> I was trying to follow the articles:
> - http://www.psc.edu/index.php/networking/641-tcp-tune
> - https://fasterdata.es.net/host-tuning/freebsd/
> 
> But I can't really figure out what should be tuned, and how, in order
> to achieve good results.
> 
> If anyone could find some time and give me some hints, I'd be pleased!

On the sender (iperf client):

kldload siftr
sysctl net.inet.siftr.enabled=1
<run iperf test>
sysctl net.inet.siftr.enabled=0

You can then post-process /var/log/siftr.log to pull out just the
iperf-related lines:

grep "192.168.108.140,5001" /var/log/siftr.log > iperf.siftr

Then you can look at socket buffer occupancies, cwnd vs snd_wnd over
time, loss recovery, etc. to see why things are not behaving.
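
For example, a quick way to pull cwnd over time out of the sender's log
(the column numbers are from memory, so check the field list in siftr(4)
before trusting them):

awk -F, '$1 == "o" { print $2, $8 }' iperf.siftr > cwnd_vs_time.txt

which you can then eyeball or feed to gnuplot.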

Cheers,
Lawrence




