From owner-svn-src-head@freebsd.org Sun Apr 10 20:58:03 2016
Return-Path:
Delivered-To: svn-src-head@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
	by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2713FB0B70A
	for ; Sun, 10 Apr 2016 20:58:03 +0000 (UTC)
	(envelope-from Cheng.Cui@netapp.com)
Received: from mx141.netapp.com (mx141.netapp.com [216.240.21.12])
	(using TLSv1.2 with cipher RC4-SHA (128/128 bits))
	(Client CN "mx141.netapp.com", Issuer "Symantec Class 3 Secure Server CA - G4" (verified OK))
	by mx1.freebsd.org (Postfix) with ESMTPS id E8EF017C2
	for ; Sun, 10 Apr 2016 20:58:02 +0000 (UTC)
	(envelope-from Cheng.Cui@netapp.com)
X-IronPort-AV: E=Sophos;i="5.24,462,1455004800"; d="scan'208";a="111688461"
Received: from hioexcmbx06-prd.hq.netapp.com ([10.122.105.39])
	by mx141-out.netapp.com with ESMTP; 10 Apr 2016 13:53:01 -0700
Received: from HIOEXCMBX03-PRD.hq.netapp.com (10.122.105.36)
	by hioexcmbx06-prd.hq.netapp.com (10.122.105.39)
	with Microsoft SMTP Server (TLS) id 15.0.1156.6; Sun, 10 Apr 2016 13:53:00 -0700
Received: from HIOEXCMBX03-PRD.hq.netapp.com ([::1])
	by hioexcmbx03-prd.hq.netapp.com ([fe80::d960:dcd6:b477:1dbd%21])
	with mapi id 15.00.1156.000; Sun, 10 Apr 2016 13:53:00 -0700
From: "Cui, Cheng"
To: Hans Petter Selasky
CC: "svn-src-head@freebsd.org"
Subject: Re: question about trimning data "len" conditions in TSO in tcp_output.c
Thread-Topic: question about trimning data "len" conditions in TSO in tcp_output.c
Thread-Index: AQHRGNRH3ZqdnMKQ9kSAjiggGnHKap6QBSEAgAALXQCAAAv6AIAKslAAgOnKPACAAD2MgA==
Date: Sun, 10 Apr 2016 20:52:59 +0000
Message-ID:
References: <563D1892.3050406@selasky.org> <563D2C26.2070300@selasky.org>
In-Reply-To:
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: yes
X-MS-TNEF-Correlator:
x-ms-exchange-messagesentrepresentingtype: 1
x-ms-exchange-transport-fromentityheader: Hosted
x-originating-ip: [10.120.60.34]
Content-Type: multipart/mixed;
	boundary="_002_D33034DFFE6DChengCuinetappcom_"
MIME-Version: 1.0
X-BeenThere: svn-src-head@freebsd.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: SVN commit messages for the src tree for head/-current
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 10 Apr 2016 20:58:03 -0000

--_002_D33034DFFE6DChengCuinetappcom_
Content-Type: text/plain; charset="iso-8859-1"
Content-ID:
Content-Transfer-Encoding: quoted-printable

Sorry, the patch I attached in my previous email was against the FreeBSD 10.2
release. The one attached here is against FreeBSD HEAD.

diff --git a/sys/netinet/tcp_output.c b/sys/netinet/tcp_output.c
index 2043fc9..43b0737 100644
--- a/sys/netinet/tcp_output.c
+++ b/sys/netinet/tcp_output.c
@@ -939,23 +939,15 @@ send:
 			 * emptied:
 			 */
 			max_len = (tp->t_maxseg - optlen);
-			if ((off + len) < sbavail(&so->so_snd)) {
+			if (len > (max_len << 1)) {
 				moff = len % max_len;
 				if (moff != 0) {
 					len -= moff;
 					sendalot = 1;
 				}
 			}
-
-			/*
-			 * In case there are too many small fragments
-			 * don't use TSO:
-			 */
-			if (len <= max_len) {
-				len = max_len;
-				sendalot = 1;
-				tso = 0;
-			}
+			KASSERT(len >= max_len,
+			    ("[%s:%d]: len < max_len", __func__, __LINE__));
 
 			/*
 			 * Send the FIN in a separate segment

Thanks,
--Cheng Cui
NetApp Scale Out Networking

On 4/10/16, 4:44 PM, "Cui, Cheng" wrote:

>Hi Hans,
>
>I would like to continue this discussion with a different change. The change
>is below, and I have also attached it as "change.patch" against the FreeBSD
>HEAD code line.
>
>diff --git a/sys/netinet/tcp_output.c b/sys/netinet/tcp_output.c
>index 2043fc9..fa124f1 100644
>--- a/sys/netinet/tcp_output.c
>+++ b/sys/netinet/tcp_output.c
>@@ -938,25 +938,16 @@ send:
> 			 * fractional unless the send sockbuf can be
> 			 * emptied:
> 			 */
>-			max_len = (tp->t_maxseg - optlen);
>-			if ((off + len) < sbavail(&so->so_snd)) {
>+			max_len = (tp->t_maxopd - optlen);
>+			if (len > (max_len << 1)) {
> 				moff = len % max_len;
> 				if (moff != 0) {
> 					len -= moff;
> 					sendalot = 1;
> 				}
> 			}
>-
>-			/*
>-			 * In case there are too many small fragments
>-			 * don't use TSO:
>-			 */
>-			if (len <= max_len) {
>-				len = max_len;
>-				sendalot = 1;
>-				tso = 0;
>-			}
>-
>+			KASSERT(len > max_len,
>+			    ("[%s:%d]: len <= max_len", __func__, __LINE__));
> 			/*
> 			 * Send the FIN in a separate segment
> 			 * after the bulk sending is done.
>
>I think this change saves the additional loop iterations that send single
>MSS-size packets. It should also save some CPU cycles, because it reduces
>software sends and pushes more data to offloaded sends.
>
>Here is my test. The iperf command I chose pushes 100 MBytes of data to the
>wire, with the default TCP sendspace set to 1MB and recvspace set to 2MB. I
>tested TCP connection performance on a pair of 10Gbps FreeBSD 10.2 nodes
>(s1 and r1) with a switch in between.
>Both nodes have TSO and delayed ACK enabled.
>
>root@s1:~ # ping -c 3 r1
>PING r1-link1 (10.1.2.3): 56 data bytes
>64 bytes from 10.1.2.3: icmp_seq=0 ttl=64 time=0.045 ms
>64 bytes from 10.1.2.3: icmp_seq=1 ttl=64 time=0.037 ms
>64 bytes from 10.1.2.3: icmp_seq=2 ttl=64 time=0.038 ms
>
>--- r1-link1 ping statistics ---
>3 packets transmitted, 3 packets received, 0.0% packet loss
>round-trip min/avg/max/stddev = 0.037/0.040/0.045/0.004 ms
>
>1M snd buffer/2M rcv buffer
>sysctl -w net.inet.tcp.hostcache.expire=1
>sysctl -w net.inet.tcp.sendspace=1048576
>sysctl -w net.inet.tcp.recvspace=2097152
>
>iperf -s <== iperf command@receiver
>iperf -c r1 -m -n 100M <== iperf command@sender
>
>root@s1:~ # iperf -c r1 -m -n 100M
>------------------------------------------------------------
>Client connecting to r1, TCP port 5001
>TCP window size: 1.00 MByte (default)
>------------------------------------------------------------
>[ 3] local 10.1.2.2 port 22491 connected with 10.1.2.3 port 5001
>[ ID] Interval       Transfer     Bandwidth
>[ 3]  0.0- 0.3 sec   100 MBytes   2.69 Gbits/sec
>[ 3] MSS size 1448 bytes (MTU 1500 bytes, ethernet)
>
>root@r1:~ # iperf -s
>------------------------------------------------------------
>Server listening on TCP port 5001
>TCP window size: 2.00 MByte (default)
>------------------------------------------------------------
>[ 4] local 10.1.2.3 port 5001 connected with 10.1.2.2 port 22491
>[ ID] Interval       Transfer     Bandwidth
>[ 4]  0.0- 0.3 sec   100 MBytes   2.62 Gbits/sec
>
>Each test sent 100 MBytes of data, and I collected packet traces from both
>nodes with tcpdump. I ran the test twice to confirm that the result is
>reproducible.
>
>In the trace files from both nodes before my code change, I see a lot of
>single-MSS-size packets. See the attached trace files in "before_change.zip".
>For example, in a sender trace file I see 43480 single-MSS-size packets
>(tcp.len==1448) out of 57005 packets that contain data (tcp.len > 0).
>That's 76.2%.
>
>I then ran the same iperf test with the change and gathered trace files.
>This time I did not find many single-MSS packets. See the attached trace
>files in "after_change.zip". For example, in a sender trace file I see zero
>single-MSS-size packets (tcp.len==1448) out of 35729 data packets
>(tcp.len > 0).
>
>Comparing the receiver traces, I did not see significantly more fractional
>packets received after the change.
>
>I also ran tests using netperf, although not every snd/rcv buffer size test
>reached the 95% confidence level. Attached are my netperf results for
>different snd/rcv buffer sizes before and after the change
>(netperf_before_change.txt and netperf_after_change.txt), which also look
>good.
>
>netperf command used:
>netperf -H s1 -t TCP_STREAM -C -c -l 400 -i 10,3 -I 95,10 -- -s ${LocalSndBuf} -S ${RemoteSndBuf}
>
>
>Thanks,
>--Cheng Cui
>NetApp Scale Out Networking
>

--_002_D33034DFFE6DChengCuinetappcom_
Content-Type: application/octet-stream; name="change.patch"
Content-Description: change.patch
Content-Disposition: attachment; filename="change.patch"; size=737;
	creation-date="Sun, 10 Apr 2016 20:52:59 GMT";
	modification-date="Sun, 10 Apr 2016 20:52:59 GMT"
Content-ID: <4F635E5844F28D47AF2B83FE20BA1844@hq.netapp.com>
Content-Transfer-Encoding: base64

ZGlmZiAtLWdpdCBhL3N5cy9uZXRpbmV0L3RjcF9vdXRwdXQuYyBiL3N5cy9uZXRpbmV0L3RjcF9v
dXRwdXQuYwppbmRleCAyMDQzZmM5Li40M2IwNzM3IDEwMDY0NAotLS0gYS9zeXMvbmV0aW5ldC90
Y3Bfb3V0cHV0LmMKKysrIGIvc3lzL25ldGluZXQvdGNwX291dHB1dC5jCkBAIC05MzksMjMgKzkz
OSwxNSBAQCBzZW5kOgogCQkJICogZW1wdGllZDoKIAkJCSAqLwogCQkJbWF4X2xlbiA9ICh0cC0+
dF9tYXhzZWcgLSBvcHRsZW4pOwotCQkJaWYgKChvZmYgKyBsZW4pIDwgc2JhdmFpbCgmc28tPnNv
X3NuZCkpIHsKKwkJCWlmIChsZW4gPiAobWF4X2xlbiA8PCAxKSkgewogCQkJCW1vZmYgPSBsZW4g
JSBtYXhfbGVuOwogCQkJCWlmIChtb2ZmICE9IDApIHsKIAkJCQkJbGVuIC09IG1vZmY7CiAJCQkJ
CXNlbmRhbG90ID0gMTsKIAkJCQl9CiAJCQl9Ci0KLQkJCS8qCi0JCQkgKiBJbiBjYXNlIHRoZXJl
IGFyZSB0b28gbWFueSBzbWFsbCBmcmFnbWVudHMKLQkJCSAqIGRvbid0IHVzZSBUU086Ci0JCQkg
Ki8KLQkJCWlmIChsZW4gPD0gbWF4X2xlbikgewotCQkJCWxlbiA9IG1heF9sZW47Ci0JCQkJc2Vu
ZGFsb3QgPSAxOwotCQkJCXRzbyA9IDA7Ci0JCQl9CisJCQlLQVNTRVJUKGxlbiA+PSBtYXhfbGVu
LAorCQkJICAgICgiWyVzOiVkXTogbGVuIDwgbWF4X2xlbiIsIF9fZnVuY19fLCBfX0xJTkVfXykp
OwogCQkJLyoKIAkJCSAqIFNlbmQgdGhlIEZJTiBpbiBhIHNlcGFyYXRlIHNlZ21lbnQK

--_002_D33034DFFE6DChengCuinetappcom_--
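
The difference between the logic the HEAD patch removes and the logic it
adds can be sketched as a small user-space model. This is an illustration
only, not kernel code: struct tso_trim and the functions trim_before() and
trim_after() are hypothetical names, max_len stands in for
(tp->t_maxseg - optlen), sb_avail stands in for sbavail(&so->so_snd), and
the KASSERT from the patch is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type for this model; not a kernel structure. */
struct tso_trim {
	int  len;      /* bytes handed to this send */
	bool sendalot; /* more data remains, so the send loop runs again */
	bool tso;      /* whether TSO is still used for this send */
};

/* Models the lines the patch removes. */
static struct tso_trim
trim_before(int len, int off, int sb_avail, int max_len)
{
	struct tso_trim r = { len, false, true };

	assert(max_len > 0);
	/* Trim to a multiple of max_len unless the sockbuf can be emptied. */
	if ((off + r.len) < sb_avail) {
		int moff = r.len % max_len;
		if (moff != 0) {
			r.len -= moff;
			r.sendalot = true;
		}
	}
	/* A remainder of one segment or less is sent without TSO. */
	if (r.len <= max_len) {
		r.len = max_len;
		r.sendalot = true;
		r.tso = false;
	}
	return r;
}

/* Models the lines the patch adds (KASSERT omitted). */
static struct tso_trim
trim_after(int len, int max_len)
{
	struct tso_trim r = { len, false, true };

	assert(max_len > 0);
	/* Trim only when more than two full segments are queued. */
	if (r.len > (max_len << 1)) {
		int moff = r.len % max_len;
		if (moff != 0) {
			r.len -= moff;
			r.sendalot = true;
		}
	}
	return r;
}
```

In this model, with max_len = 1448 and plenty of data left in the socket
buffer, trim_before(2000, 0, 100000, 1448) yields a 1448-byte send with TSO
turned off, while trim_after(2000, 1448) keeps all 2000 bytes in one TSO
send. A larger request such as trim_after(4000, 1448) is still trimmed to
2896 bytes with sendalot set.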