Date: Sat, 9 Aug 2014 20:32:12 -0700
From: John-Mark Gurney
To: Niu Zhixiong
Subject: Re: A problem on TCP in High RTT Environment.
Message-ID: <20140810033212.GL83475@funkthat.com>
Cc: Michael Tuexen, Bill Yuan, freebsd-net@freebsd.org

Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 10:50 +0800:
> I am sorry that I upload a WRONG SCTP capture. But, the throughput is same.
> SCTP is double than TCP, about 18Mbps.
>
> sctp_2.pcapng.gz

Ok, the owin graph is very interesting... We do have a full 2MB window on
the receiver side, but for some reason, we only ever have just under 6k
outstanding on the connection... So, it looks like we send for a short
period of time, and then stop sending...

Do you have LRO enabled?
I think it might be related to:
https://svnweb.freebsd.org/changeset/base/r256920

I'm seeing >100ms gaps where the sender doesn't send any data, and as soon
as more than one ack comes in, the next segment goes out... If we only
receive a single ack, then we wait for a timeout before sending the next
segment...

Can you try disabling LRO on the receiving host?
ifconfig <interface> -lro

And see if that helps... If it does, applying the patch or compiling a more
recent kernel from stable/10 that is after r257367 should fix it, as that is
when the change was merged...

> On Sun, Aug 10, 2014 at 10:42 AM, Niu Zhixiong wrote:
>
> > I am sure that wnd is about 2MB all the time.
> > This is my latest capture, plz see Google Drive.
> > In the latest test, TCP(0s-120s) is about 9Mbps and SCTP(0s-120s) is about
> > 18Mbps.
> > (The bandwidth(20Mbps) and delay(200ms) is set by dummynet)
> > The SCTP and TCP are tested in same environment.
> >
> > sctp.pcapng.gz
> > tcp.pcapng.gz
> >
> > Regards,
> > Niu Zhixiong
> > kaiaixi@gmail.com
> >
> > On Sun, Aug 10, 2014 at 10:23 AM, John-Mark Gurney wrote:
> >
> >> Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 10:12 +0800:
> >> > During the TCP4 transmission.
> >> > Proto Recv-Q Send-Q  Local Address      Foreign Address    (state)
> >> > tcp4       0 2097346 10.0.10.2.13504    10.0.10.3.9000     ESTABLISHED
> >>
> >> Ok, so you are getting a full 2MB in there, and w/ that, you should
> >> easily be saturating your pipe...
> >>
> >> The next thing would be to get a tcpdump, and take a look at the
> >> window size.. Wireshark has lots of neat tools to make this analysis
> >> easy... Another tool that is good is tcptrace.. It can output a
> >> variety of different graphs that will help you track down, and see
> >> what part of the system is the problem...
> >>
> >> You probably only need a few tens of seconds of the tcpdump...
> >>
> >> > On Sun, Aug 10, 2014 at 4:58 AM, Michael Tuexen <Michael.Tuexen@lurchi.franken.de> wrote:
> >> >
> >> > > On 09 Aug 2014, at 22:45, John-Mark Gurney wrote:
> >> > >
> >> > > > Michael Tuexen wrote this message on Sat, Aug 09, 2014 at 21:51 +0200:
> >> > > >>
> >> > > >> On 09 Aug 2014, at 20:42, John-Mark Gurney wrote:
> >> > > >>
> >> > > >>> Niu Zhixiong wrote this message on Fri, Aug 08, 2014 at 20:34 +0800:
> >> > > >>>> Dear all,
> >> > > >>>>
> >> > > >>>> Last month, I send problems related to FTP/TCP in a high RTT environment.
> >> > > >>>> After that, I setup a simulation environment(Dummynet) to test TCP and SCTP
> >> > > >>>> in high delay environment. After finishing the test, I can see TCP is
> >> > > >>>> always slower than SCTP. But, I think it is not possible. (Plz see the
> >> > > >>>> figure in the attachment). When the delay is 200ms(means RTT=400ms).
> >> > > >>>> Besides, the TCP is extremely slow.
> >> > > >>>>
> >> > > >>>> ALL BW=20Mbps, DELAY= 0 ~ 200MS, Packet LOSS = 0 (by dummynet)
> >> > > >>>>
> >> > > >>>> This is my parameters:
> >> > > >>>> FreeBSD vfreetest0 10.0-RELEASE FreeBSD 10.0-RELEASE #0: Thu Aug 7
> >> > > >>>> 11:04:15 HKT 2014
> >> > > >>>>
> >> > > >>>> sysctl net.inet.tcp
> >> > > >>>
> >> > > >>> [...]
> >> > > >>>
> >> > > >>>> net.inet.tcp.recvbuf_auto: 0
> >> > > >>>
> >> > > >>> [...]
> >> > > >>>
> >> > > >>>> net.inet.tcp.sendbuf_auto: 0
> >> > > >>>
> >> > > >>> Try enabling this... This should allow the buffer to grow large enough
> >> > > >>> to deal w/ the higher latency...
> >> > > >>>
> >> > > >>> Also, make sure your program isn't setting the recv buffer size as that
> >> > > >>> will disable the auto growing...
> >> > > >> I think the program sets the buffer to 2MB, which it also does for SCTP.
> >> > > >> So having both statically at the same size makes sense for the comparison.
> >> > > >> I remember that there was a bug in the combination of LRO and delayed ACK,
> >> > > >> which was fixed, but I don't remember it was fixed before 10.0...
> >> > > >
> >> > > > Sounds like disabling LRO and TSO would be a useful test to see if that
> >> > > > improves things... But hiren said that the fix made it, so...
> >> > > >
> >> > > >>> If you use netstat -a, you should be able to see the send-q on the
> >> > > >>> sender grow as necessary...
> >> > > >
> >> > > > Also, getting the send-q output while it's running would let us know
> >> > > > if the buffer is getting to 2MB or not...
> >> > > That is correct. Niu: Can you provide this?

--
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
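
As a rough sanity check on the numbers quoted in this thread (20 Mbps and 200 ms of delay each way from dummynet, so RTT = 400 ms, with 2 MB socket buffers), the bandwidth-delay product can be worked out directly. The sketch below is only illustrative arithmetic: the function and constant names are made up for this example, and it ignores header overhead, congestion control and delayed ACKs.

```python
def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product: bytes that must be in flight to keep the pipe full."""
    return bandwidth_bps * rtt_ms // (8 * 1000)


def window_limited_bps(window_bytes: int, rtt_ms: int) -> int:
    """Upper bound on throughput when at most window_bytes are sent per RTT."""
    return window_bytes * 8 * 1000 // rtt_ms


# Figures taken from the thread; the names are hypothetical.
LINK_BPS = 20_000_000            # dummynet bandwidth: 20 Mbps
RTT_MS = 400                     # 200 ms delay in each direction
SOCKBUF_BYTES = 2 * 1024 * 1024  # 2 MB buffer set by the test program

# Bytes in flight needed to fill the 20 Mbps pipe.
print("needed in flight:", bdp_bytes(LINK_BPS, RTT_MS))
# Throughput ceiling imposed by a 2 MB window alone.
print("2MB window ceiling (bps):", window_limited_bps(SOCKBUF_BYTES, RTT_MS))
# Average in-flight data implied by the observed ~9 Mbps TCP rate.
print("in flight at 9 Mbps:", bdp_bytes(9_000_000, RTT_MS))
```

With these figures the pipe needs about 1 MB in flight, so a 2 MB buffer by itself should not be the limit. That is consistent with looking at the sender's ack handling and LRO rather than at buffer sizes.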