From owner-freebsd-net@FreeBSD.ORG Mon Dec 1 21:53:41 2008
Date: Mon, 1 Dec 2008 13:53:40 -0800 (PST)
From: Venkat Venkatsubra
To: Andre Oppermann
Cc: David Malone, Rui Paulo, freebsd-net@freebsd.org, Kevin Oberman
Subject: Re: FreeBSD Window updates
Message-ID: <624505.9242.qm@web58307.mail.re3.yahoo.com>
List-Id: Networking and TCP/IP with FreeBSD

Hi Andre,

>To answer your question: I do think we are fine with waiting for the
>delayed ACK.  If an application starts to seriously lag behind like
>in your example the feedback mechanism should work and cause the sender
>to slow down too.  The feedback loop in TCP is not only the network but

The case I was thinking of was not apps lagging behind on the read.
The incoming packets got ahead and got dumped on the socket receive buffer
before the app's blocked read could get scheduled by the OS and start reading
the data. I am assuming the read needs the socket lock and has to contend for this lock
with the incoming packets. The stack may not have any control over when the read eventually
gets to run. Suppose it runs after the 13th incoming packet got copied to the socket buffer
and the window size is 13 MSS?

> > What's the purpose of the 2 MSS check by the way?

>This is part of the Silly Window Syndrome prevention.  A good description is here:

Even without the 2 MSS check you are going to prevent SWS, isn't that right?
The other checks will make sure small window updates are not sent.

Venkat



________________________________
From: Andre Oppermann
To: Venkat Venkatsubra
Cc: David Malone; Rui Paulo; Kevin Oberman; freebsd-net@freebsd.org
Sent: Monday, December 1, 2008 3:11:43 PM
Subject: Re: FreeBSD Window updates

Venkat Venkatsubra wrote:
> Hi Andre,
>
> When delayed Ack is set the window update is not sent.
> Does this mean when an odd number of packets are received and later read,
> a window update won't go out either till the next segment arrives or
> the 200 msecs delayed ack timer fires?
> Can this reduced window block the sender from
> sending the next segment that we are waiting for to open up the window?

Yes.  The very idea of delayed ACK is to reduce the network utilization
by ACKing only every other segment.  Window updates should not override
this as they currently do.  Nagle comes into play as well where we wait
for the application to write something within the delayed ACK timeout to
piggyback the answer together with the ACK (and window update).

To answer your question: I do think we are fine with waiting for the
delayed ACK.  If an application starts to seriously lag behind like
in your example the feedback mechanism should work and cause the sender
to slow down too.  The feedback loop in TCP is not only the network but
also the sending and receiving application.  In a normal bulk transfer
where the receiving application services the receive buffer at regular
intervals we update the window with every ACK.

I'm open to other ideas if they fix the problem David is seeing without
having more serious shortcomings.

> What's the purpose of the 2 MSS check by the way?

This is part of the Silly Window Syndrome prevention.  A good description is here:
http://www.tcpipguide.com/free/t_TCPSillyWindowSyndromeandChangesTotheSlidingWindow.htm

PS: Attached is an updated version of the patch.  The flag TF_DELACK
can't be used to test for the presence of a delayed ACK.  The presence
of the delack timer has to be tested.

-- Andre

> Venkat
>
>
>
> ________________________________
> From: Andre Oppermann
> To: David Malone
> Cc: Rui Paulo; freebsd-net@freebsd.org; Venkat Venkatsubra; Kevin Oberman
> Sent: Sunday, November 30, 2008 5:18:22 PM
> Subject: Re: FreeBSD Window updates
>
> Andre Oppermann wrote:
>> David Malone wrote:
>>> I've got an example tcpdump extract of this at the end of the mail
>>> - here 6 ACKs are sent, 5 of which are pure window updates and
>>> several are 2us apart!
>>>
>>> I think the easy option is to delete the code that generates explicit
>>> window updates if the window moves by 2*MSS. We then should be doing
>>> something similar to Linux. The other easy alternative would be to
>>> add a sysctl that lets us generate a window update every N*MSS and
>>> by default set it to something big, like 10 or 100.
>>> That should
>>> effectively eliminate the updates during bulk data transfer, but
>>> may still generate some window updates after a loss.
>> The main problem with the pure window update test in tcp_output() is
>> its complete ignorance of delayed ACKs.  Second is the strict 4.4BSD
>> adherence to sending an update for every window increase of >= 2*MSS.
>> The third issue, sending a slew of window updates after having
>> received a FIN (telling us the other end won't ever send more data),
>> I have already fixed some moons ago.
>>
>> In my new-tcp work I've come across the window update logic some time
>> ago and backchecked with relevant RFCs and other implementations.
>> Attached is a compiling but otherwise untested backport of the new logic.
>
> Slightly improved version attached.
>
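For readers without the attached patch, here is a minimal sketch of the two decisions discussed in this thread: the 4.4BSD-style pure window update test (advance of at least 2*MSS, or at least half the receive buffer) and a variant that defers to a pending delayed ACK. This is not the patch from the thread. The struct, its field names, and both function names are hypothetical and only loosely modelled on the BSD tcpcb; the "defer while the delack timer runs" rule is one reading of the proposal, stated as an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical receiver state; names are illustrative, not FreeBSD's. */
struct rcv_state {
    long mss;            /* maximum segment size, in bytes */
    long buf_hiwat;      /* receive buffer high-water mark, in bytes */
    long buf_free;       /* free space currently in the receive buffer */
    long advertised;     /* window still open from the last advertisement
                            (rcv_adv - rcv_nxt in BSD terms) */
    bool delack_pending; /* assumption: true while the delack timer runs */
};

/* How far the right window edge would move if we advertised now. */
static long window_advance(const struct rcv_state *s)
{
    assert(s->mss > 0);
    return s->buf_free - s->advertised;
}

/* 4.4BSD-style test: send a pure window update when the window can
 * grow by at least 2*MSS or by at least half the receive buffer. */
static bool classic_update(const struct rcv_state *s)
{
    long adv = window_advance(s);
    return adv >= 2 * s->mss || 2 * adv >= s->buf_hiwat;
}

/* Assumed delayed-ACK-aware variant: while a delayed ACK is pending,
 * let that ACK carry the new window instead of sending a pure update. */
static bool delack_aware_update(const struct rcv_state *s)
{
    if (s->delack_pending)
        return false;
    return classic_update(s);
}
```

Applied to the 13 MSS case raised above (13 segments buffered, 12 already ACKed, then the application drains the buffer), classic_update() fires immediately, while delack_aware_update() waits for the delayed ACK timer or the next segment. That is exactly the stall the question asks about.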