From owner-freebsd-net@FreeBSD.ORG Sun Feb 10 05:05:47 2013
Date: Sat, 9 Feb 2013 21:05:41 -0800
Subject: Re: [PATCH] Add a new TCP_IGNOREIDLE socket option
From: Kevin Oberman
To: Alfred Perlstein
Cc: Randall Stewart, John Baldwin, net@freebsd.org
In-Reply-To: <51166019.9040104@mu.org>
References: <201301221511.02496.jhb@freebsd.org> <50FF06AD.402@networx.ch> <061B4EA5-6A93-48A0-A269-C2C3A3C7E77C@lakerest.net> <201302060746.43736.jhb@freebsd.org> <511292C9.4040307@mu.org> <51166019.9040104@mu.org>
Content-Type: text/plain; charset=UTF-8
List-Id: Networking and TCP/IP with FreeBSD

On Sat, Feb 9, 2013 at 6:41 AM, Alfred Perlstein wrote:
> On 2/7/13 12:04 PM, George Neville-Neil wrote:
>>
>> On Feb 6, 2013, at 12:28 , Alfred Perlstein wrote:
>>
>>> On 2/6/13 4:46 AM, John Baldwin wrote:
>>>>
>>>> On Wednesday, February 06, 2013 6:27:04 am Randall Stewart wrote:
>>>>>
>>>>> John:
>>>>>
>>>>> A burst at line rate will *often* cause drops. This is because
>>>>> router queues are at a finite size. Also such a burst (especially
>>>>> on a long delay bandwidth network) cause your RTT to increase even
>>>>> if there is no drop which is going to hurt you as well.
>>>>>
>>>>> A SHOULD in an RFC says you really really really really need to do it
>>>>> unless there is some thing that makes you willing to override it. It is
>>>>> slight wiggle room.
>>>>>
>>>>> In this I agree with Andre, we should not be *not* doing it. Otherwise
>>>>> folks will be turning this on and it is plain wrong. It may be fine
>>>>> for your network but I would not want to see it in FreeBSD.
>>>>>
>>>>> In my testing here at home I have put back into our stack max-burst.
>>>>> This uses Mark Allman's version (not Kacheong Poon's) where you clamp
>>>>> the cwnd at no more than 4 packets larger than your flight. All of my
>>>>> testing high-bw-delay or lan has shown this to improve TCP performance.
>>>>> This is because it helps you avoid bursting out so many packets that
>>>>> you overflow a queue.
>>>>>
>>>>> In your long-delay bw link if you do burst out too many (and you never
>>>>> know how many that is since you can not predict how full all those
>>>>> MPLS queues are or how big they are) you will really hurt yourself even
>>>>> worse.
>>>>> Note that generally in Cisco routers the default queue size is
>>>>> somewhere between 100-300 packets depending on the router.
>>>>
>>>> Due to the way our application works this never happens, but I am fine
>>>> with just keeping this patch private. If there are other shops that
>>>> need this they can always dig the patch up from the archives.
>>>>
>>> This is yet another time when I'm sad about how things happen in FreeBSD.
>>>
>>> A developer come forward with a non-default option that's very useful for
>>> some specific workloads, specifically one that contributes much time and $$$
>>> to the project and the community rejects the patches even though it's been
>>> successful in other OSes.
>>>
>>> It makes zero sense.
>>>
>>> John, can you repost the patch? Maybe there is a way to refactor this
>>> somehow so it's like accept filters where we can plug in a hook for TCP?
>>>
>>> I am very disappointed, but not surprised.
>>>
>> I take away the complete opposite feeling. This is how we work through
>> these issues.
>> It's clear from the discussion that this need not be a default in the
>> system, and is a special case. We had a reasoned discussion of what would
>> be best to do and at least two experts in TCP weighed in on the effect
>> this change might have.
>>
>> Not everything proposed by a developer need go into the tree, in
>> particular since these discussions are archived we can always revisit
>> this later.
>>
>> This is exactly how collaborative development should look, whether or not
>> the patch is integrated now, next week, next year, or ever.
>
>
> I agree that discussion is great, we have all learned quite a bit from it,
> about TCP and the dangers of adjusting buffering without considerable
> thought. I would not be involved in FreeBSD had this type of discussion and
> information not be discussed on the lists so readily.
>
> However, the end result must be far different than what has occurred so far.
>
> If the code was deemed unacceptable for general inclusion, then we must find
> a way to provide a light framework to accomplish the needs of the community
> member.
>
> Take for instance someone who is starting a company that needs this
> facility. Which OS will they choose? One who has integrated a useful
> feature? Or one who has rejected it and left that code in the mailing list
> archives?
>
> As much as expert opinion is valuable, it must include understanding and
> need of handling special cases and the ability to facilitate those special
> cases for our users and developers.

This is a subject rather near to my heart, having fought battles with
congestion back in the dark days of Windows, when it essentially
defaulted to TCP_IGNOREIDLE. It was a huge pain, but it was the only way
Windows did TCP in the early days. It simply did not implement
slow-start. This was really evil, but in the days when lots of links
were 56K and T-1 was mostly used for network core links, the Internet,
small as it was back then, did not melt, though it glowed a frightening
shade of red fairly often. Today, too many systems running like this
would melt things very quickly.

OTOH, I can certainly see cases, like John's, where it would be very
beneficial. And, yes, Linux has it. (I don't see this as relevant in any
way except as proof that not enough people have turned it on to cause
serious problems... yet!)

It seems a shame to make everyone who really has a need develop their
own patches or dig through old mail to find John's. What I would like to
see is a way to have it available, but unlikely to be enabled except in
a way that puts up flashing red warnings and sounds sirens to tell
people that it is very dangerous and can be a way to blow off a few of
one's own toes.

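For anyone following along who has not looked at the rule being argued
about, here is a minimal sketch in Python of the decision an option like
TCP_IGNOREIDLE would bypass. It is a toy model, not the FreeBSD code or
John's patch: the function name, its parameters and the numbers are all
made up for illustration. The only thing it assumes is the RFC 5681
recommendation (a SHOULD) that a sender which has been idle for longer
than one RTO restarts from a small restart window instead of its old cwnd.

```python
def cwnd_after_idle(cwnd, idle_s, rto_s, restart_window, ignore_idle=False):
    """Return the congestion window (in segments) to use when sending resumes.

    Hypothetical model: all names here are illustrative, not kernel symbols.
    """
    if ignore_idle:
        # Option enabled: keep the old window, so the next write may leave
        # as one line-rate burst of up to cwnd segments.
        return cwnd
    if idle_s > rto_s:
        # Idle for longer than one RTO: fall back to the restart window
        # and let slow-start probe the path again.
        return min(cwnd, restart_window)
    # Not idle long enough for the rule to apply.
    return cwnd


# A connection that built up a 200-segment window and then sat idle for 5 s.
print(cwnd_after_idle(cwnd=200, idle_s=5.0, rto_s=1.0, restart_window=10))
print(cwnd_after_idle(cwnd=200, idle_s=5.0, rto_s=1.0, restart_window=10,
                      ignore_idle=True))
```

With these made-up numbers the first call falls back to 10 segments and
the second keeps all 200, which is in the same range as the 100-300
packet router queues Randall mentions above.
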
One idea that popped into my head (and it may be completely ridiculous)
is to make its availability dependent on a kernel option, with a warning
in NOTES that it contravenes normal and accepted practice and can cause
serious problems both for yourself and for others using the network.

I might also note that almost all higher performance (1G and faster)
networks already have a form of this: TSO. In case you hadn't noticed,
TSO will take a large buffer and transmit it as multiple segments, which
are sent back to back with NO delay or awareness of congestion. I can
confirm that even this limited case can and does sometimes result in
packet loss when router queues are inadequate to handle the load.
-- 
R. Kevin Oberman, Network Engineer
E-mail: kob6558@gmail.com