Date:      Thu, 14 Feb 2013 01:01:20 +1100
From:      Lawrence Stewart <lstewart@freebsd.org>
To:        Kevin Oberman <kob6558@gmail.com>
Cc:        Alfred Perlstein <bright@mu.org>, Randall Stewart <rrs@lakerest.net>, John Baldwin <jhb@freebsd.org>, net@freebsd.org
Subject:   Re: [PATCH] Add a new TCP_IGNOREIDLE socket option
Message-ID:  <511B9CB0.4090906@freebsd.org>
In-Reply-To: <CAN6yY1uX__JDEk9dLdJr3pdE1u848jaF_jTn+_mrP05bXqm_Pw@mail.gmail.com>
References:  <201301221511.02496.jhb@freebsd.org> <50FF06AD.402@networx.ch> <061B4EA5-6A93-48A0-A269-C2C3A3C7E77C@lakerest.net> <201302060746.43736.jhb@freebsd.org> <511292C9.4040307@mu.org> <E6BF2B74-175F-49D9-B480-8941294D2E19@neville-neil.com> <51166019.9040104@mu.org> <CAN6yY1uX__JDEk9dLdJr3pdE1u848jaF_jTn+_mrP05bXqm_Pw@mail.gmail.com>

On 02/10/13 16:05, Kevin Oberman wrote:
> On Sat, Feb 9, 2013 at 6:41 AM, Alfred Perlstein <bright@mu.org> wrote:
>> On 2/7/13 12:04 PM, George Neville-Neil wrote:
>>>
>>> On Feb 6, 2013, at 12:28 , Alfred Perlstein <bright@mu.org> wrote:
>>>
>>>> On 2/6/13 4:46 AM, John Baldwin wrote:
>>>>>
>>>>> On Wednesday, February 06, 2013 6:27:04 am Randall Stewart wrote:
>>>>>>
>>>>>> John:
>>>>>>
>>>>>> A burst at line rate will *often* cause drops. This is because
>>>>>> router queues are of finite size. Also such a burst (especially
>>>>>> on a long delay bandwidth network) causes your RTT to increase even
>>>>>> if there is no drop, which is going to hurt you as well.
>>>>>>
>>>>>> A SHOULD in an RFC says you really really really really need to do it
>>>>>> unless there is something that makes you willing to override it. It is
>>>>>> slight wiggle room.
>>>>>>
>>>>>> In this I agree with Andre, we should not be *not* doing it. Otherwise
>>>>>> folks will be turning this on and it is plain wrong. It may be fine
>>>>>> for your network but I would not want to see it in FreeBSD.
>>>>>>
>>>>>> In my testing here at home I have put max-burst back into our stack.
>>>>>> This uses Mark Allman's version (not Kacheong Poon's) where you clamp
>>>>>> the cwnd at no more than 4 packets larger than your flight. All of my
>>>>>> testing, high-bw-delay or LAN, has shown this to improve TCP
>>>>>> performance. This is because it helps you avoid bursting out so many
>>>>>> packets that you overflow a queue.
>>>>>>
>>>>>> In your long-delay bw link if you do burst out too many (and you never
>>>>>> know how many that is since you can not predict how full all those
>>>>>> MPLS queues are or how big they are) you will really hurt yourself even
>>>>>> worse. Note that generally in Cisco routers the default queue size is
>>>>>> somewhere between 100 and 300 packets depending on the router.
>>>>>
>>>>> Due to the way our application works this never happens, but I am fine
>>>>> with just keeping this patch private.  If there are other shops that
>>>>> need this they can always dig the patch up from the archives.
>>>>>
>>>> This is yet another time when I'm sad about how things happen in FreeBSD.
>>>>
>>>> A developer comes forward with a non-default option that's very useful for
>>>> some specific workloads, specifically one that contributes much time and $$$
>>>> to the project, and the community rejects the patches even though it's been
>>>> successful in other OSes.
>>>>
>>>> It makes zero sense.
>>>>
>>>> John, can you repost the patch?  Maybe there is a way to refactor this
>>>> somehow so it's like accept filters where we can plug in a hook for TCP?
>>>>
>>>> I am very disappointed, but not surprised.
>>>>
>>> I take away the complete opposite feeling.  This is how we work through
>>> these issues.  It's clear from the discussion that this need not be a
>>> default in the system, and is a special case.  We had a reasoned
>>> discussion of what would be best to do and at least two experts in TCP
>>> weighed in on the effect this change might have.
>>>
>>> Not everything proposed by a developer need go into the tree, in
>>> particular since these discussions are archived we can always revisit
>>> this later.
>>>
>>> This is exactly how collaborative development should look, whether or not
>>> the patch is integrated now, next week, next year, or ever.
>>
>>
>> I agree that discussion is great, we have all learned quite a bit from it,
>> about TCP and the dangers of adjusting buffering without considerable
>> thought.  I would not be involved in FreeBSD had this type of discussion and
>> information not been shared on the lists so readily.
>>
>> However, the end result must be far different than what has occurred so far.
>>
>> If the code was deemed unacceptable for general inclusion, then we must find
>> a way to provide a light framework to accomplish the needs of the community
>> member.
>>
>> Take for instance someone who is starting a company that needs this
>> facility.  Which OS will they choose?  One that has integrated a useful
>> feature?  Or one that has rejected it and left that code in the mailing list
>> archives?
>>
>> As much as expert opinion is valuable, it must include an understanding of
>> the need to handle special cases and the ability to facilitate those special
>> cases for our users and developers.
> 
> This is a subject rather near to my heart, having fought battles with
> congestion back in the dark days of Windows when it essentially
> defaulted to TCP_IGNOREIDLE. It was a huge pain, but it was the only
> way Windows did TCP in the early days. It simply did not implement
> slow-start. This was really evil, but in the days when lots of links
> were 56K and T-1 was mostly used for network core links, the Internet,
> small as it was back then, did not melt, though it glowed a
> frightening shade of red fairly often. Today too many systems running
> like this would melt things very quickly.
> 
> OTOH, I can certainly see cases, like John's, where it would be very
> beneficial. And, yes, Linux has it. (I don't see this as relevant in
> any way except as proof that not enough people have turned it on to
> cause serious problems... yet!) It seems a shame to make everyone who
> really has a need develop their own patches or dig through old mail to
> find John's.
> 
> What I would like to see is a way to have it available, but make it
> unlikely to be enabled except in a way that would put up flashing red
> warnings and sound sirens to warn people that it is very dangerous and
> can be a way to blow off a few of one's own toes.
> 
> One idea that popped into my head (and may be completely ridiculous)
> is to make its availability dependent on a kernel option and have a
> warning in NOTES about it contravening normal and accepted practice
> and that it can cause serious problems both for yourself and for
> others using the network.

Agreed. A sysctl as suggested by Grenville might be sufficient though.
Requiring a full kernel recompile seems a bit draconian.
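
To make the trade-off concrete, here is a minimal Python sketch of the idle-restart rule under discussion, assuming the RFC 5681 behaviour: after more than one RTO of idle time, cwnd drops to the restart window min(IW, cwnd). The function name, the per-socket `ignore_idle` flag and the system-wide `allow_ignore_idle` switch are hypothetical stand-ins for the proposed socket option and a sysctl gate. None of them is taken from John's patch.

```python
def cwnd_after_idle(cwnd, idle_time, rto, initial_window,
                    ignore_idle=False, allow_ignore_idle=False):
    """Return the congestion window (bytes) to use when sending resumes.

    All names here are illustrative, not taken from any real TCP stack.
    """
    # Not idle for longer than one RTO: keep the current window.
    if idle_time <= rto:
        return cwnd
    # The socket asked to skip the restart AND the administrator allowed it.
    if ignore_idle and allow_ignore_idle:
        return cwnd
    # Default: fall back to the restart window, RW = min(IW, cwnd).
    return min(initial_window, cwnd)
```

In this sketch the per-socket request has no effect unless the system-wide switch is also on, so the standards-conforming restart remains the default.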

> I might also note that almost all higher performance (1G and faster)
> networks already have a form of this...TSO. In case you hadn't
> noticed, TSO will take a large buffer and transmit it as multiple
> segments which are transmitted back to back with NO delay or awareness
> of congestion. I can confirm that even this limited case can and does
> sometimes result in packet loss when router queues are inadequate to
> handle the load.

You nailed it - took the words right off my fingertips. Sure, a flow's
cwnd can exceed the TSO max chunk size by an order of magnitude, but the
fact remains that we live in a bursty world already. As much as I
dislike TSO in its current incarnation, it exists for good reason. We
need to provide useful tools and thorough documentation, and set sensible
defaults.
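
For a rough sense of the burst sizes involved, the following Python sketch compares a single TSO chunk with the max-burst clamp Randall describes earlier in the thread (cwnd limited to the data in flight plus 4 packets). The 65535-byte TSO chunk and the 1448-byte MSS are assumed values chosen for illustration, and both function names are hypothetical.

```python
def tso_burst_segments(tso_bytes, mss):
    """Number of back-to-back segments produced from one TSO chunk."""
    return -(-tso_bytes // mss)  # ceiling division

def clamp_cwnd_max_burst(cwnd, flight, mss, max_burst=4):
    """Limit cwnd (bytes) to the data in flight plus max_burst segments."""
    return min(cwnd, flight + max_burst * mss)

MSS = 1448          # assumed payload bytes per segment
TSO_CHUNK = 65535   # assumed maximum TSO chunk in bytes

# Segments sent back to back from one TSO chunk.
print(tso_burst_segments(TSO_CHUNK, MSS))
# Clamped cwnd, in segments, with 10 segments already in flight.
print(clamp_cwnd_max_burst(cwnd=100 * MSS, flight=10 * MSS, mss=MSS) // MSS)
```

With nothing in flight after an idle period, the clamp in this sketch releases at most four segments at once, whereas an unclamped cwnd can hand several full TSO chunks to the NIC back to back.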

Cheers,
Lawrence