Date: Thu, 26 Feb 2009 14:53:09 -0800
From: Chuck Swiger <cswiger@mac.com>
To: ross.cameron@linuxpro.co.za
Cc: "freebsd-questions@freebsd.org" <freebsd-questions@freebsd.org>
Subject: Re: TCP congestion avoidance
Message-ID: <A2E512AE-D4DF-428B-A743-96B523238EE8@mac.com>
In-Reply-To: <35f70db10902261341g18d1840du3eb2548418f39974@mail.gmail.com>
References: <35f70db10902260013v25e3f1bfs8f5929d2c62805@mail.gmail.com>
 <35f70db10902261302y41d6e9e0x9f420dd6f589735b@mail.gmail.com>
 <DBF0A23A-034B-44C3-993E-FB50A196704E@mac.com>
 <35f70db10902261341g18d1840du3eb2548418f39974@mail.gmail.com>
On Feb 26, 2009, at 1:41 PM, Ross Cameron wrote:
> Where can I find more documentation on these types of settings in
> FreeBSD?

The FreeBSD Handbook and Google will help for the general case, but for
specific details, reading the source is recommended.

> and how can I choose between more than just TCP_NewReno? Specifically
> I will be making use of TCP_Westwood / TCP_Westwood+ and TCP_Illinois.

If you have a BSD-licensed implementation of TCP Westwood or the others
handy, feel free to contribute your patches in a PR.

At least some of the notions behind the congestion algorithms you've
mentioned are present in the FreeBSD stack in the form of the
net.inet.tcp.inflight tunables; see netinet/tcp_subr.c:

/*
 * TCP BANDWIDTH DELAY PRODUCT WINDOW LIMITING
 *
 * This code attempts to calculate the bandwidth-delay product as a
 * means of determining the optimal window size to maximize bandwidth,
 * minimize RTT, and avoid the over-allocation of buffers on interfaces
 * and routers.  This code also does a fairly good job keeping RTTs in
 * check across slow links like modems.  We implement an algorithm which
 * is very similar to (but not meant to be) TCP/Vegas.  The code operates
 * on the transmitter side of a TCP connection and so only affects the
 * transmit side of the connection.
 *
 * BACKGROUND:  TCP makes no provision for the management of buffer space
 * at the end points or at the intermediate routers and switches.  A TCP
 * stream, whether using NewReno or not, will eventually buffer as
 * many packets as it is able, and the only reason this typically works
 * is due to the fairly small default buffers made available for a
 * connection (typically 16K or 32K).  As machines use larger windows
 * and/or window scaling it is now fairly easy for even a single TCP
 * connection to blow out all available buffer space, not only on the
 * local interface, but on intermediate routers and switches as well.
 * NewReno makes a misguided attempt to 'solve' this problem by waiting
 * for an actual failure to occur, then backing off, then steadily
 * increasing the window again until another failure occurs, ad
 * infinitum.  This results in terrible oscillation that is only made
 * worse as network loads increase, and the idea of intentionally
 * blowing out network buffers is, frankly, a terrible way to manage
 * network resources.
 *
 * It is far better to limit the transmit window prior to the failure
 * condition being reached.  There are two general ways to do this.
 * First, you can 'scan' through different transmit window sizes and
 * locate the point where the RTT stops increasing, indicating that you
 * have filled the pipe, then scan backwards until you note that RTT
 * stops decreasing, then repeat ad infinitum.  This method works in
 * principle but has severe implementation issues due to RTT variances,
 * timer granularity, and instability in the algorithm, which can lead
 * to many false positives and create oscillations as well as interact
 * badly with other TCP streams implementing the same algorithm.
 *
 * The second method is to limit the window to the bandwidth-delay
 * product of the link.  This is the method we implement.  RTT variances
 * and our own manipulation of the congestion window, bwnd, can
 * potentially destabilize the algorithm.  For this reason we have to
 * stabilize the elements used to calculate the window.  We do this by
 * using the minimum observed RTT, the long-term average of the observed
 * bandwidth, and by adding two segments' worth of slop.  It isn't
 * perfect, but it is able to react to changing conditions and gives us
 * a very stable basis on which to extend the algorithm.
 */
void tcp_xmit_bandwidth_limit()

--
-Chuck