Date: Thu, 26 Feb 2009 14:53:09 -0800
From: Chuck Swiger <cswiger@mac.com>
To: ross.cameron@linuxpro.co.za
Cc: "freebsd-questions@freebsd.org" <freebsd-questions@freebsd.org>
Subject: Re: TCP congestion avoidance
Message-ID: <A2E512AE-D4DF-428B-A743-96B523238EE8@mac.com>
In-Reply-To: <35f70db10902261341g18d1840du3eb2548418f39974@mail.gmail.com>
References: <35f70db10902260013v25e3f1bfs8f5929d2c62805@mail.gmail.com>
 <35f70db10902261302y41d6e9e0x9f420dd6f589735b@mail.gmail.com>
 <DBF0A23A-034B-44C3-993E-FB50A196704E@mac.com>
 <35f70db10902261341g18d1840du3eb2548418f39974@mail.gmail.com>
On Feb 26, 2009, at 1:41 PM, Ross Cameron wrote:
> Where can I find more documentation on these types of settings in
> FreeBSD?

The FreeBSD Handbook and Google will help for the general case, but for
specific details, reading the source is recommended.

> and how can I choose between more than just TCP_NewReno? Specifically
> I will be making use of TCP_Westwood / TCP_Westwood+ and TCP_Illinois.

If you have a BSD-licensed implementation of TCP Westwood or the others
handy, feel free to contribute your patches in a PR.

At least some of the notions behind the congestion algorithms you've
mentioned are present in the FreeBSD stack in the form of the
net.inet.tcp.inflight tunables; see netinet/tcp_subr.c:

/*
 * TCP BANDWIDTH DELAY PRODUCT WINDOW LIMITING
 *
 * This code attempts to calculate the bandwidth-delay product as a
 * means of determining the optimal window size to maximize bandwidth,
 * minimize RTT, and avoid the over-allocation of buffers on interfaces
 * and routers.  This code also does a fairly good job keeping RTTs in
 * check across slow links like modems.  We implement an algorithm which
 * is very similar to (but not meant to be) TCP/Vegas.  The code operates
 * on the transmitter side of a TCP connection and so only affects the
 * transmit side of the connection.
 *
 * BACKGROUND:  TCP makes no provision for the management of buffer space
 * at the end points or at the intermediate routers and switches.  A TCP
 * stream, whether using NewReno or not, will eventually buffer as
 * many packets as it is able, and the only reason this typically works
 * is due to the fairly small default buffers made available for a
 * connection (typically 16K or 32K).  As machines use larger windows
 * and/or window scaling it is now fairly easy for even a single TCP
 * connection to blow out all available buffer space, not only on the
 * local interface, but on intermediate routers and switches as well.
 * NewReno makes a misguided attempt to 'solve' this problem by waiting
 * for an actual failure to occur, then backing off, then steadily
 * increasing the window again until another failure occurs, ad
 * infinitum.  This results in terrible oscillation that is only made
 * worse as network loads increase, and the idea of intentionally
 * blowing out network buffers is, frankly, a terrible way to manage
 * network resources.
 *
 * It is far better to limit the transmit window prior to the failure
 * condition being reached.  There are two general ways to do this.
 * First, you can 'scan' through different transmit window sizes and
 * locate the point where the RTT stops increasing, indicating that you
 * have filled the pipe, then scan backwards until you note that RTT
 * stops decreasing, then repeat ad infinitum.  This method works in
 * principle but has severe implementation issues due to RTT variances,
 * timer granularity, and instability in the algorithm, which can lead
 * to many false positives and create oscillations as well as interact
 * badly with other TCP streams implementing the same algorithm.
 *
 * The second method is to limit the window to the bandwidth-delay
 * product of the link.  This is the method we implement.  RTT variances
 * and our own manipulation of the congestion window, bwnd, can
 * potentially destabilize the algorithm.  For this reason we have to
 * stabilize the elements used to calculate the window.  We do this by
 * using the minimum observed RTT, the long-term average of the observed
 * bandwidth, and by adding two segments' worth of slop.  It isn't
 * perfect, but it is able to react to changing conditions and gives us
 * a very stable basis on which to extend the algorithm.
 */
void tcp_xmit_bandwidth_limit()

--
-Chuck