From owner-freebsd-chat Wed Jul 17 14:18:04 1996
Return-Path: owner-chat
Message-Id: <199607172117.OAA24743@antares.aero.org>
To: freebsd-chat@freebsd.org
Subject: Van Speaks
Date: Wed, 17 Jul 1996 14:17:25 -0700
From: "Mike O'Brien"
Sender: owner-chat@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Van Jacobson doesn't post much, but when he does, it's a doozy.
Those who are getting all flustered about TCP performance (and
especially "TCP Vegas") would do well to read the following.

------- Forwarded Message

Message-Id: <199607170345.UAA17867@rx7.ee.lbl.gov>
To: Jon Crowcroft
cc: iesg@ietf.org, ietf@CNRI.Reston.VA.US
Subject: Re: Last Call: TCP Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery Algorithms to BCP
In-reply-to: Your message of Tue, 16 Jul 96 06:45:42 BST.
Date: Tue, 16 Jul 1996 20:45:04 -0700
Sender: ietf-request@ietf.org
From: Van Jacobson
Source-Info: From (or Sender) name not authenticated.

Jon,

 > that there are limits to its effectiveness (e.g. current
 > algorithm in presence of drop tail fifo routers has minimum
 > effective rate of 1 packet per RTT

You've said this on a number of occasions lately and I think you
are giving it rather too much emphasis.  The congestion control
and timer algorithms were designed together and work in concert.
The core of TCP's scalability is its very conservative, very
adaptable retransmit timer.  Two things happen in the presence of
congestion: queues grow rapidly, which increases the RTT seen by
users of the path (the biased mean+variance estimator tracks these
changes and avoids further inflating the queue with spurious
retransmits), and the bottleneck link(s) can't handle the aggregate
input rate, so packets get dropped (the exponential backoff in the
retransmit timer(s) causes the input rate to drop to the point
where it fits in the bottleneck bandwidth).  This exponential rate
backoff via the retransmit timer is an integral part of the TCP
adaptation algorithm (in fact, I feel it's the primary part & the
window adjustment stuff is a second-order performance tweak).
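
For concreteness, here is a minimal sketch in C of the kind of
estimator and backoff being described.  It is not taken from this
message or from the BSD sources; the structure, names, gains and
clamp values are the commonly published srtt/rttvar form of the
timer and are assumptions made purely for illustration.

    #include <math.h>

    /*
     * Illustrative sketch only -- not Van's code and not the BSD stack.
     * A biased mean+variance RTT estimator with exponential timer
     * backoff, using the commonly published gains (1/8 for the mean,
     * 1/4 for the deviation) and rto = srtt + 4*rttvar.
     */
    struct rto_state {
        double srtt;      /* smoothed round-trip time, seconds          */
        double rttvar;    /* smoothed mean deviation of the RTT         */
        double rto;       /* current retransmission timeout, seconds    */
        int    nrexmit;   /* consecutive timeouts without a new sample  */
    };

    /* Fold a new RTT measurement into the estimator. */
    void rtt_sample(struct rto_state *s, double rtt)
    {
        if (s->srtt == 0.0) {                   /* first measurement */
            s->srtt   = rtt;
            s->rttvar = rtt / 2.0;
        } else {
            double err = rtt - s->srtt;
            s->srtt   += err / 8.0;                       /* gain 1/8 */
            s->rttvar += (fabs(err) - s->rttvar) / 4.0;   /* gain 1/4 */
        }
        s->rto = s->srtt + 4.0 * s->rttvar;     /* conservative timeout    */
        s->nrexmit = 0;                         /* fresh sample, no backoff */
    }

    /* A retransmit timer fired: back the timeout off exponentially. */
    void rtt_timeout(struct rto_state *s)
    {
        s->nrexmit++;
        s->rto *= 2.0;                          /* exponential backoff      */
        if (s->rto > 64.0)                      /* typical ceiling, seconds */
            s->rto = 64.0;
    }

The heavy weighting of the deviation term keeps the timeout well
above the rapidly rising RTT during congestion, so queue growth
alone does not trigger spurious retransmits, while the doubling on
each timeout is what cuts the sender's rate when packets are
actually being dropped at the bottleneck.
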
If you were to look at a TCP's behavior as you slowly lowered the
available bandwidth, you would find it varied fairly smoothly down
to arbitrarily low rates (as opposed to your mental model, which
seems to have things completely fall apart when the bandwidth-delay
product falls below 1 packet) -- all that happens is that TCP
modulates the window while the BDP is larger than a packet, then
switches to modulating the time between packets.  To anticipate a
question: no, the packet drop behavior doesn't change substantially
between these two regimes.  If you were to plot the number of drops
as a function of available bandwidth, it would also vary fairly
smoothly over the entire range.

The problems with running in this timer-controlled regime are that
it's very unfair (because of something nearly identical to the
Ethernet capture effect) and that bandwidth upstream of the
bottleneck(s) is wasted transporting packets that will be dropped
at the bottleneck (a very serious scaling problem in a general mesh
network, but a problem that people designing braindead "loss
preference" machinery seem to ignore).  But that doesn't mean it
doesn't work -- it deals with congestion quite as well as modulating
the window and, in the absence of something like RED in the
gateways, is only slightly less fair (window modulation is also
unfair because of an autocatalysis effect).  RED makes either the
window or timer scheme fair.

I think the essence of reliable protocol design is getting the
timers right.  If you do, the protocol will probably work & scale
(though there may be lots of things you'll have to tweak to get
good performance).  If you botch the timers, the protocol is
guaranteed to fall apart at some scale (but, unfortunately, people
are very bad at anticipating the effects of scale & a lot of these
bench-top grad student projects end up escaping & causing no end
of suffering for their users before they die).  I think experience
has shown that the TCP designers did a remarkably good job on the
timers (contemporaries such as X.25 & TP-4 completely botched
them).  It's important to remember that the timers are the
protocol's most basic defense against congestion, and to treat
them with respect (and occasionally defend them against poorly
conceived, destabilizing "improvements" like the collection of
mistakes in Arizona's "TCP Vegas").

 - Van

------- End of Forwarded Message
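
The "RED" Van refers to is the Random Early Detection gateway
scheme (Floyd and Jacobson, 1993): the router keeps an exponentially
weighted average of its queue length and begins dropping (or
marking) arriving packets probabilistically before the queue
overflows.  A minimal sketch of the drop decision follows; the
parameter names and thresholds are illustrative assumptions, not
values from the paper or from any router, and the paper's
count-based spacing of drops is omitted here.

    #include <stdlib.h>

    /*
     * Illustrative sketch of a RED-style drop decision.  Parameter
     * names and values are assumptions made for the example.
     */
    struct red_state {
        double avg;      /* EWMA of the instantaneous queue length    */
        double wq;       /* averaging weight, e.g. 0.002              */
        double min_th;   /* below this average, never drop            */
        double max_th;   /* at or above this average, always drop     */
        double max_p;    /* drop probability as avg nears max_th      */
    };

    /* Return 1 if the arriving packet should be dropped (or marked). */
    int red_should_drop(struct red_state *r, int qlen)
    {
        r->avg += r->wq * (qlen - r->avg);      /* update the average   */

        if (r->avg < r->min_th)
            return 0;                           /* no congestion signal */
        if (r->avg >= r->max_th)
            return 1;                           /* sustained overload   */

        /* Drop with probability rising linearly from 0 toward max_p. */
        double p = r->max_p * (r->avg - r->min_th) / (r->max_th - r->min_th);
        return ((double)rand() / RAND_MAX) < p;
    }

Because the drops begin early and fall randomly on flows roughly in
proportion to their share of the arriving traffic, neither a
window-modulated nor a timer-modulated sender can capture the
bottleneck the way it can behind a drop-tail FIFO, which is the
sense in which RED makes either scheme fair.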