Date: Sun, 15 Jul 2001 10:05:16 -0700 (PDT)
From: Matt Dillon <dillon@earth.backplane.com>
To: Leo Bicknell
Cc: Julian Elischer, Leo Bicknell, Drew Eckhardt, hackers@FreeBSD.ORG
Subject: Re: Network performance tuning.

:Packet loss is not always a bad thing.  Let me use an admittedly
:extreme example:
:
:Consider a backup server across country from four machines it's
:trying to back up nightly.  So we have high (let's say 70ms) RTTs,
:and let's say for the sake of argument the limiting factor is a
:DS-3 in the middle, 45 Mbits/sec.
:
:Each connection can get 16384 * 1000 / 70 = 234057 bytes/sec, or
:about 1.87 Mbits/sec.  Multiply by the 4 machines, and we get
:network utilization of 7.48 Mbits/sec, about 16% of the DS-3.
:
:Now, we implement some sort of code that can increase the amount
:of socket buffering space.  As a result, the window can grow (per
:connection) large enough to fill a DS-3, so the 4 hosts must fight
:for the bandwidth available.
:
:I don't have any great math for how we get here, but TCP in normal
:situations rarely produces more than 5% packet loss (10% absolute
:max), since it backs off when congestion occurs.  I'll go with 5%
:as an upper bound.  With that packet loss, TCP now gets the DS-3
:much closer to full, let's say 90%, or 40.5 Mbits/sec (it should
:be higher than 90%, but again, I'm worst-casing).  In the aggregate
:that will be spread across the 4 connections evenly, or 10.12
:Mbits/sec per connection.
:
:The question to be asked is: which is better, 1.87 Mbits/sec with
:no packet loss, or 10.12 Mbits/sec with 5% packet loss?  Clearly
:the latter gives better performance, even with packet loss.
:
:Clearly, knowing the end-to-end link bandwidth and 'just' filling it
:would be better, but packet loss, at least in the context of TCP
:flow control, is not all bad.  Something else to remember is that not
:everyone plays fair, so if we stay at 80% of the available bandwidth
:and everyone else pushes to packet loss, we will in general be pushed
:out.
:
:--
:Leo Bicknell - bicknell@ufp.org

Well, 4 connections isn't enough to generate packet loss.  All that
happens is that the routers in between start buffering the packets.  If
you had a *huge* TCP window size then the routers in between could run
out of packet space and packet loss would start to occur.  Routers tend
to have a lot of buffer space, though.  The real killer is runaway
latencies rather than packet loss.
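As a quick sanity check, the figures in the quoted example are just
window-size-over-RTT arithmetic.  A tiny standalone program (not kernel
code; the constants are taken straight from the example above) reproduces
them:

    #include <stdio.h>

    int
    main(void)
    {
            double window = 16384.0;    /* default socket buffer, bytes */
            double rtt = 0.070;         /* round trip time, seconds */
            double ds3 = 45.0e6;        /* DS-3 capacity, bits/sec */
            int conns = 4;              /* machines being backed up */

            double bytes_sec = window / rtt;    /* per-connection ceiling */
            double bits_sec = bytes_sec * 8.0;
            double aggregate = bits_sec * conns;

            printf("per connection: %.0f bytes/sec (%.2f Mbits/sec)\n",
                bytes_sec, bits_sec / 1e6);
            printf("aggregate: %.2f Mbits/sec (%.1f%% of a DS-3)\n",
                aggregate / 1e6, 100.0 * aggregate / ds3);
            return (0);
    }

That per-connection ceiling is why the default 16K buffers only reach
roughly 16% of the DS-3 no matter how clean the path is.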
On the other hand, something like the experimental bandwidth-delay
product code I posted would do very well running 4 connections over such a
link, because it would detect the point where the routers start buffering
the data (by noticing the increased latency) and back off before packet
loss occurred.  It doesn't care how many connections are running in
parallel.  The downside is that the algorithm becomes less stable as you
increase the number of connections going between the same two end points.
Stability in the face of lots of parallel connections is something that
needs to be tested.

Also, the algorithm is less helpful when it has to figure out the optimal
transmit buffer size for every new connection (consider a web server).  I
am considering ripping out the ssthresh junk from the stack, which hardly
works at all, and using the route table's ssthresh field to set the
initial buffer size for the algorithm.

						-Matt
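To give a flavor of the heuristic being described: grow the effective send
buffer while the measured RTT stays near its observed floor, and shrink it
as soon as the RTT climbs, since a rising RTT means routers have started
queueing.  The rough user-level sketch below is NOT the experimental patch
itself; the struct, function name, thresholds, and step sizes are invented
purely for illustration.

    /*
     * Sketch only: latency-driven transmit buffer sizing.  The trigger
     * for backing off is RTT growth, not packet loss, so it engages
     * before a queue somewhere in the path actually overflows.
     */
    #include <stdio.h>

    struct bdp_state {
            int     snd_bufsize;    /* current transmit buffer target, bytes */
            int     rtt_base;       /* lowest RTT observed so far, ms */
    };

    #define BDP_MIN_BUF     16384
    #define BDP_MAX_BUF     (256 * 1024)

    static void
    bdp_update(struct bdp_state *st, int rtt_ms)
    {
            if (st->rtt_base == 0 || rtt_ms < st->rtt_base)
                    st->rtt_base = rtt_ms;

            if (rtt_ms > st->rtt_base + st->rtt_base / 4) {
                    /* RTT up ~25%: queues are building, back off. */
                    st->snd_bufsize -= st->snd_bufsize / 8;
            } else {
                    /* RTT near its floor: still room, probe upward. */
                    st->snd_bufsize += st->snd_bufsize / 16;
            }

            if (st->snd_bufsize < BDP_MIN_BUF)
                    st->snd_bufsize = BDP_MIN_BUF;
            if (st->snd_bufsize > BDP_MAX_BUF)
                    st->snd_bufsize = BDP_MAX_BUF;
    }

    int
    main(void)
    {
            /* Fake RTT samples: path idles at 70ms, then queues build. */
            int samples[] = { 70, 70, 71, 72, 75, 85, 95, 110, 90, 72 };
            struct bdp_state st = { BDP_MIN_BUF, 0 };
            unsigned i;

            for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
                    bdp_update(&st, samples[i]);
                    printf("rtt %3dms -> sndbuf %6d\n", samples[i],
                        st.snd_bufsize);
            }
            return (0);
    }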