FreeBSD Mail Archives

Date:      Mon, 31 Dec 2001 15:27:25 -0800
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Matthew Dillon <dillon@apollo.backplane.com>
Cc:        Julian Elischer <julian@elischer.org>, Mike Silbersack <silby@silby.com>, Josef Karthauser <joe@tao.org.uk>, Tomas Svensson <tsn@gbdev.net>, freebsd-hackers@FreeBSD.ORG
Subject:   Re: FreeBSD performing worse than Linux?
Message-ID:  <3C30F45D.88A9EAE2@mindspring.com>
References:  <Pine.BSF.4.21.0112311139180.9721-100000@InterJet.elischer.org> <200112311941.fBVJfvc25822@apollo.backplane.com> <3C30C5F9.BCC1BAE8@mindspring.com> <200112312017.fBVKHTs26004@apollo.backplane.com> <3C30E7B9.96EFA40F@mindspring.com> <200112312244.fBVMigK35786@apollo.backplane.com>

Matthew Dillon wrote:
> 
>     Terry, I give up.  Maybe if you actually tried to go in and fix it
>     you would see what I'm talking about.  For your information:

I don't have a USB/Ethernet adapter.

>     * Julian's work has nothing to do with this particular problem.  It
>       has to do with constricted bandwidth.  This issue has nothing to do
>       with USB's bandwidth limitations.

The specific work in question is rate limiting the *other end* of
a TCP connection.  This would let a client with a problem like
this *slow the server down*, without any code changes on the
server, and without buffer congestion on the server.  You have not
seen this work, I think.

>     * The window size is not negotiated between client and server (other
>       than whether window scaling is used or not, which is non-applicable).
>       Each side's TCP protocol receiver controls the window size it
>       advertises.

So you are saying that if the client advertised a smaller TCP
window, then the server would have fewer simultaneously outstanding
packets in flight before an ACK was required.

I think you and I are just using different definitions of
"negotiated": the client controls the transmit window which
is available to the server.

So we are agreed: If the client with the USB bogosity were to
advertise a smaller window, then the server would have fewer
packets in flight; and since the problem appears (to me) to
be that there are more packets in flight from the server than
the USB<->ethernet dongle is capable of buffering to send down
the slow (USB) side of the link, then this would prevent the
server from buffer-overflowing the dongle into losing packets.

It seems to me that this is a clear case of impedance mismatch,
with a fast link trying to jam more packets down a slow link
than is possible, with the ACKs being sent by the dongle,
rather than delayed until it has sufficient buffer space
available.
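
To put rough numbers on the mismatch (a back-of-the-envelope sketch;
the 1460-byte segment size, the window sizes, and the four-frame
dongle buffer are illustrative assumptions, not values taken from
the packet trace):

```python
import math

def segments_in_flight(window_bytes, segment_bytes):
    """Upper bound on unacknowledged segments the sender may have outstanding."""
    return math.ceil(window_bytes / segment_bytes)

def max_window_for_buffer(buffer_frames, segment_bytes):
    """Largest advertised window whose full-size segments all fit in the buffer."""
    return buffer_frames * segment_bytes

MSS = 1460          # assumed full-size TCP payload on Ethernet
DONGLE_FRAMES = 4   # hypothetical dongle buffer depth, in frames

for window in (17520, max_window_for_buffer(DONGLE_FRAMES, MSS)):
    n = segments_in_flight(window, MSS)
    verdict = "overflows" if n > DONGLE_FRAMES else "fits in"
    print(f"window={window} bytes -> {n} segments, {verdict} the dongle buffer")
```

Note that the same window maps to many more packets when the
segments are small (segments_in_flight(5840, 100) is 59), which is
the packet-size caveat raised in the next quoted point.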

>     * The window size has no bearing on the number of in-transit packets,
>       unless you make assumptions in regards to the packet size.  That's
>       the crux of the problem faced by the receiver in this case.

This is why I suggested that the PSC rate-halving algorithm would
probably be useful.

The real problem here is that more data is being sent to the
dongle than it is able to forward.  The packet trace bears this
conclusion out.

>     * If you think reducing the receive window on the fly is easy,
>       try it and you will find out that it isn't as easy as you
>       might have thought.

Both Julian's code and the PSC code do this.  I wasn't suggesting
writing the code from scratch.

However, I have to say that I've implicitly done this sort of
limitation in a commercial product before, along with Jeffrey
Hsu, by controlling transmit and receive window sizes in a proxy
server.  Clearly, a web server on a 1G link talking to a client
on the other end of a 28.8k link would quickly overrun a proxy
server's transmit buffers, otherwise, as the received data is
shoved out at a much slower rate.
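
The arithmetic behind that, as a sketch (the 64 KiB buffer size is
a hypothetical figure chosen for illustration; the link speeds are
the ones above, taken as 1e9 and 28800 bits per second):

```python
def seconds_to_fill(buffer_bytes, in_bps, out_bps):
    """Time for a buffer to fill when data arrives faster than it leaves."""
    net_bps = in_bps - out_bps
    if net_bps <= 0:
        return float("inf")  # the buffer never fills
    return buffer_bytes * 8 / net_bps

def seconds_to_drain(buffer_bytes, out_bps):
    """Time to push a full buffer out of the slow side."""
    return buffer_bytes * 8 / out_bps

BUFFER = 64 * 1024   # hypothetical per-connection transmit buffer
FAST = 1e9           # server-side link, bits per second
SLOW = 28800         # client-side link, bits per second

print(f"fills in  {seconds_to_fill(BUFFER, FAST, SLOW):.6f} s")
print(f"drains in {seconds_to_drain(BUFFER, SLOW):.1f} s")
```

Under these assumptions the buffer fills in well under a
millisecond and takes around 18 seconds to drain, so without a
window limit the fast side is effectively always ahead.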

This failure mode is consistent with the use of a much slower
USB link on the other side of a USB<->ethernet dongle.

The problem here is that the dongle itself is not going to do the
necessary flow rate limiting (it doesn't, or this discussion
would have never happened in the first place), so you have to
trigger it as a secondary effect, like the PSC rate halving, or
the code Julian did to force the advertised window sizes smaller
over time.

Julian's code was exactly analogous to the "dongle" situation,
except the "dongle" was a Cisco router on the other end of a
slow link from an InterJet, where we wanted a single flow to
not monopolize all the transmit buffers on the router, so that
we would still be able to get data through.  This basically
meant controlling the amount of the pool taken up, times the
pool retention time, on the router, for any given set of flows.

As a side effect, you could intentionally reduce the limit on
the advertised window size, until the flow rate dropped.  This
is basically the balance the TCP Rate Halving code seeks to
achieve (the balance between buffer congestion losses and the
overall throughput).
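
The crudest form of this is available from user space: a receiver
can cap its own socket receive buffer, which bounds the window its
stack is able to advertise.  The sketch below is only that static
cap.  It is not Julian's code (which adjusted the limit over time)
and not the PSC code; the function name and the 8192-byte figure
are made up for illustration, and the kernel is free to round or
adjust the value it is given.

```python
import socket

def make_clamped_socket(rcvbuf_bytes):
    """Create a TCP socket with a requested receive buffer of rcvbuf_bytes.

    Call this before connect(): the receive buffer size bounds the
    window the local stack can advertise to the peer.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    return s

if __name__ == "__main__":
    s = make_clamped_socket(8192)  # hypothetical cap
    # The kernel reports what it actually granted, which may differ
    # from the value requested.
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
    s.close()
```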

Please see:

	http://www.psc.edu/networking/rate_halving.html

>     * And if you think you can handle that, now try figuring out, in the
>       receiver, how to dynamically increase and reduce the window based on
>       the size of the packets being received and try to discern the difference
>       between packet loss and a driver problem, to then limit the number
>       of in-transit packets (through a massive window reduction).  Good luck!

Julian already did this, as I stated.

>     * I never said the server was the problem.  To the contrary, I've
>       been saying that the client is the problem.. USB ethernet is
>       broken, period.
> 
>     As I have said, it may be possible to make new-reno on the client
>     (receiver) side to cause the transmit side to close its congestion
>     window (and to keep it closed, or fairly closed).  But I aint gonna
>     be the one to try to do it.

The PSC Rate Halving code, which was first implemented on BSD
(NetBSD 1.3.2):

	http://www.psc.edu/networking/ftp/tools/netbsd132_rh_10.tgz

Already does this.  Here is a partial excerpt:

	Hoe [Hoe95] suggested that during Fast Recovery the TCP data
	sender space out retransmissions and new data on alternate
	acknowledgements across the entire recovery RTT. (Note that
	this eliminates the half RTT lull in sending which occurs in
	Reno TCP.) 

[ We could, in other words, turn New Reno back on without the
  performance loss we were seeing that caused us to turn it off ]

	The Rate-Halving algorithm implements Hoe's idea. The
	algorithm may be implemented in NewReno, SACK, and ECN-style
	TCP implementations. Rate-Halving has a number of other
	useful properties as well. It results in a slightly lower
	final value for cwnd following recovery, which has been
	suggested by Floyd and others as the more correct value.
	The Rate-Halving algorithm provides proper adjustments to
	the congestion window in response to congestion signals
	such as a lost segment or an ECN-Echo bit [RFC2481]. These
[ We are particularly interested in the "lost segment" case here ]
	adjustments are largely independent of the strategy used
	to retransmit missing segments, allowing Rate-Halving to
	be extended for use in other TCP implementations or even
	non-TCP transport protocols.
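
As a toy illustration of the pacing Hoe describes (a simplified
model, not the PSC implementation: it only counts ACK arrivals
during one recovery RTT, and the Reno comparison assumes the sender
stays silent for the first half of the ACKs and then sends one
segment per ACK):

```python
def rate_halving_send_slots(acks_in_recovery):
    """ACK indices (1-based) on which a segment is sent: every second ACK."""
    return [i for i in range(1, acks_in_recovery + 1) if i % 2 == 0]

def reno_send_slots(acks_in_recovery):
    """Simplified Reno: nothing for the first half of the ACKs, then one per ACK."""
    half = acks_in_recovery // 2
    return [i for i in range(1, acks_in_recovery + 1) if i > half]

def longest_silence(slots):
    """Largest number of ACK arrivals between consecutive sends (or before the first)."""
    previous = 0
    worst = 0
    for slot in slots:
        worst = max(worst, slot - previous)
        previous = slot
    return worst

ACKS = 10  # hypothetical number of ACKs arriving during the recovery RTT
for name, slots in (("rate-halving", rate_halving_send_slots(ACKS)),
                    ("reno (simplified)", reno_send_slots(ACKS))):
    print(name, slots, "longest silence:", longest_silence(slots))
```

Both schedules send the same number of segments in this model; the
difference is that the rate-halving schedule spreads them evenly
instead of leaving the first half of the RTT idle.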

See also RFC 3148 (we are, in effect, talking about "last hop"
congestion here).

- Terry

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?3C30F45D.88A9EAE2>
