Date:      Mon, 31 Dec 2001 21:46:50 -0500
From:      "Louis A. Mamakos" <louie@TransSys.COM>
To:        Terry Lambert <tlambert2@mindspring.com>
Cc:        Matthew Dillon <dillon@apollo.backplane.com>, Julian Elischer <julian@elischer.org>, Mike Silbersack <silby@silby.com>, Josef Karthauser <joe@tao.org.uk>, Tomas Svensson <tsn@gbdev.net>, freebsd-hackers@FreeBSD.ORG
Subject:   Re: FreeBSD performing worse than Linux? 
Message-ID:  <200201010246.g012ko721041@whizzo.transsys.com>
In-Reply-To: Your message of "Mon, 31 Dec 2001 18:11:21 PST." <3C311AC9.99B5FC9C@mindspring.com> 
References:  <Pine.BSF.4.21.0112311225150.94344-100000@InterJet.elischer.org> <200112312327.fBVNRt719835@whizzo.transsys.com> <200201010043.g010h0i36281@apollo.backplane.com> <3C311AC9.99B5FC9C@mindspring.com> 

Disabling Nagle's algorithm for no good reason has very poor
scaling behavior.  And disabling it is exactly what happens when
TCP_NODELAY is enabled on a socket.
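
For reference, this is roughly how an application flips that switch.
A minimal sketch in Python, assuming an ordinary TCP socket; the helper
name disable_nagle is made up for illustration, while setsockopt,
IPPROTO_TCP and TCP_NODELAY are the standard socket-API names.

```python
import socket

def disable_nagle(sock):
    """Turn Nagle's algorithm off on a TCP socket (hypothetical helper)."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    # Read the option back; a nonzero value means Nagle is now disabled.
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
```

It is one line of code, which is part of why it gets switched on
without much thought.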

If you look at the work function for most network elements, the part
that runs out of gas first is per-packet forwarding performance. Sure,
you need to have adequate bus bandwidth to move stuff through a box,
but the capacity to perform per-packet forwarding operations and policy
is the resource that's most difficult to make more of. I think this is
true for toy routers based on a PC platform as well as high-end boxes like
the Cisco 12000 series. Juniper managed adequate forwarding performance
using specialized ASIC implementations in the forwarding path.  Of this
I'm sure: in my day job at UUNET, I talk to all the major
backbone router vendors, and forwarding performance (along with
reasonable routing protocol implementations) is a show-stopper
requirement they labor mightily over.
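
To put a rough number on the per-packet cost, here is a
back-of-the-envelope sketch.  The 100 Mbit/s link speed and the packet
sizes are assumptions picked for illustration, and link-layer framing
is ignored.

```python
def packets_per_second(link_bits_per_second, packet_bytes):
    """Packets a device must forward each second to keep the link full."""
    return link_bits_per_second / (8 * packet_bytes)

# Same link, same bit rate, very different per-packet workload.
full = packets_per_second(100e6, 1500)  # full-size packets
tiny = packets_per_second(100e6, 41)    # 40-byte header + 1 byte of data
print(round(full), round(tiny), round(tiny / full, 1))
```

Under these assumptions, filling the link with tinygrams asks the
forwarding path for roughly 36 times as many per-packet operations as
filling it with full-size packets.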

So here we have a mechanism with wonderful properties - it's a
trivial yet clever implementation of a self-tuning mechanism to
prevent tinygrams from being generated by a TCP without all manner
of complicated timers.  It gives great performance on LANs and other
high-speed interconnects where remote echo type applications are
demanding, yet over long delay paths where remote echo is gonna suck
no matter what you do, it automatically aggregates packets.
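
The self-tuning can be seen in a toy model.  This is a simplified
sketch, not any real stack's code: it keeps only the core rule (send a
small segment right away only when nothing is unacknowledged, otherwise
hold data until an ACK arrives or a full segment accumulates) and
ignores windows, delayed ACKs and retransmission.  The function name
and parameters are made up for illustration.

```python
def simulate_nagle(write_times_ms, rtt_ms, mss=1460, bytes_per_write=1):
    """Return the sizes of the segments sent under a simplified Nagle rule."""
    segments = []
    buffered = 0      # bytes written by the application but not yet sent
    ack_due = None    # time the ACK for outstanding data arrives, if any
    for now in write_times_ms:
        # Process ACKs that arrived before this write.
        while ack_due is not None and ack_due <= now:
            if buffered:
                segments.append(buffered)  # ACK releases the held data
                buffered = 0
                ack_due += rtt_ms
            else:
                ack_due = None             # nothing outstanding any more
        buffered += bytes_per_write
        # Full-size segments are always sent immediately.
        while buffered >= mss:
            segments.append(mss)
            buffered -= mss
            if ack_due is None:
                ack_due = now + rtt_ms
        # A small segment goes out only if nothing is unacknowledged.
        if buffered and ack_due is None:
            segments.append(buffered)
            buffered = 0
            ack_due = now + rtt_ms
    if buffered:
        segments.append(buffered)          # flushed by the final ACK
    return segments

keystrokes = range(0, 100, 10)             # one byte every 10 ms
print(simulate_nagle(keystrokes, rtt_ms=1))
print(simulate_nagle(keystrokes, rtt_ms=100))
```

With a 1 ms round trip every keystroke goes out on its own, so remote
echo stays snappy; with a 100 ms round trip the same ten writes
collapse into two segments.  No timer is involved, only the arrival of
ACKs.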

Nagle's algorithm and Van Jacobson's slow-start algorithm allowed the 
Internet to survive over congested paths.  And they did so with
a bunch of self-tuning behavior independent of the bandwidth*delay
product of the path the connection was running over.  It was and is
amazing stuff.

Likewise, the original problem in this thread is likely caused by some
part of the USB Ethernet implementation having inadequate per-packet
resources. It's probably not about the number of bytes, but the number of
transactions.  You see here a modern reimplementation of essentially the same
problem that the 3COM 3C501 ISA Ethernet card had 15 years ago - back-to-back
packets were consistently dropped because of the poor per-packet
buffering implementation.  It was absolutely repeatable.
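
A toy model of that failure mode, purely for illustration: it is not a
description of the 3C501 or of any USB Ethernet adapter.  It assumes a
receiver with a fixed number of packet buffers, each tied up for
service_ms after a packet lands in it; all names and numbers are
hypothetical.

```python
def count_drops(arrival_times_ms, service_ms, slots=1):
    """Count packets lost because every receive buffer was still busy."""
    busy_until = []  # release times of the buffers currently in use
    dropped = 0
    for now in arrival_times_ms:
        # Free any buffer whose packet has been handed off by now.
        busy_until = [t for t in busy_until if t > now]
        if len(busy_until) < slots:
            busy_until.append(now + service_ms)
        else:
            dropped += 1  # no free buffer: the packet is lost
    return dropped

print(count_drops([0, 0.1, 0.2, 0.3], service_ms=1.0))  # back-to-back
print(count_drops([0, 2, 4, 6], service_ms=1.0))        # spaced out
```

Both runs offer the same four packets; only the spacing differs.  In
this model packet size never enters into it, which is the point about
transactions rather than bytes.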

Sure, it's "legal" to generate streams of tinygrams and not use Nagle's
algorithm to aggregate the sender's traffic, but it's just plain rude,
and on low-bandwidth links it sucks because of all the extra 40-byte
headers you're carrying around.
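
The header arithmetic, as a small sketch.  It assumes the 40 bytes are
a 20-byte IPv4 header plus a 20-byte TCP header with no options, and it
ignores link-layer framing; the function names are made up for
illustration.

```python
HEADER_BYTES = 40  # assumed: 20-byte IPv4 header + 20-byte TCP header

def wire_bytes(payload_bytes, segments):
    """Total bytes sent when the payload is split across `segments` packets."""
    return payload_bytes + segments * HEADER_BYTES

def header_share(payload_bytes, segments):
    """Fraction of the bytes on the wire that are header rather than data."""
    return segments * HEADER_BYTES / wire_bytes(payload_bytes, segments)

# 100 bytes of data sent one byte at a time versus as a single segment.
print(wire_bytes(100, 100), round(header_share(100, 100), 3))
print(wire_bytes(100, 1), round(header_share(100, 1), 3))
```

Sent as 100 tinygrams, those 100 bytes cost 4100 bytes on the wire;
sent as one segment they cost 140.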

I'm sure TCP_NODELAY got added because it sounds REALLY C00L to make 
the interactive thing go better.  But clearly people don't understand
the impact of turning on the cleverly named option and how it probably
doesn't really improve things.

louie
