From owner-freebsd-hackers  Fri Sep  6 12: 9:37 2002
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 37B7637B400
	for <hackers@freebsd.org>; Fri,  6 Sep 2002 12:09:32 -0700 (PDT)
Received: from pintail.mail.pas.earthlink.net (pintail.mail.pas.earthlink.net [207.217.120.122])
	by mx1.FreeBSD.org (Postfix) with ESMTP id BCAB943E42
	for <hackers@freebsd.org>; Fri,  6 Sep 2002 12:09:31 -0700 (PDT)
	(envelope-from tlambert2@mindspring.com)
Received: from pool0483.cvx22-bradley.dialup.earthlink.net ([209.179.199.228] helo=mindspring.com)
	by pintail.mail.pas.earthlink.net with esmtp (Exim 3.33 #1)
	id 17nOTv-0003pq-00; Fri, 06 Sep 2002 12:09:28 -0700
Message-ID: <3D78FD1E.EAA7ABD7@mindspring.com>
Date: Fri, 06 Sep 2002 12:08:14 -0700
From: Terry Lambert <tlambert2@mindspring.com>
X-Mailer: Mozilla 4.79 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Darren Pilgrim <dmp@pantherdragon.org>
Cc: Dan Ellard <ellard@eecs.harvard.edu>, hackers@FreeBSD.ORG
Subject: Re: gigabit NIC of choice?
References: <Pine.BSF.4.44.0209061151560.35790-100000@bowser.eecs.harvard.edu> <3D78E69C.4152CC8@mindspring.com> <3D78F17F.5BE2499B@pantherdragon.org>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-hackers.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-hackers>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-hackers>
X-Loop: FreeBSD.ORG

Darren Pilgrim wrote:
> Terry Lambert wrote:
> > Dan Ellard wrote:
> > > What's the gigabit ethernet NIC of choice these days?  (I've had good
> > > experiences with the NetGear G620T, but apparently this card is no
> > > longer being sold.)
> >
> > The Tigon II has the best performances, but that's because
> > software people rewrote the firmware, instead of hardware
> > engineers moonlighting as programmers.  8-) 8-).
> 
> I recall from a while back that gigabit cards have "relatively" large
> caches on them, correct?  How does the size of the cache impact
> performance, and what is considered a sufficient cache size?

The best advice I have for you is to read the source code for
the drivers, specifically any commentary by Bill Paul up top;
he tells it like it is, with regard to the hardware.


In general, cards with DMA engines that require stricter than
two-byte alignment require that the mbufs be copied again for
transmit.
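
A rough sketch of that alignment check, in Python for brevity.
The function name and the example addresses are made up for
illustration; real drivers do this in C against mbuf data pointers.

```python
def prepare_tx(segments, align):
    """Count the extra host copies needed before DMA.

    segments: list of (start_address, length) pairs, one per mbuf.
    align:    byte alignment the (hypothetical) DMA engine requires.
    Returns (number_of_copies, bytes_copied).
    """
    copies = 0
    bytes_copied = 0
    for addr, length in segments:
        if addr % align != 0:
            # Misaligned for this engine: the data has to be copied
            # into an aligned buffer before the card can fetch it.
            copies += 1
            bytes_copied += length
    return copies, bytes_copied
```

With a header mbuf at 0x1002 (14 bytes) and a payload mbuf at
0x2000 (1460 bytes), a two-byte engine needs no copies, while a
four-byte engine has to copy the header.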

Also, in general, the more queue descriptors, the better, since
they limit the number of packets pending input or output that you
can have outstanding simultaneously.
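
To make the limit concrete, here is a toy model of a descriptor
ring.  The class and method names are hypothetical and the ring
holds plain Python objects rather than DMA addresses; the point is
only that posting fails once every descriptor is in use.

```python
class DescriptorRing:
    """Toy fixed-size descriptor ring (illustrative only)."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.head = 0   # next slot the driver fills
        self.tail = 0   # next slot the card completes
        self.count = 0  # descriptors currently outstanding

    def post(self, pkt):
        """Queue a packet; returns False when the ring is full."""
        if self.count == self.size:
            return False  # no free descriptor: wait or drop
        self.slots[self.head] = pkt
        self.head = (self.head + 1) % self.size
        self.count += 1
        return True

    def complete(self):
        """Card finished the oldest packet; frees its descriptor."""
        if self.count == 0:
            return None
        pkt = self.slots[self.tail]
        self.slots[self.tail] = None
        self.tail = (self.tail + 1) % self.size
        self.count -= 1
        return pkt
```

With a ring of 4, the fifth and sixth posts fail until a
completion frees a slot.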

Controllers that can't do scatter/gather are also problematic,
because they mean you have to allocate a separate buffer area
out of memory and copy outbound data into the buffer instead of
letting the card gather it from the mbuf chain, and copy from
the buffer to mbufs on receive instead of letting the card
scatter it into them.
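
A sketch of the difference, with byte strings standing in for
mbufs.  The function names are invented for this example, and the
"descriptor list" is just (index, length) pairs rather than real
bus addresses.

```python
def tx_copy_to_bounce(chain):
    """No scatter/gather: flatten the chain into one contiguous
    buffer.  Every byte is copied by the host CPU."""
    bounce = bytearray()
    for seg in chain:
        bounce += seg
    return bytes(bounce)


def tx_descriptor_list(chain):
    """With scatter/gather: hand the card one descriptor per
    segment.  No data is copied on the host."""
    return [(i, len(seg)) for i, seg in enumerate(chain)]


def rx_copy_from_bounce(frame, mbuf_size):
    """No scatter/gather on receive: the frame lands in one
    buffer and is copied out into mbuf-sized pieces."""
    return [frame[i:i + mbuf_size]
            for i in range(0, len(frame), mbuf_size)]
```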

The smaller the amount of memory on the card, the worse things
are, as well, because it limits the amount of data you can
have outstanding, which limits your throughput.

Bad cards are also not capable of software interrupt coalescing
(this was one of my contributions).  Basically, what this means
is that a card will not DMA, or does not have a "modified"
register, or does not update it, while an interrupt is being
processed (e.g. after the interrupt is raised in hardware, and
has not yet been ACKed).  The effect of this is that you can't
poll at the end of the interrupt handler for new data, only
exiting the handler when there is no new data to process (10
to 15% performance improvement, by my benchmarks).
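
The polling loop looks roughly like this.  This is a simulation,
not driver code: FakeCard, its methods, and the idea of feeding it
"batches" of arrivals are all assumptions made for the example.

```python
from collections import deque


class FakeCard:
    """Hypothetical card that keeps DMAing into the receive ring
    while an interrupt is still being serviced."""

    def __init__(self, batches):
        self.batches = deque(batches)  # packets arriving over time
        self.rx_ring = deque()

    def dma_more(self):
        if self.batches:
            self.rx_ring.extend(self.batches.popleft())

    def has_new_data(self):
        return bool(self.rx_ring)


def interrupt_handler(card, coalesce):
    processed = []
    while True:
        while card.rx_ring:
            processed.append(card.rx_ring.popleft())
        card.dma_more()  # packets that landed while we were busy
        # With coalescing, re-check before leaving the handler;
        # without it, the new data waits for another interrupt.
        if not coalesce or not card.has_new_data():
            break
    return processed


def run(batches, coalesce):
    """Return (interrupts_taken, packets_processed)."""
    card = FakeCard(batches)
    card.dma_more()  # first batch raises the first interrupt
    interrupts, total = 0, []
    while card.has_new_data():
        interrupts += 1
        total += interrupt_handler(card, coalesce)
    return interrupts, total
```

In this model, three back-to-back batches cost one interrupt with
coalescing and three without; the packets delivered are the same.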

Bad cards will also have smaller on-chip buffers (as opposed
to on-card buffers).  For example, there are a number of cards
that supposedly support both "jumbograms" and TCP checksum
offloading, but have only 8K of space.  A "jumbogram" is 9K,
so when using jumbograms, it's impossible to offload checksums
to the hardware.

There are cards that supposedly support checksumming, but use
the buggy incremental checksum update algorithm (two's
complement vs. one's complement arithmetic), and will screw up
the TCP checksum, yielding 0xfffe instead of 0x0000 after summing,
because they don't correctly handle negative zero (there is an
RFC update on this).
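
For reference, the RFC in question appears to be RFC 1624, which
corrects the incremental update formula given in RFC 1141.  Below
is a minimal sketch using that RFC's own worked example; the
function names are mine.  Note that in this example the buggy form
yields 0xFFFF (negative zero), so it shows the class of bug rather
than the exact value a given card produces.

```python
def ones_add(a, b):
    """16-bit one's complement addition with end-around carry."""
    s = a + b
    return (s & 0xFFFF) + (s >> 16)


def inv(x):
    """16-bit one's complement (bitwise NOT)."""
    return ~x & 0xFFFF


def full_checksum(words):
    """Recompute the checksum from scratch over 16-bit words."""
    s = 0
    for w in words:
        s = ones_add(s, w)
    return inv(s)


def update_rfc1141(hc, m, m_new):
    """Older formula: HC' = HC + m + ~m'.  Can produce 0xFFFF."""
    return ones_add(ones_add(hc, m), inv(m_new))


def update_rfc1624(hc, m, m_new):
    """Corrected formula: HC' = ~(~HC + ~m + m')."""
    return inv(ones_add(ones_add(inv(hc), inv(m)), m_new))
```

In the RFC's example, the rest of the header sums to 0xCD7A and a
field changes from 0x5555 to 0x3285.  Recomputing from scratch and
the RFC 1624 formula both give 0x0000; the older formula gives
0xFFFF.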

A really good card will allow you to align card buffers to host
page boundaries, which can significantly speed up I/O.  This is
what I was referring to when I said there was a rewritten firmware
for the Tigon II.  The manufacturer won't really share sufficient
information for this interface to be implemented on the Tigon III.
Basically, it eliminates another copy.

The absolute worst one (according to Bill Paul) is the RealTek
8129/8139.  See the comments in /usr/src/sys/pci/if_rl.c.

Mostly, if you go by the comments in the drivers, you'll get a
feel for what's done right and what's done wrong from a host
interface perspective by the card manufacturer.


As to your cache question... the size of the cache is the pool
size.  If you look at this as a queueing theory problem, then
the amount of buffer space translates directly into how much
delay in servicing interrupts the card can tolerate -- pool
retention time.
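
As back-of-the-envelope arithmetic: at line rate, a buffer of B
bytes fills in B * 8 / rate seconds, and that is how long an
interrupt can go unserviced before data is lost.  The buffer sizes
below are arbitrary examples, not figures for any particular card.

```python
def tolerance_us(buffer_bytes, line_rate_bps=1_000_000_000):
    """Microseconds of interrupt latency a buffer can absorb
    when data arrives at full line rate (default: 1 Gbit/s)."""
    return buffer_bytes * 8 / line_rate_bps * 1e6


# Example buffer sizes chosen purely for illustration.
EXAMPLES = {
    "64 KiB": tolerance_us(64 * 1024),    # about 524 microseconds
    "512 KiB": tolerance_us(512 * 1024),  # about 4.2 milliseconds
    "1 MiB": tolerance_us(1024 * 1024),   # about 8.4 milliseconds
}
```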

Above a certain size, it really won't affect your ability
to shove data through it because there will be more and more
free space available.  Unless you are going card-to-card
(unlikely; most firmware doesn't support the necessary ability
to do incremental header rewriting, and flow monitoring, so
that you can mark flows without in-band data that needs to be
rewritten e.g. text IP addresses in FTP "port" commands, etc.),
you will always end up with a certain amount of buffer space
free, because the limiting factor is going to be your ability
to shovel data over the PCI bus from the disk to main memory
and back over the same bus to the network card.
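
Rough numbers for the bus limit, assuming a plain 32-bit, 33 MHz
PCI bus (the bus type is my assumption for the example).  These
are theoretical peaks; real throughput is lower.

```python
GIGABIT_BYTES_PER_SEC = 1_000_000_000 // 8  # 125 MB/s line rate


def pci_peak_bytes_per_sec(width_bits, clock_hz):
    """Theoretical peak: one bus-width transfer per clock."""
    return width_bits // 8 * clock_hz


def max_network_rate(bus_peak, crossings=2):
    """Data crosses the bus twice (disk to memory, memory to
    card), so at best half the bus is left for the network."""
    return bus_peak / crossings


PCI_32_33 = pci_peak_bytes_per_sec(32, 33_333_333)  # ~133 MB/s
PCI_64_66 = pci_peak_bytes_per_sec(64, 66_666_666)  # ~533 MB/s
```

On the 32-bit/33 MHz bus the usable half is about 67 MB/s, well
under the 125 MB/s the wire can carry, so the card's buffer never
fills from the host side.  A 64-bit/66 MHz bus moves the
bottleneck elsewhere.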

So my flip answer seems flip, but to get the best overall
performance, you should use a Tigon II with the
FreeBSD-specific firmware, and the zero-copy TCP patches that
need the firmware patches.

-- Terry
