Message-ID: <3D78FD1E.EAA7ABD7@mindspring.com>
Date: Fri, 06 Sep 2002 12:08:14 -0700
From: Terry Lambert
To: Darren Pilgrim
Cc: Dan Ellard, hackers@FreeBSD.ORG
Subject: Re: gigabit NIC of choice?
References: <3D78E69C.4152CC8@mindspring.com> <3D78F17F.5BE2499B@pantherdragon.org>

Darren Pilgrim wrote:
> Terry Lambert wrote:
> > Dan Ellard wrote:
> > > What's the gigabit ethernet NIC of choice these days? (I've had good
> > > experiences with the NetGear G620T, but apparently this card is no
> > > longer being sold.)
> >
> > The Tigon II has the best performances, but that's because
> > software people rewrote the firmware, instead of hardware
> > engineers moonlighting as programmers. 8-) 8-).
>
> I recall from a while back that gigabit cards have "relatively" large
> caches on them, correct? How does the size of the cache impact
> performance, and what is considered a sufficient cache size?

The best advice I have for you is to read the source code for the drivers, specifically any commentary by Bill Paul up top; he tells it like it is, with regard to the hardware.

In general, cards with DMA engines that require better than two byte alignment require that the mbufs be copied again for transmit.

Also, in general, the more queue descriptors, the better, since the descriptor count limits the number of packets you can have pending input or output simultaneously.

Controllers that can't do scatter/gather are also problematic, because you have to allocate a separate buffer area out of memory, copy outbound data into that buffer instead of handing the card the mbufs directly, and copy from the buffer into mbufs on receive.

The smaller the amount of memory on the card, the worse things are as well, because it limits the amount of data you can have outstanding, which limits your throughput.

Bad cards are also not capable of software interrupt coalescing (this was one of my contributions). Basically, what this means is that the card will not DMA, or does not have a "modified" register, or does not update it, while an interrupt is being processed (e.g. after the interrupt is raised in hardware, and has not yet been ACKed). The effect is that you can't poll at the end of the interrupt handler for new data and only exit the handler when there is no new data to process (being able to do so is a 10 to 15% performance improvement, by my benchmarks).

Bad cards will also have smaller on-chip buffers (as opposed to on-card buffers). For example, there are a number of cards that supposedly support both "jumbograms" and TCP checksum offloading, but have only 8K of space. A "jumbogram" is 9K, so when using jumbograms, it's impossible to offload checksums to the hardware.

There are cards that supposedly support checksumming, but use the buggy incremental checksum update algorithm (two's complement vs.
one's complement arithmetic), and will screw up the TCP checksum, yielding 0xfffe instead of 0x0000 after summing, because they don't correctly handle negative zero (there is an RFC update on this).

A really good card will allow you to align card buffers to host page boundaries, which can significantly speed up I/O. This is what I was referring to when I said there was a rewritten firmware for the Tigon II. The manufacturer won't really share sufficient information for this interface to be implemented on the Tigon III. Basically, it eliminates another copy.

The absolute worst one (according to Bill Paul) is the RealTek 8129/8139. See the comments in /usr/src/sys/pci/if_rl.c.

Mostly, if you go by the comments in the drivers, you'll get a feel for what's done right and what's done wrong from a host interface perspective by the card manufacturer.

As to your cache question... the size of the cache is the pool size. If you look at this as a queueing theory problem, then the amount of buffer space translates directly into how much delay in servicing interrupts the card can tolerate -- pool retention time. Above a certain size, more buffer really won't affect your ability to shove data through it, because there will just be more and more free space available.

Unless you are going card-to-card (unlikely; most firmware doesn't support the necessary ability to do incremental header rewriting, and flow monitoring, so that you can mark flows without in-band data that needs to be rewritten, e.g. text IP addresses in FTP "port" commands, etc.), you will always end up with a certain amount of buffer space free, because the limiting factor is going to be your ability to shovel data over the PCI bus from the disk to main memory and back over the same bus to the network card.

So my earlier answer may seem flip, but to get the best overall performance, you should use a Tigon II with the FreeBSD-specific firmware, and the zero copy TCP patches that need the firmware patches.
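
To put rough numbers on the pool retention time point above, here is a back-of-the-envelope sketch. The function name and the buffer sizes are made up for illustration (they are not taken from any particular card), and it ignores framing overhead and assumes the wire is saturated at 1 Gbit/s:

```python
def retention_time_us(buffer_bytes, line_rate_bps=1e9):
    """Microseconds of line-rate traffic a buffer can absorb.

    This is the longest the host can delay servicing the card
    before a completely empty buffer fills and packets are dropped.
    """
    bytes_per_second = line_rate_bps / 8
    return buffer_bytes / bytes_per_second * 1e6


# Hypothetical buffer sizes, purely to show the scaling.
for size_kb in (8, 64, 512):
    us = retention_time_us(size_kb * 1024)
    print(f"{size_kb:4d} KB buffer -> {us:8.1f} us of slack at 1 Gbit/s")
```

The relationship is linear: under these assumptions, 8K buys you well under a tenth of a millisecond of interrupt latency, while half a megabyte buys you a few milliseconds. Past the point where the slack exceeds your worst-case service delay, extra buffer just sits free.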

-- Terry