From: Terry Lambert
Message-Id: <199903211824.LAA13217@usr06.primenet.com>
Subject: Re: Gigabit ethernet -- what am I doing wrong?
To: des@flood.ping.uio.no (Dag-Erling Smorgrav)
Date: Sun, 21 Mar 1999 18:24:31 +0000 (GMT)
Cc: andreas@klemm.gtn.com, rsnow@lgc.com, hasty@rah.star-gate.com,
    ckempf@enigami.com, wpaul@skynet.ctr.columbia.edu,
    freebsd-hackers@FreeBSD.ORG
In-Reply-To: from "Dag-Erling Smorgrav" at Mar 17, 99 02:21:44 am

> > AFAIK "zero copy tcp/ip" went into 3.1 and 4.0. Thanks to David
> > Greenman who implemented and tested this on ftp.cdrom.com.
> > (I hope I got the credits right ;-)
>
> No, that's only zero-copy transmission of files over stream sockets.

OK, I'm real curious. How does this work?

The lowest possible number of copies I can consider is 1. This
assumes a DMA from the disk controller into the ethernet card memory,
and a cache-line unaligned one, at that, since the host would have to
pre-supply the packet header.

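To see why a pre-supplied header leaves the DMA target unaligned, here
is a rough arithmetic sketch. The header sizes (option-free Ethernet,
IPv4 and TCP) and the 32-byte cache line are assumptions picked for
illustration; none of them come from the post, and the function names
are hypothetical.

```python
# Assumed sizes, not taken from the post: option-free Ethernet, IPv4 and
# TCP headers, and a 32-byte cache line.
ETH_HDR, IP_HDR, TCP_HDR = 14, 20, 20
CACHE_LINE = 32


def payload_offset(*header_sizes):
    """Offset in the frame buffer at which the payload would start."""
    return sum(header_sizes)


def misalignment(offset, line=CACHE_LINE):
    """Bytes past the previous cache-line boundary; 0 means aligned."""
    return offset % line


off = payload_offset(ETH_HDR, IP_HDR, TCP_HDR)
print(off, misalignment(off))
```

Under these assumptions the payload would land 54 bytes into the frame
buffer, which is 22 bytes past a cache-line boundary, so a disk DMA
aimed straight at the card could not start on a line boundary.
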
The next lowest number would be two, where the copy is into main
memory by the disk controller, and then back out to the ethernet
controller after packet assembly.

You could optimize this a bit by clever use of page and page boundary
mapping to back the data up against a page that contained
(4k - header size) pad bytes followed by the header data for the
packet in the next page. Unless you were really clever with the
paging hardware, this would mean a minimum packet size of 4k, or that
you only got two copies on a page-aligned buffer start.

FWIW, I'm pretty sure that the cleverest you can get with x86 paging
hardware is 64b chunks * number of chunks + header size for an MTU.

Another common trick (that FreeBSD doesn't need because of the
unified VM and buffer cache) is to allow an MBUF to reference a
buffer cache page instead of having to copy into it. Again, you have
to have offset/length pair encoding for the region, which may result
in cache line misses, unless you are careful with the byte alignment
in the MTU, not including the header. But even then, there is a copy
on packet assembly into the ethernet interface.

Having actually played these tricks on VAX hardware myself, I suspect
that what you really meant to say was "zero unnecessary copies", not
"zero copy", right?

If so, I'm betting that there is at least one unnecessary copy,
perhaps more, still in there.

					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
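
The page-boundary layout and the offset/length pairs described in the
post can be worked through with a little arithmetic. This is only a
sketch under stated assumptions: a 4096-byte page, a 54-byte header and
a 1460-byte payload per packet are illustrative values, and both
function names are hypothetical; nothing here is FreeBSD's actual mbuf
code.

```python
PAGE_SIZE = 4096  # assumed x86 page size (the "4k" in the post)


def header_page_layout(header_size, page_size=PAGE_SIZE):
    """Pack the header against the end of one page.

    Returns (pad_bytes, payload_offset): the number of pad bytes placed
    before the header, and the offset (from the start of the header
    page) at which the payload begins, i.e. the start of the next page.
    """
    if not 0 < header_size <= page_size:
        raise ValueError("header must fit in one page")
    return page_size - header_size, page_size


def page_regions(payload_per_packet, page_size=PAGE_SIZE):
    """Split one data page into (offset, length) pairs, one per packet."""
    if payload_per_packet <= 0:
        raise ValueError("payload size must be positive")
    regions = []
    offset = 0
    while offset < page_size:
        length = min(payload_per_packet, page_size - offset)
        regions.append((offset, length))
        offset += length
    return regions


print(header_page_layout(54))
print(page_regions(1460))
```

With these numbers the header sits after 4042 pad bytes and the payload
starts exactly on the next page. Splitting that data page into
1460-byte regions gives offsets 0, 1460 and 2920, so only the first
region starts page-aligned, which is the situation the offset/length
encoding has to describe.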