From: Andrew Gallatin
Date: Fri, 5 Sep 2003 16:47:22 -0400 (EDT)
To: freebsd-net@freebsd.org
Subject: TCP Segmentation Offload
Message-ID: <16216.63066.954104.582195@grasshopper.cs.duke.edu>

I've been reading a little about TCP Segmentation Offload (aka TSO). We don't appear to support it, but at least 2 of our supported NICs (e1000 and bge) apparently could.

The gist is that TCP pretends the NIC has a large MTU and passes a large packet (larger than the link-layer MTU) down to the driver and then the NIC. The NIC then fragments the large packet into smaller (<= MTU) packets, using the initial TCP header as a template to construct the headers for the "fragments".
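
To make the header-template idea concrete, here is a minimal userspace sketch of the split the NIC would perform. It is not driver or firmware code: struct seg and tso_split() are hypothetical names, and the sketch assumes only that each segment's sequence number is the template's sequence number plus the payload offset. Real hardware also has to fix up the IP length, IP ID, checksums and TCP flags for each segment, which is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one on-the-wire segment. */
struct seg {
	uint32_t seq;	/* TCP sequence number of the first payload byte */
	uint32_t len;	/* payload bytes carried, always <= mss */
};

/*
 * Split payload_len bytes, starting at sequence number seq0, into
 * segments of at most mss payload bytes each.  Fills out[] and returns
 * the number of segments, or -1 if out[] has fewer than the required
 * number of slots.
 */
int
tso_split(uint32_t seq0, uint32_t payload_len, uint32_t mss,
    struct seg *out, size_t max_segs)
{
	size_t n = 0;
	uint32_t off = 0;

	assert(mss > 0);

	while (off < payload_len) {
		uint32_t chunk = payload_len - off;

		if (chunk > mss)
			chunk = mss;
		if (n == max_segs)
			return (-1);
		/* Unsigned wraparound matches the TCP sequence space. */
		out[n].seq = seq0 + off;
		out[n].len = chunk;
		n++;
		off += chunk;
	}
	return ((int)n);
}
```

For a 4000-byte payload and a 1460-byte MSS this yields three segments carrying 1460, 1460 and 1080 bytes.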
The people who implemented it on Linux claim a 50% CPU savings for an Intel 1Gb/s adaptor with a 1500-byte MTU.

It seems like it could be implemented rather easily by adding an if_hwassist flag (CSUM_TSO). If this flag is set on the interface found by tcp_mss(), then the MSS is set to 56k. This causes TCP to generate huge packets. We then add a check in ip_output() (near the existing CSUM_FRAGMENT check) which checks whether it's a TCP packet and whether CSUM_TSO is set in the if_hwassist flags. If so, the huge packet is passed on down to the driver.

Does this sound reasonable? The only other thing I can think of is that some NICs might not be able to handle such a large MSS, and we might want to stuff the maximum MSS value into the ifnet struct.

I don't have a bge or an e1000, so I'm not ready to actually implement this. I'm more considering firmware optimizations for our product, and would implement it in a few months, after making the firmware changes.

Drew
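
A rough userspace sketch of the two checks proposed above. Everything in it is a stand-in: CSUM_TSO does not exist yet, so the bit value used here is made up; struct ifnet_sketch models only the fields the proposal touches; if_tso_maxmss is the hypothetical per-interface limit; and "56k" is assumed to mean 56 * 1024 bytes. The real tcp_mss() and ip_output() do far more than this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CSUM_TSO_SKETCH	0x8000u		/* made-up bit for the proposed flag */
#define PROTO_TCP	6		/* IP protocol number for TCP */
#define TSO_DEFAULT_MSS	(56u * 1024u)	/* assumed meaning of "56k" */

/* Only the fields this sketch needs; not the kernel's struct ifnet. */
struct ifnet_sketch {
	uint32_t if_mtu;	/* link-layer MTU */
	uint32_t if_hwassist;	/* offload capability bits */
	uint32_t if_tso_maxmss;	/* hypothetical NIC limit, 0 = none */
};

/* tcp_mss()-style choice: use the huge MSS only on TSO interfaces. */
uint32_t
sketch_pick_mss(const struct ifnet_sketch *ifp, uint32_t normal_mss)
{
	assert(ifp != NULL);

	if ((ifp->if_hwassist & CSUM_TSO_SKETCH) == 0)
		return (normal_mss);
	if (ifp->if_tso_maxmss != 0 && ifp->if_tso_maxmss < TSO_DEFAULT_MSS)
		return (ifp->if_tso_maxmss);
	return (TSO_DEFAULT_MSS);
}

/*
 * ip_output()-style check: returns 1 when a packet larger than the MTU
 * may be handed to the driver whole, 0 when it fits in the MTU already
 * or must take the normal fragmentation path.
 */
int
sketch_can_offload(const struct ifnet_sketch *ifp, uint8_t proto,
    uint32_t pkt_len)
{
	assert(ifp != NULL);

	if (pkt_len <= ifp->if_mtu)
		return (0);
	return (proto == PROTO_TCP &&
	    (ifp->if_hwassist & CSUM_TSO_SKETCH) != 0);
}
```

With a 1500-byte MTU, a TSO-capable interface with no NIC limit gets an MSS of 57344 here, while an interface without the flag keeps whatever MSS it would normally use.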