From owner-freebsd-net@FreeBSD.ORG Sun Jun 3 19:05:25 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CBD1F106566C; Sun, 3 Jun 2012 19:05:25 +0000 (UTC) (envelope-from gallatin@cs.duke.edu) Received: from duke.cs.duke.edu (duke.cs.duke.edu [152.3.140.1]) by mx1.freebsd.org (Postfix) with ESMTP id 6B2138FC0A; Sun, 3 Jun 2012 19:05:25 +0000 (UTC) Received: from [192.168.200.2] (c-24-125-204-77.hsd1.va.comcast.net [24.125.204.77]) (authenticated bits=0) by duke.cs.duke.edu (8.14.5/8.14.5) with ESMTP id q53J5Ikn029786 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sun, 3 Jun 2012 15:05:18 -0400 (EDT) X-DKIM: Sendmail DKIM Filter v2.8.3 duke.cs.duke.edu q53J5Ikn029786 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=cs.duke.edu; s=mail; t=1338750318; bh=WcFbGonLnz+xEQDcGOgn9aEtCjLkaoVlZxC3Wk6EhNY=; h=Message-ID:Date:From:MIME-Version:To:CC:Subject:References: In-Reply-To:Content-Type:Content-Transfer-Encoding; b=f/k5PVv/0svHa7vhlXBElHK6rmiGc8aM/VU0UPx1cEfmp1P8UuBobw4yCG8zdCsxc LdJRR7cB93n4JL65+jUy2ulyF5kQviB06Nphyy2pVMSqdSmC3FxxJsbO1Jq+OvpjIw woW77JQ98kFywJQFKOjVqKj19mzjNdd9iIowsL15pXWm70EkN/zGfa7spcP4fBbSGd s2ZFD8CDaO0DkqsqWE8TLa4nQX6/QCxNXursXQBHzkntL/YdnmSONe9cdQ3mGjNEq0 2OLCGSl5Eb7Vc8TK4mLFu0H2F9RWDviM2qPvlkRz3VBiHX5GpZrnWIeHUgXwSxfxbs MH0a1XwkE5lGw== Message-ID: <4FCBB56D.2080800@cs.duke.edu> Date: Sun, 03 Jun 2012 15:05:17 -0400 From: Andrew Gallatin User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:11.0) Gecko/20120329 Thunderbird/11.0.1 MIME-Version: 1.0 To: Colin Percival References: <4FC635CC.5030608@freebsd.org> <4FC63D27.70807@cs.duke.edu> <4FCB95F6.30204@freebsd.org> In-Reply-To: <4FCB95F6.30204@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org Subject: Re: [please review] TSO mbuf chain length limiting patch X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 03 Jun 2012 19:05:25 -0000 On 06/03/12 12:51, Colin Percival wrote: > On 05/30/12 08:30, Andrew Gallatin wrote: >> On 05/30/12 10:59, Colin Percival wrote: >>> The Xen virtual network interface has an issue (ok, really the issue is with >>> the linux back-end, but that's what most people are using) where it can't >>> handle scatter-gather writes with lots of pieces, aka. long mbuf chains. >>> This currently bites us hard with TSO enabled, since it produces said long >>> mbuf chains. >> >> I think a better approach would be to have a limit on the size of the >> pre-segmented TCP payload size sent to the driver. I tend to think >> that this would be more generically useful, and it is a better match >> for the NDIS APIs, where a driver must specify the max TSO size. I >> think the changes to the TCP stack might be simpler (eg, they >> would seem to jive better with the existing "maxmtu" approach). >> >> I think this could work for you as well. You could set the Xen max >> tso size to be 32K (derived from 18 pages/skb, multiplied by a typical >> 2KB mbuf size, with some slack built in). If the chain was too large, >> you could m_defrag it down to size. > > I've attached a new patch which: > 1. adds a IFCAP_TSO_MSS "capability" and a if_tx_tso_mss field to struct ifnet, > 2. sets these in netfront when the IFCAP_TSO4 flag is set, > 3. extends tcp_maxmtu to read this value, > 4. adds a tx_tso_mss field to struct tcpcb, > 5. makes tcp_mss_update set tx_tso_mss using tcp_maxmtu, and > 6. limits TSO lengths to tx_tso_mss in tcp_output. > Looks good to me. I don't pretend to understand the bowels of the TCP stack, so I can't comment on the "sendalot" stuff to force segmentation. I assume it works as well as your previous patch to solve the problems you were seeing with Xen? One minor nit is that I envision this limit to always be 64K or less, so you could probably get away with stealing 16 bits from ifcspare. The other trivial nit is that I would have made these new if_tx_tso_mss and t_tx_tso_mss fields unsigned. Thanks so much for doing this! Drew