From: John-Mark Gurney
To: Rick Macklem
Date: Thu, 30 Jan 2014 19:53:03 -0800
Subject: Re: 64K NFS I/O generates a 34mbuf list for TCP which breaks TSO
Message-ID: <20140131035303.GT93141@funkthat.com>
In-Reply-To: <1856284835.584005.1391139152133.JavaMail.root@uoguelph.ca>
Cc: FreeBSD Net, Adrian Chadd

Rick Macklem wrote this message on Thu, Jan 30, 2014 at 22:32 -0500:
> Adrian Chadd wrote:
> > On 30 January 2014 07:06, Rick Macklem wrote:
> > > Hi, just adding one more idea on what to do about this
> > > to the list:
> > > - Add an if_hw_tsomaxseg and modify the loop in tcp_output()
> > >   so that it uses both if_hw_tsomax and if_hw_tsomaxseg to
> > >   decide how much to hand to the device driver in each mbuf list.
> > >   (I haven't looked to see how easy it would be to change this
> > >   loop.)
> >
> > I don't think that's a hack. I think adding that and setting tsomaxseg
> > to say 30 for now would be a good compromise.
> >
> Well, my TCP is very rusty and I have no way to test it (I don't
> have anything that does TSO), but I've attached a stab at a patch
> to do this.
>
> Maybe it can be used as a starting point for this, if others think
> it makes sense.
>
> The "#ifdef notyet" in the patch would become something like:
> # if __FreeBSD_version >= NNNN
> when a change to add if_hw_tsomaxseg is done, was what I was
> thinking.

Definitely need to make sure you fix the drivers that support large
enough sg arrays, like ixgb, which supports 100...

Just a sampling of ones that use a _SCATTER define:
./e1000/if_igb.h:#define IGB_MAX_SCATTER 64
./e1000/if_lem.h:#define EM_MAX_SCATTER 64
./e1000/if_em.h:#define EM_MAX_SCATTER 32
./nfe/if_nfereg.h:#define NFE_MAX_SCATTER 32
./ixgbe/ixgbe.h:#define IXGBE_82598_SCATTER 100
./ixgbe/ixgbe.h:#define IXGBE_82599_SCATTER 32
./ixgb/if_ixgb.h:#define IXGB_MAX_SCATTER 100

I wonder how many of these are hardware limits, and how many are just
"I don't want to allocate too much space on the stack", as 16 bytes per
bus_dma_segment_t (on amd64) adds up...

The other question is whether the drivers w/ a limit on the segments
should reduce the size of the TSO packet so that we don't need
m_defrag/m_collapse, which are expensive operations...

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
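
For illustration only, here is a rough userland sketch of the idea discussed above: cap each TSO burst by both a byte limit (the role of if_hw_tsomax) and a buffer-count limit (the role of the proposed if_hw_tsomaxseg). It is not the attached patch and not the real tcp_output() loop. `struct seg` and `tso_burst_len()` are hypothetical names standing in for an mbuf chain walk, and the sketch assumes each buffer maps to exactly one DMA segment, which a real driver cannot take for granted.

```c
#include <stddef.h>

/* Hypothetical stand-in for a kernel mbuf: only the fields this sketch needs. */
struct seg {
	struct seg *next;	/* next buffer in the chain, or NULL */
	size_t len;		/* bytes of data held in this buffer */
};

/*
 * Return how many bytes from the front of the chain fit in one burst
 * without exceeding max_bytes (byte limit) or touching more than
 * max_segs buffers (segment limit).
 * Assumption: one buffer == one DMA segment.
 */
static size_t
tso_burst_len(const struct seg *m, size_t max_bytes, unsigned max_segs)
{
	size_t total = 0;
	unsigned nsegs = 0;

	for (; m != NULL && nsegs < max_segs && total < max_bytes; m = m->next) {
		size_t room = max_bytes - total;

		/* The last buffer may be used only partially. */
		total += (m->len < room) ? m->len : room;
		nsegs++;
	}
	return (total);
}
```

With a chain of 34 buffers of 2048 bytes each, a 65535-byte limit and a 30-segment limit, the segment limit is hit first and the burst is 30 * 2048 = 61440 bytes. Raise the segment limit to 100 and the byte limit takes over instead. A real implementation would also have to account for protocol headers and keep the burst a multiple of the MSS, which this sketch ignores.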