From owner-svn-src-head@FreeBSD.ORG Sat Sep 13 20:35:39 2014 Return-Path: Delivered-To: svn-src-head@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 5BF78920; Sat, 13 Sep 2014 20:35:39 +0000 (UTC) Received: from mail.turbocat.net (mail.turbocat.net [IPv6:2a01:4f8:d16:4514::2]) (using TLSv1.1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id CE5F47B6; Sat, 13 Sep 2014 20:35:38 +0000 (UTC) Received: from laptop015.home.selasky.org (cm-176.74.213.204.customer.telag.net [176.74.213.204]) (using TLSv1 with cipher ECDHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by mail.turbocat.net (Postfix) with ESMTPSA id 870BB1FE027; Sat, 13 Sep 2014 22:35:36 +0200 (CEST) Message-ID: <5414AA8C.8000809@selasky.org> Date: Sat, 13 Sep 2014 22:35:24 +0200 From: Hans Petter Selasky User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.1.0 MIME-Version: 1.0 To: Rick Macklem Subject: Re: svn commit: r271504 - in head/sys: dev/oce dev/vmware/vmxnet3 dev/xen/netfront net netinet ofed/drivers/net/mlx4 References: <1971026585.35877589.1410640100903.JavaMail.root@uoguelph.ca> In-Reply-To: <1971026585.35877589.1410640100903.JavaMail.root@uoguelph.ca> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Adrian Chadd X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Sep 2014 20:35:39 -0000 On 09/13/14 22:28, Rick Macklem wrote: > Hans Petter Selasky wrote: >> On 09/13/14 22:04, Rick Macklem wrote: >>> Hans Petter Selasky wrote: >>>> On 09/13/14 18:54, Adrian Chadd wrote: >>>>> Hi, >>>>> >>>>> Just for the record: >>>>> >>>>> * I'm glad you're tackling the TSO config stuff; >>>>> * I'm not glad you're trying to pack it into a u_int rather than >>>>> creating a new structure and adding fields for it. >>>>> >>>>> I appreciate that you're trying to rush this in before 10.1, but >>>>> this >>>>> is exactly why things shouldn't be rushed in before release >>>>> deadlines. >>>>> :) >>>>> >>>>> I'd really like to see this be broken out as a structure and the >>>>> bit >>>>> shifting games for what really shouldn't be packed into a u_int >>>>> fixed. >>>>> Otherwise this is going to be deadweight that has to persist past >>>>> 11.0. >>>>> >>>> >>>> Hi Adrian, >>>> >>>> I can make that change for -current, making the new structure and >>>> such. >>>> This change was intended for 10 where there is only one u_int for >>>> this >>>> information. Or do you want me to change that in 10 too? >>>> >>> Well, there are spare fields (if_ispare[4]) in struct ifnet that I >>> believe can be used for new u_ints when MFC'ng a patch that adds >>> fields to struct ifnet in head. (If I have this wrong, someone >>> please >>> correct me.) >>> >>> I'll admit I don't really see an advantage to defining a structure >>> vs >>> just defining a couple of additional u_ints, but so long as the >>> structure >>> doesn't cause alignment issues for any arch, I don't see a problem >>> with >>> a structure. >>> >>> I tend to agree with Adrian that this shouldn't be rushed. (I, >>> personally, >>> think that if_hw_tsomax was poorly chosen, but that is already in >>> use, so >>> I think we need to add to that and not replace it.) >>> >>> I also hope that your testing has included quite a bit of activity >>> on >>> an NFS mount using TSO and the default 64K rsize, wsize, since that >>> is >>> going to generate a bunch of 35 mbuf transmit fragment lists and >>> there >>> is an edge case where the total data length excluding ethernet >>> header >>> is just under 64K (by less than the ethernet header length) where >>> the >>> list must be split by tcp_output() to avoid disaster. >> >> Hi, >> >> The ethernet and VLAN headers are still subtracted. >> > Where? I couldn't see it when I glanced at the patch. > (hdrlen in tcp_output() does not include ethernet header length and > that is the only thing I see subtracted from the max_len.) Hi Rick, When the drivers setup the "if_hw_tsomax" field, they need to subtract the ethernet and vlan headers from the maximum TSO payload. For example here: + sc->ifp->if_hw_tsomax = IF_HW_TSOMAX_BUILD_VALUE( + 65535 - (ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN) /* bytes */, + OCE_MAX_TX_ELEMENTS /* maximum frag count */, + 12 /* 4K frag size */); > > I see the default set to (65536 - 4). I don't know why you subtracted 4 > but I would have expected the max ethernet header length to be subtracted > here? That is another technical point. If you have a bunch of data to transfer you want the start and stop physical addresses to be aligned to some boundary, like cache line or 32-bit or 64-bit, because then the hardware doesn't have to do the byte shifting when it starts reading the data payload - right? > > Note that this must be subtracted before use by tcp_output() because there > are several network device drivers that support 32 transmit segments and this > means that the TSO segment including ethernet headers must fit in 65536 bytes > (32 * MCLBYTES). If it does not, then NFS over these devices is busted because > even m_defrag() can`t make them fit. I only found a few network drivers which actually set the TSO limit, and for the rest: The default limit is 255 frags of MAX 65536 bytes, which should not be reached in the cases you are describing. --HPS