Date: Sat, 13 Sep 2014 22:35:24 +0200
From: Hans Petter Selasky <hps@selasky.org>
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Adrian Chadd <adrian@freebsd.org>
Subject: Re: svn commit: r271504 - in head/sys: dev/oce dev/vmware/vmxnet3 dev/xen/netfront net netinet ofed/drivers/net/mlx4
Message-ID: <5414AA8C.8000809@selasky.org>
In-Reply-To: <1971026585.35877589.1410640100903.JavaMail.root@uoguelph.ca>
References: <1971026585.35877589.1410640100903.JavaMail.root@uoguelph.ca>

On 09/13/14 22:28, Rick Macklem wrote:
> Hans Petter Selasky wrote:
>> On 09/13/14 22:04, Rick Macklem wrote:
>>> Hans Petter Selasky wrote:
>>>> On 09/13/14 18:54, Adrian Chadd wrote:
>>>>> Hi,
>>>>>
>>>>> Just for the record:
>>>>>
>>>>> * I'm glad you're tackling the TSO config stuff;
>>>>> * I'm not glad you're trying to pack it into a u_int rather than
>>>>>   creating a new structure and adding fields for it.
>>>>>
>>>>> I appreciate that you're trying to rush this in before 10.1, but this
>>>>> is exactly why things shouldn't be rushed in before release deadlines.
>>>>> :)
>>>>>
>>>>> I'd really like to see this be broken out as a structure and the bit
>>>>> shifting games for what really shouldn't be packed into a u_int fixed.
>>>>> Otherwise this is going to be deadweight that has to persist past 11.0.
>>>>>
>>>>
>>>> Hi Adrian,
>>>>
>>>> I can make that change for -current, making the new structure and such.
>>>> This change was intended for 10 where there is only one u_int for this
>>>> information. Or do you want me to change that in 10 too?
>>>>
>>> Well, there are spare fields (if_ispare[4]) in struct ifnet that I
>>> believe can be used for new u_ints when MFC'ng a patch that adds
>>> fields to struct ifnet in head. (If I have this wrong, someone please
>>> correct me.)
>>>
>>> I'll admit I don't really see an advantage to defining a structure vs
>>> just defining a couple of additional u_ints, but so long as the structure
>>> doesn't cause alignment issues for any arch, I don't see a problem with
>>> a structure.
>>>
>>> I tend to agree with Adrian that this shouldn't be rushed. (I, personally,
>>> think that if_hw_tsomax was poorly chosen, but that is already in use, so
>>> I think we need to add to that and not replace it.)
>>>
>>> I also hope that your testing has included quite a bit of activity on
>>> an NFS mount using TSO and the default 64K rsize, wsize, since that is
>>> going to generate a bunch of 35 mbuf transmit fragment lists and there
>>> is an edge case where the total data length excluding ethernet header
>>> is just under 64K (by less than the ethernet header length) where the
>>> list must be split by tcp_output() to avoid disaster.
>>
>> Hi,
>>
>> The ethernet and VLAN headers are still subtracted.
>>
> Where? I couldn't see it when I glanced at the patch.
> (hdrlen in tcp_output() does not include ethernet header length and
> that is the only thing I see subtracted from the max_len.)

Hi Rick,

When the drivers setup the "if_hw_tsomax" field, they need to subtract
the ethernet and vlan headers from the maximum TSO payload. For example
here:

+       sc->ifp->if_hw_tsomax = IF_HW_TSOMAX_BUILD_VALUE(
+           65535 - (ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN) /* bytes */,
+           OCE_MAX_TX_ELEMENTS /* maximum frag count */,
+           12 /* 4K frag size */);

>
> I see the default set to (65536 - 4). I don't know why you subtracted 4
> but I would have expected the max ethernet header length to be subtracted
> here?

That is another technical point. If you have a bunch of data to transfer
you want the start and stop physical addresses to be aligned to some
boundary, like cache line or 32-bit or 64-bit, because then the hardware
doesn't have to do the byte shifting when it starts reading the data
payload - right?

>
> Note that this must be subtracted before use by tcp_output() because there
> are several network device drivers that support 32 transmit segments and this
> means that the TSO segment including ethernet headers must fit in 65536 bytes
> (32 * MCLBYTES). If it does not, then NFS over these devices is busted because
> even m_defrag() can't make them fit.

I only found a few network drivers which actually set the TSO limit, and
for the rest: The default limit is 255 frags of MAX 65536 bytes, which
should not be reached in the cases you are describing.

--HPS
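
For readers following the thread, here is a rough sketch of the two ideas discussed above: packing three TSO limits into a single 32-bit value, and the size check behind the "32 * MCLBYTES" edge case. The bit layout and the names tso_pack(), tso_max_bytes(), tso_frag_count(), tso_frag_size() and tso_fits() are hypothetical and chosen only for illustration; this is not the IF_HW_TSOMAX_BUILD_VALUE() definition from r271504. The cluster size of 2048 bytes in the usage note is assumed from the "65536 bytes (32 * MCLBYTES)" figure quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Standard Ethernet header sizes in bytes. */
#define ETHER_HDR_LEN		14
#define ETHER_VLAN_ENCAP_LEN	4

/*
 * HYPOTHETICAL layout (an assumption, not the committed macro):
 *   bits  0-15  maximum TSO length in bytes
 *   bits 16-23  maximum fragment count
 *   bits 24-31  log2 of the maximum fragment size
 */
static inline uint32_t
tso_pack(uint32_t max_bytes, uint32_t frag_count, uint32_t frag_size_log2)
{
	assert(max_bytes <= 0xffff);
	assert(frag_count <= 0xff);
	assert(frag_size_log2 <= 31);	/* keeps the shift in tso_frag_size() defined */
	return (max_bytes | (frag_count << 16) | (frag_size_log2 << 24));
}

static inline uint32_t
tso_max_bytes(uint32_t v)
{
	return (v & 0xffff);
}

static inline uint32_t
tso_frag_count(uint32_t v)
{
	return ((v >> 16) & 0xff);
}

static inline uint32_t
tso_frag_size(uint32_t v)
{
	return (UINT32_C(1) << ((v >> 24) & 0xff));
}

/*
 * Returns 1 if a TSO chain of ip_tcp_len bytes plus a link-layer header
 * of link_hdr_len bytes fits into nsegs buffers of seg_bytes each.
 */
static inline int
tso_fits(uint32_t ip_tcp_len, uint32_t link_hdr_len, uint32_t nsegs,
    uint32_t seg_bytes)
{
	return (ip_tcp_len + link_hdr_len <= nsegs * seg_bytes);
}
```

With these helpers, tso_pack(65535 - (ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN), n, 12) stores a byte limit of 65517 and a fragment size of 4096. Assuming 2048-byte clusters, tso_fits(65517, 18, 32, 2048) holds, while a 65530-byte chain with a 14-byte ethernet header does not fit into 32 clusters, which is the case where the list has to be split before it reaches the driver.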