Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 18 Aug 2015 17:54:08 -0400 (EDT)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Hans Petter Selasky <hps@selasky.org>
Cc:        Daniel Braniss <danny@cs.huji.ac.il>, FreeBSD Net <freebsd-net@freebsd.org>, Christopher Forgeron <csforgeron@gmail.com>,  FreeBSD stable <freebsd-stable@freebsd.org>,  Slawa Olhovchenkov <slw@zxy.spb.ru>
Subject:   Re: ix(intel) vs mlxen(mellanox) 10Gb performance
Message-ID:  <1325951625.25292515.1439934848268.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <55D333D6.5040102@selasky.org>
References:  <1D52028A-B39F-4F9B-BD38-CB1D73BF5D56@cs.huji.ac.il> <17871443-E105-4434-80B1-6939306A865F@cs.huji.ac.il> <473274181.23263108.1439814072514.JavaMail.zimbra@uoguelph.ca> <7F892C70-9C04-4468-9514-EDBFE75CF2C6@cs.huji.ac.il> <805850043.24018217.1439848150695.JavaMail.zimbra@uoguelph.ca> <9D8B0503-E8FA-43CA-88F0-01F184F84D9B@cs.huji.ac.il> <1721122651.24481798.1439902381663.JavaMail.zimbra@uoguelph.ca> <55D333D6.5040102@selasky.org>

next in thread | previous in thread | raw e-mail | index | archive | help
Hans Petter Selasky wrote:
> On 08/18/15 14:53, Rick Macklem wrote:
> > 2572 	  	         ifp->if_hw_tsomax = 65518;
> >> >2573 	  	         ifp->if_hw_tsomaxsegcount = IXGBE_82599_SCATTER;
> >> >2574 	  	         ifp->if_hw_tsomaxsegsize = 2048;
> 
> Hi,
> 
> If IXGBE_82599_SCATTER is the maximum scatter/gather entries the
> hardware can do, remember to subtract one fragment for the TCP/IP-header
> mbuf!
> 
Ouch! Yes, I now see that the code that counts the # of mbufs is before the
code that adds the tcp/ip header mbuf.

In my opinion, this should be fixed by setting if_hw_tsomaxsegcount to whatever
the driver provides - 1. It is not the driver's responsibility to know if a tcp/ip
header mbuf will be added and is a lot less confusing that expecting the driver
author to know to subtract one. (I had mistakenly thought that tcp_output() had
added the tc/ip header mbuf before the loop that counts mbufs in the list. Btw,
this tcp/ip header mbuf also has leading space for the MAC layer header.)

> I think there is an off-by-one here:
> 
> ifp->if_hw_tsomax = 65518;
> ifp->if_hw_tsomaxsegcount = IXGBE_82599_SCATTER - 1;
> ifp->if_hw_tsomaxsegsize = 2048;
> 
> Refer to:
> 
> >          *
> >          * NOTE: The TSO limits only apply to the data payload part of
> >          * a TCP/IP packet. That means there is no need to subtract
> >          * space for ethernet-, vlan-, IP- or TCP- headers from the
> >          * TSO limits unless the hardware driver in question requires
> >          * so.
> 
This comment suggests that the driver author doesn't need to do this.

However, unless this is fixed in tcp_output(), the above patch should be
applied to the driver.
> In sys/net/if_var.h
> 
> Thank you!
> 
> --HPS
> 
The problem I see is that, after doing the calculation of how many mbufs can
be in the TSO segment, the code in tcp_output() will have calculated a value
for "len" that will always be less that "tp->t_maxopd - optlen" when the
if_hw_tsosegcount limit has been hit (see where it does a "break;" out of
the while loop).
--> This does not imply "too many small fragments" for NFS, just that the
    driver's transmit segment limit has been reached, where most of them
    are mbuf clusters, but not the first ones.
As such the code:
			/*
			 * In case there are too many small fragments
			 * don't use TSO:
			 */
			if (len <= max_len) {
				len = max_len;
				sendalot = 1;
				tso = 0;
			}
Will always happen for this case and "tso" gets set to 0. Not what we want to
happen, imho.
The above code block was what I suggested should be commented out or deleted
for the test.

It appears you should also add the "- 1" in the driver sys/dev/ixgbe/if_ix.c.

rick

> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
> 



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1325951625.25292515.1439934848268.JavaMail.zimbra>