Date: Tue, 18 Aug 2015 17:54:08 -0400 (EDT) From: Rick Macklem <rmacklem@uoguelph.ca> To: Hans Petter Selasky <hps@selasky.org> Cc: Daniel Braniss <danny@cs.huji.ac.il>, FreeBSD Net <freebsd-net@freebsd.org>, Christopher Forgeron <csforgeron@gmail.com>, FreeBSD stable <freebsd-stable@freebsd.org>, Slawa Olhovchenkov <slw@zxy.spb.ru> Subject: Re: ix(intel) vs mlxen(mellanox) 10Gb performance Message-ID: <1325951625.25292515.1439934848268.JavaMail.zimbra@uoguelph.ca> In-Reply-To: <55D333D6.5040102@selasky.org> References: <1D52028A-B39F-4F9B-BD38-CB1D73BF5D56@cs.huji.ac.il> <17871443-E105-4434-80B1-6939306A865F@cs.huji.ac.il> <473274181.23263108.1439814072514.JavaMail.zimbra@uoguelph.ca> <7F892C70-9C04-4468-9514-EDBFE75CF2C6@cs.huji.ac.il> <805850043.24018217.1439848150695.JavaMail.zimbra@uoguelph.ca> <9D8B0503-E8FA-43CA-88F0-01F184F84D9B@cs.huji.ac.il> <1721122651.24481798.1439902381663.JavaMail.zimbra@uoguelph.ca> <55D333D6.5040102@selasky.org>
next in thread | previous in thread | raw e-mail | index | archive | help
Hans Petter Selasky wrote: > On 08/18/15 14:53, Rick Macklem wrote: > > 2572 ifp->if_hw_tsomax = 65518; > >> >2573 ifp->if_hw_tsomaxsegcount = IXGBE_82599_SCATTER; > >> >2574 ifp->if_hw_tsomaxsegsize = 2048; > > Hi, > > If IXGBE_82599_SCATTER is the maximum scatter/gather entries the > hardware can do, remember to subtract one fragment for the TCP/IP-header > mbuf! > Ouch! Yes, I now see that the code that counts the # of mbufs is before the code that adds the tcp/ip header mbuf. In my opinion, this should be fixed by setting if_hw_tsomaxsegcount to whatever the driver provides - 1. It is not the driver's responsibility to know if a tcp/ip header mbuf will be added and is a lot less confusing that expecting the driver author to know to subtract one. (I had mistakenly thought that tcp_output() had added the tc/ip header mbuf before the loop that counts mbufs in the list. Btw, this tcp/ip header mbuf also has leading space for the MAC layer header.) > I think there is an off-by-one here: > > ifp->if_hw_tsomax = 65518; > ifp->if_hw_tsomaxsegcount = IXGBE_82599_SCATTER - 1; > ifp->if_hw_tsomaxsegsize = 2048; > > Refer to: > > > * > > * NOTE: The TSO limits only apply to the data payload part of > > * a TCP/IP packet. That means there is no need to subtract > > * space for ethernet-, vlan-, IP- or TCP- headers from the > > * TSO limits unless the hardware driver in question requires > > * so. > This comment suggests that the driver author doesn't need to do this. However, unless this is fixed in tcp_output(), the above patch should be applied to the driver. > In sys/net/if_var.h > > Thank you! > > --HPS > The problem I see is that, after doing the calculation of how many mbufs can be in the TSO segment, the code in tcp_output() will have calculated a value for "len" that will always be less that "tp->t_maxopd - optlen" when the if_hw_tsosegcount limit has been hit (see where it does a "break;" out of the while loop). --> This does not imply "too many small fragments" for NFS, just that the driver's transmit segment limit has been reached, where most of them are mbuf clusters, but not the first ones. As such the code: /* * In case there are too many small fragments * don't use TSO: */ if (len <= max_len) { len = max_len; sendalot = 1; tso = 0; } Will always happen for this case and "tso" gets set to 0. Not what we want to happen, imho. The above code block was what I suggested should be commented out or deleted for the test. It appears you should also add the "- 1" in the driver sys/dev/ixgbe/if_ix.c. rick > _______________________________________________ > freebsd-net@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1325951625.25292515.1439934848268.JavaMail.zimbra>