Date:      Thu, 8 Oct 2015 11:34:14 +0200
From:      Hans Petter Selasky <hps@selasky.org>
To:        Daniel Braniss <danny@cs.huji.ac.il>, Rick Macklem <rmacklem@uoguelph.ca>
Cc:        pyunyh@gmail.com, FreeBSD stable <freebsd-stable@freebsd.org>, FreeBSD Net <freebsd-net@freebsd.org>
Subject:   Re: ix(intel) vs mlxen(mellanox) 10Gb performance
Message-ID:  <56163896.3020907@selasky.org>
In-Reply-To: <1E679659-BA50-42C3-B569-03579E322685@cs.huji.ac.il>
References:  <1D52028A-B39F-4F9B-BD38-CB1D73BF5D56@cs.huji.ac.il> <1153838447.28656490.1440193567940.JavaMail.zimbra@uoguelph.ca> <15D19823-08F7-4E55-BBD0-CE230F67D26E@cs.huji.ac.il> <818666007.28930310.1440244756872.JavaMail.zimbra@uoguelph.ca> <49173B1F-7B5E-4D59-8651-63D97B0CB5AC@cs.huji.ac.il> <1815942485.29539597.1440370972998.JavaMail.zimbra@uoguelph.ca> <55DAC623.60006@selasky.org> <62C7B1A3-CC6B-41A1-B254-6399F19F8FF7@cs.huji.ac.il> <2112273205.29795512.1440419111720.JavaMail.zimbra@uoguelph.ca> <1E679659-BA50-42C3-B569-03579E322685@cs.huji.ac.il>

Hi,

I've now MFC'ed r287775 to 10-stable and 9-stable. I hope this will
resolve the issues with m_defrag() being called on overly long mbuf
chains due to an off-by-one in the driver TSO parameters, and that it
will make these parameters easier to maintain in the future.

Some comments were made that we might want an option to select whether
the IP-header should be counted or not. Certain network drivers
require copying the whole ETH/TCP/IP-header into a separate memory
area, and can then handle one more data payload mbuf for TSO. Others
require DMA-ing the whole TSO mbuf chain. I think it is acceptable
to have one TX-DMA segment slot free when 2K mbuf clusters are used
for TSO. In my experience the limitation typically kicks in when 2K
mbuf clusters are used for TSO instead of 4K mbuf clusters:
65536 / 4096 = 16, whereas 65536 / 2048 = 32. If an Ethernet hardware
driver has a limit of 24 data segments (mlxen), and assuming that each
mbuf represents a single segment, then if and only if the majority of
mbufs being transmitted are 2K clusters, we may see a small loss,
1/24 = 4.2%, of TX capability per TSO packet. From what I've seen
using iperf, which in turn calls m_uiotombuf(), which in turn calls
m_getm2(), MJUMPAGESIZE-sized mbuf clusters are preferred for large
data transfers, so this issue might only happen when TCP_NODELAY is
set on the socket and the writes are small from the application's
point of view. If an application writes small amounts of data per
send() system call, that is expected to degrade system performance
anyway.
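
As a rough sketch of the arithmetic above (Python, not driver code): the
function names are hypothetical, and the model assumes that each mbuf
cluster occupies exactly one DMA segment and ignores the header mbuf.
It only reproduces the numbers quoted in this message.

```python
TSO_MAX_BYTES = 65536  # TSO payload size used in the figures above


def segments_needed(payload_bytes, cluster_bytes):
    """Clusters (assumed one DMA segment each) needed to hold the payload."""
    return -(-payload_bytes // cluster_bytes)  # ceiling division


def fits_in_hw_limit(payload_bytes, cluster_bytes, max_segments):
    """True if the chain fits within the driver's data segment limit."""
    return segments_needed(payload_bytes, cluster_bytes) <= max_segments


def reserved_slot_loss(max_segments):
    """Fraction of TX capability lost per TSO packet if one slot stays free."""
    return 1.0 / max_segments


if __name__ == "__main__":
    MLXEN_MAX_SEGMENTS = 24  # limit quoted for mlxen in this message
    for cluster in (2048, 4096):
        n = segments_needed(TSO_MAX_BYTES, cluster)
        ok = fits_in_hw_limit(TSO_MAX_BYTES, cluster, MLXEN_MAX_SEGMENTS)
        print(f"{cluster}-byte clusters: {n} segments, fits: {ok}")
    print(f"loss with one slot reserved: "
          f"{reserved_slot_loss(MLXEN_MAX_SEGMENTS):.1%}")
```

With 4K clusters a full-size TSO packet stays well under the 24-segment
limit, so the reserved slot costs nothing; only chains built mostly
from 2K clusters reach the limit.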

Please file a PR if it becomes an issue.

Someone asked me to MFC r287775 to the 10.X release as well. Is this
still required?

--HPS


