Date:      Mon, 14 Sep 2015 13:01:40 +0200
From:      Hans Petter Selasky <hps@selasky.org>
To:        Roger Pau Monné <royger@FreeBSD.org>, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r271946 - in head/sys: dev/oce dev/vmware/vmxnet3 dev/xen/netfront kern net netinet ofed/drivers/net/mlx4 sys
Message-ID:  <55F6A914.6050109@selasky.org>
In-Reply-To: <55F6A694.7020404@FreeBSD.org>
References:  <201409220827.s8M8RRHB031526@svn.freebsd.org> <55F69093.5050807@FreeBSD.org> <55F6935C.9000000@selasky.org> <55F6A694.7020404@FreeBSD.org>

On 09/14/15 12:51, Roger Pau Monné wrote:
> On 14/09/15 at 11:29, Hans Petter Selasky wrote:
>> On 09/14/15 11:17, Roger Pau Monné wrote:
>>> On 22/09/14 at 10:27, Hans Petter Selasky wrote:
>>>> Author: hselasky
>>>> Date: Mon Sep 22 08:27:27 2014
>>>> New Revision: 271946
>>>> URL: http://svnweb.freebsd.org/changeset/base/271946
>>>>
>>>> Log:
>>>>     Improve the TCP segmentation offload, TSO, algorithm in general.
>>>>
>>>>     The current TSO limitation feature only takes the total number of
>>>>     bytes in an mbuf chain into account and does not limit by the number
>>>>     of mbufs in a chain. Some kinds of hardware are limited by two
>>>>     factors. One is the fragment length and the second is the fragment
>>>>     count. Both of these limits need to be taken into account when doing
>>>>     TSO. Else some kinds of hardware might have to drop completely valid
>>>>     mbuf chains because they cannot be loaded into the given
>>>>     hardware's DMA engine. The new way of doing TSO limitation has
>>>>     been made backwards compatible, following input from other
>>>>     FreeBSD developers, and will use defaults for values not set.
>>>>
>>>>     Reviewed by:    adrian, rmacklem
>>>>     Sponsored by:    Mellanox Technologies
>>>
>>> This commit makes xen-netfront tx performance drop from ~5Gbits/sec
>>> (with debug options enabled) to 446 Mbits/sec. I'm currently looking,
>>> but if anyone has ideas they are welcome.
>>>
>>
>> Hi Roger,
>>
>> Looking at the netfront code you should subtract 1 from tsomaxsegcount
>> prior to r287775. The reason might simply be that 2K clusters are used
>> instead of 4K clusters, causing m_defrag() to be called.
>>
>>>          ifp->if_hw_tsomax = 65536 - (ETHER_HDR_LEN +
>>> ETHER_VLAN_ENCAP_LEN);
>>>          ifp->if_hw_tsomaxsegcount = MAX_TX_REQ_FRAGS;
>>>          ifp->if_hw_tsomaxsegsize = PAGE_SIZE;
>>
>> After r287775 can you try these settings:
>>
>> ifp->if_hw_tsomax = 65536;
>> ifp->if_hw_tsomaxsegcount = MAX_TX_REQ_FRAGS;
>> ifp->if_hw_tsomaxsegsize = PAGE_SIZE;
>>
>> And see if the performance is the same as before?
>

Hi Roger,

> Yes, performance seems to be fine after setting if_hw_tsomax to 65536.
> Is there some documentation about the usage of if_hw_tsomax? Does the
> network subsystem already take care of subtracting the space for ether
> header and the vlan encapsulation, so it's no longer needed to specify
> them in if_hw_tsomax?

In the past only the TCP and IP layers were accounted for by the TSO 
parameters. At present all layers are accounted for. This might fit 
the kind of adapter you are using better, because it appears to me it is 
DMA'ing all of the mbuf chain. Some other network adapters only DMA the 
TCP payload data and copy the ETH/TCP/IP headers into a special DMA'able 
memory area.
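
As a rough illustration of the limits discussed in this thread, the
sketch below checks a frame against a byte limit, a segment-count
limit and a segment-size limit, and shows how much room is left for
the IP packet when the link-layer header is counted against the byte
limit. It is not the kernel's code: struct tso_limits, segs_needed(),
fits() and tcpip_budget() are hypothetical names, and the SKETCH_*
constants (4K pages, 65536 / PAGE_SIZE + 2 fragments) are assumed
values chosen only for the example.

```c
#include <stdbool.h>

/* Assumed values for illustration; the real ones come from kernel headers. */
#define SKETCH_PAGE_SIZE        4096u
#define SKETCH_ETHER_HDR_LEN    14u
#define SKETCH_VLAN_ENCAP_LEN   4u
/* Hypothetical stand-in for the driver's MAX_TX_REQ_FRAGS. */
#define SKETCH_MAX_TX_REQ_FRAGS (65536u / SKETCH_PAGE_SIZE + 2u)

/* Hypothetical container for the three limits a driver advertises. */
struct tso_limits {
	unsigned max_bytes;    /* role of if_hw_tsomax */
	unsigned max_segs;     /* role of if_hw_tsomaxsegcount */
	unsigned max_seg_size; /* role of if_hw_tsomaxsegsize */
};

/* Number of fixed-size buffers needed to hold len bytes (rounds up). */
static unsigned
segs_needed(unsigned len, unsigned buf_size)
{
	return ((len + buf_size - 1) / buf_size);
}

/*
 * True if a frame of frame_len bytes (all layers included), stored in
 * buffers of buf_size bytes, satisfies all three limits at once.
 */
static bool
fits(const struct tso_limits *l, unsigned frame_len, unsigned buf_size)
{
	return (frame_len <= l->max_bytes &&
	    buf_size <= l->max_seg_size &&
	    segs_needed(frame_len, buf_size) <= l->max_segs);
}

/*
 * Bytes left for the IP packet when the link-layer header is counted
 * against max_bytes. Assumption: this models "all layers are accounted
 * for"; it is not taken from the network stack sources.
 */
static unsigned
tcpip_budget(const struct tso_limits *l, unsigned link_hdr_len)
{
	return (link_hdr_len >= l->max_bytes ? 0 :
	    l->max_bytes - link_hdr_len);
}
```

With these assumed values, a 65536-byte frame held in 4K buffers needs
16 segments and fits under the 18-segment limit, while the same frame
in 2K buffers needs 32 segments and does not fit, which is the kind of
situation where a driver may have to fall back to m_defrag(). If the
driver also subtracts the Ethernet and VLAN header lengths from the
byte limit, those 18 bytes would be deducted twice under this model.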

>
> Also, this commit was MFC'ed to stable/10 and 10.2 suffers from the same
> problem. Can we issue an EN to get this fixed in 10.2?

When this patch has been given some time to settle, and more people have 
tested it, I can submit a request for re@ to do that. Please remind me 
if I forget.

--HPS



