Date: Tue, 15 Sep 2015 10:32:03 +0200
From: Roger Pau Monné <royger@FreeBSD.org>
To: Hans Petter Selasky <hps@selasky.org>, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r271946 - in head/sys: dev/oce dev/vmware/vmxnet3 dev/xen/netfront kern net netinet ofed/drivers/net/mlx4 sys
Message-ID: <55F7D783.1080406@FreeBSD.org>
In-Reply-To: <55F6935C.9000000@selasky.org>
References: <201409220827.s8M8RRHB031526@svn.freebsd.org> <55F69093.5050807@FreeBSD.org> <55F6935C.9000000@selasky.org>

On 14/09/15 at 11:29, Hans Petter Selasky wrote:
> On 09/14/15 11:17, Roger Pau Monné wrote:
>> On 22/09/14 at 10:27, Hans Petter Selasky wrote:
>>> Author: hselasky
>>> Date: Mon Sep 22 08:27:27 2014
>>> New Revision: 271946
>>> URL: http://svnweb.freebsd.org/changeset/base/271946
>>>
>>> Log:
>>> Improve transmit sending offload, TSO, algorithm in general.
>>>
>>> The current TSO limitation feature only takes the total number of
>>> bytes in an mbuf chain into account and does not limit by the number
>>> of mbufs in a chain. Some kinds of hardware is limited by two
>>> factors. One is the fragment length and the second is the fragment
>>> count. Both of these limits need to be taken into account when doing
>>> TSO. Else some kinds of hardware might have to drop completely valid
>>> mbuf chains because they cannot loaded into the given hardware's DMA
>>> engine. The new way of doing TSO limitation has been made backwards
>>> compatible as input from other FreeBSD developers and will use
>>> defaults for values not set.
>>>
>>> Reviewed by: adrian, rmacklem
>>> Sponsored by: Mellanox Technologies
>>
>> This commit makes xen-netfront tx performance drop from ~5Gbits/sec
>> (with debug options enabled) to 446 Mbits/sec. I'm currently looking,
>> but if anyone has ideas they are welcome.
>>
>
> Hi Roger,
>
> Looking at the netfront code you should subtract 1 from tsomaxsegcount
> prior to r287775. The reason might simply be that 2K clusters are used
> instead of 4K clusters, causing m_defrag() to be called.
>
>> ifp->if_hw_tsomax = 65536 - (ETHER_HDR_LEN +
>> ETHER_VLAN_ENCAP_LEN);
>> ifp->if_hw_tsomaxsegcount = MAX_TX_REQ_FRAGS;
>> ifp->if_hw_tsomaxsegsize = PAGE_SIZE;
>
> After r287775 can you try these settings:
>
> ifp->if_hw_tsomax = 65536;
> ifp->if_hw_tsomaxsegcount = MAX_TX_REQ_FRAGS;
> ifp->if_hw_tsomaxsegsize = PAGE_SIZE;
>
> And see if the performance is the same like before?

FWIW, just using r287775 seems to solve the problem, even if I leave
if_hw_tsomax with its current value.

Roger.
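
For anyone following along, the 2K versus 4K cluster hypothesis quoted above comes down to simple segment-count arithmetic. The sketch below is only an illustration under stated assumptions: SKETCH_MAX_FRAGS is a hypothetical stand-in for MAX_TX_REQ_FRAGS (its real value is not given in this thread), SKETCH_PAGE_SIZE assumes 4K pages, the helper names are made up, and every buffer is assumed to be completely full, which real mbuf chains need not be.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical values for illustration only; not taken from netfront. */
#define SKETCH_PAGE_SIZE	4096u	/* assumes 4K pages */
#define SKETCH_MAX_FRAGS	18u	/* stand-in for MAX_TX_REQ_FRAGS */

/* Number of fixed-size buffers needed to hold len bytes, rounded up. */
static size_t
segs_needed(size_t len, size_t seg_size)
{
	assert(seg_size != 0);
	return ((len + seg_size - 1) / seg_size);
}

/*
 * Nonzero if len bytes, stored in fully used buffers of seg_size bytes,
 * fit within max_segs segments. When this is zero, a driver would have
 * to compact the chain (for example with m_defrag()) or drop it.
 */
static int
chain_fits(size_t len, size_t seg_size, size_t max_segs)
{
	return (segs_needed(len, seg_size) <= max_segs);
}
```

Under these assumptions a 65536-byte payload needs 32 buffers of 2K but only 16 of 4K, so with a limit of 18 segments only the 4K layout fits without compacting the chain. That matches the kind of slowdown being discussed, but it is not a measurement.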