Date:      Sat, 23 Jan 2016 23:23:01 -0800
From:      Luigi Rizzo <rizzo@iet.unipi.it>
To:        Luigi Rizzo <rizzo@iet.unipi.it>, Marcus Cenzatti <cenzatti@hush.com>,  "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, Navdeep Parhar <nparhar@gmail.com>
Subject:   Re: solved: Re: Chelsio T520-SO-CR low performance (netmap tested) for RX
Message-ID:  <CA%2BhQ2%2Bg1_A-bEkoqxTSWWsok-9%2B=vZSurCTt%2Bj-EK1597ff8jw@mail.gmail.com>
In-Reply-To: <20160124064217.GB7567@ox>
References:  <CA%2BhQ2%2Bg7_haaXLFjMuG00ANsUkFdyGzFQyjT4NYVBmPY-vECBg@mail.gmail.com> <20160124042830.3D674A0128@smtp.hushmail.com> <CA%2BhQ2%2BhxOZkGJdRSrmxSqHforLbMWBVQcayrNFNLLkU803hmjA@mail.gmail.com> <20160124064217.GB7567@ox>

On Sat, Jan 23, 2016 at 10:42 PM, Navdeep Parhar <nparhar@gmail.com> wrote:
> On Sat, Jan 23, 2016 at 09:33:32PM -0800, Luigi Rizzo wrote:
>> On Sat, Jan 23, 2016 at 8:28 PM, Marcus Cenzatti <cenzatti@hush.com> wrote:
>> >
>> >
>> > On 1/24/2016 at 1:10 AM, "Luigi Rizzo" <rizzo@iet.unipi.it> wrote:
...
>> One last attempt: try using -l 64 on the sender. This will generate 64+4 byte
>> packets, which may become just 64 on the receiver if the Chelsio is configured
>> to strip the CRC. This should result in well-aligned PCIe transactions and
>> reduced PCIe traffic, which may help (the ix driver has a similar problem,
>> but since it does not strip the CRC it can rx at line rate with 60 bytes but
>> not with 64).
>
> Keep hw.cxgbe.fl_pktshift in mind for this kind of test.  The default
> value is 2 so the chip DMAs payload at an offset of 2B from the start of
> the rx buffer.  So you'll need to adjust your frame size by 2 (66B on
> the wire, 62B after CRC is removed, making it exactly 64B across PCIe if
> pktshift is 2) or just set hw.cxgbe.fl_pktshift=0 in /boot/loader.conf.

This sounds like something to fix.

In netmap, packets are supposed to be aligned with the
beginning of the buffer, so at least when operating in
netmap mode the pktshift should be set to 0.
If it is not possible to set it per-queue, I would
suggest setting it to 0 unconditionally when you compile
a netmap-enabled kernel.
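
For reference, the frame-size arithmetic in Navdeep's note can be
sketched as below. This is only a simplified model built from the numbers
quoted above (4-byte CRC stripped by the NIC, payload written at an offset
of pktshift bytes); the function names are hypothetical and nothing here
is taken from the cxgbe driver itself.

```python
CRC_LEN = 4  # Ethernet FCS in bytes


def dma_len(wire_len, pktshift=2, strip_crc=True):
    """Bytes covered in the rx buffer for one frame (illustrative model).

    Assumes the NIC writes the frame at an offset of `pktshift` bytes
    and optionally drops the 4-byte CRC first.
    """
    payload = wire_len - CRC_LEN if strip_crc else wire_len
    return pktshift + payload


def wire_len_for(target, pktshift=2, strip_crc=True):
    """Wire frame size (CRC included) that fills exactly `target` bytes."""
    payload = target - pktshift
    return payload + CRC_LEN if strip_crc else payload


print(dma_len(66, pktshift=2))       # 62B after CRC removal + 2B shift
print(dma_len(68, pktshift=0))       # -l 64 case with the shift disabled
print(wire_len_for(64, pktshift=2))  # wire size needed for a 64B transfer
```

With the default shift of 2, the model gives 66B on the wire for a 64B
transfer, matching the figures quoted above; with the shift set to 0, the
64+4 byte frames from -l 64 land on exactly 64B.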

How much of a performance boost do you see with a
shift of 2 with regular traffic? Modern CPUs seem
to be pretty good with unaligned memory accesses.

cheers
luigi

> Regards,
> Navdeep



-- 
-----------------------------------------+-------------------------------
 Prof. Luigi RIZZO, rizzo@iet.unipi.it  . Dip. di Ing. dell'Informazione
 http://www.iet.unipi.it/~luigi/        . Universita` di Pisa
 TEL      +39-050-2217533               . via Diotisalvi 2
 Mobile   +39-338-6809875               . 56122 PISA (Italy)
-----------------------------------------+-------------------------------


