Date:      Sat, 23 Jan 2016 16:11:02 -0800
From:      Luigi Rizzo <rizzo@iet.unipi.it>
To:        Luigi Rizzo <rizzo@iet.unipi.it>, Marcus Cenzatti <cenzatti@hush.com>,  "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, Navdeep Parhar <nparhar@gmail.com>
Subject:   Re: Chelsio T520-SO-CR low performance (netmap tested) for RX
Message-ID:  <CA%2BhQ2%2BgBiBdf1QtSCmR=ikDCo1LKYcs6NaTkEnDS9LZGkHn7rQ@mail.gmail.com>
In-Reply-To: <20160123211816.GE4574@ox>
References:  <20160123053428.2091EA0121@smtp.hushmail.com> <20160123154052.GA4574@ox> <20160123171300.0F448A0121@smtp.hushmail.com> <CA%2BhQ2%2Bg4kU4LA4PexRPBv7z49ZWh-mDqdpw18SeoYaBueHyjZg@mail.gmail.com> <20160123174840.32B1DA0121@smtp.hushmail.com> <20160123183836.GB4574@ox> <CA%2BhQ2%2BiWKRhhFntjkkYyfwSmbCrKrw8BkEhUXFbbf0hG%2BQs0yA@mail.gmail.com> <20160123211816.GE4574@ox>

On Sat, Jan 23, 2016 at 1:18 PM, Navdeep Parhar <nparhar@gmail.com> wrote:
> On Sat, Jan 23, 2016 at 11:12:28AM -0800, Luigi Rizzo wrote:
>> On Sat, Jan 23, 2016 at 10:38 AM, Navdeep Parhar <nparhar@gmail.com> wrote:
>> > On Sat, Jan 23, 2016 at 03:48:39PM -0200, Marcus Cenzatti wrote:
>> > ...
>> >>
>> >> woops, my bad, yes probably we had some drop, with -S and -D now I get 1.2Mpps.
>> >
>> > Run "netstat -hdw1 -i cxl<n>" on the receiver during your test.
>>
>> Navdeep, does this give any info on the ncxl port rather
>> than the cxl port connected to the host stack ?
>
> You're right, it should have been "netstat -hdw 1 -I ncxl<n>".  In these
> kinds of experiments it might even be best to run two netstats in
> parallel on cxl and ncxl.
>
>>
>> ...
>> > Do you know if the transmitter will pad up so as not to put runts on the
>> > wire?  If not then you might want to bump up the size of the frame
>> > explicitly (there's some pkt-gen knob for this).
>> >
>>
>> ix/ixl do automatic padding, and in any case pkt-gen
>> by default generates valid packet sizes (and so it does
>> with the variable-size tests I suggested).
>>
>> Is there any parameter that controls interrupt moderation ?
>>
>> In any case we need to know the numbers when sending to the
>> ncxl MAC address as opposed to broadcast.
>>
>> I suspect one of these problems:
>>
>> - the interrupt moderation interval is too long thus limiting
>>   the rate to one ring/interval. Unlikely though, even
>>   with 1k slots, the measured 1.2 Mpps corresponds to almost
>>   1ms which is too long
>>
>> - the receiver cannot cope with the input load and somehow
>>   takes a long time to recover from congestion. If this is
>>   the case, running the sender at a lower rate might reach
>>   a peak throughput > 1.2 Mpps when the receiver can still
>>   keep up, and then drop to the low rate under congestion.
>>
>> - and of course bus errors, when the device is connected on
>>   a PCIe slot with only 1-2 data lanes.
>>   This actually happens a lot, physical connector sizes
>>   do not reflect the number of active PCIe lanes.
>
> There are no drops or PAUSE or any sign of backpressure.  The netstat
> counters show 900K incoming and 0 drops/errors, which means 900K packets
> on the wire for the port and all were delivered to the driver
> successfully.

I am not 100% convinced by the above explanation, both because
of the way the experiment was conducted (broadcast
destination, duplication in the hw to two queues,
stats only from cxl0 and not ncxl0), and because sending
to the unicast DMAC already showed a higher throughput.

So I suspect the 900k counter does not actually reflect
packets on the wire. I have no idea where and how the drops
are counted (presumably in some internal counter in the NIC,
but for broadcast traffic replicated to multiple queues
I don't know which counter tracks the drops: one per queue,
a global one, etc.).
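
For reference, the arithmetic behind the interrupt moderation
hypothesis quoted earlier can be sketched as below. This is only a
back-of-the-envelope check: it assumes "1k slots" means a 1024-slot
ring and that at most one full ring is drained per moderation
interval. The function names are made up for illustration and are
not part of netmap or the cxgbe driver.

```python
def ring_fill_time_ms(slots, rate_pps):
    """Time (ms) needed to fill a ring of `slots` entries at `rate_pps`."""
    return slots / rate_pps * 1e3


def max_rate_pps(slots, interval_s):
    """Upper bound on the packet rate if only one full ring is
    processed per moderation interval of `interval_s` seconds."""
    return slots / interval_s


if __name__ == "__main__":
    # Assumed values: 1024-slot ring, measured rate of 1.2 Mpps.
    t = ring_fill_time_ms(1024, 1.2e6)
    print(f"implied moderation interval: {t:.3f} ms")
    # Conversely, a 1 ms interval would cap the rate at about 1.02 Mpps.
    print(f"cap at 1 ms interval: {max_rate_pps(1024, 1e-3) / 1e6:.3f} Mpps")
```

Under these assumptions the implied interval is about 0.85 ms, which
is the "almost 1ms" figure, and much longer than a typical moderation
setting, so this explanation looks unlikely.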


> The mismatch in the transmitter's counter and the incoming counter can
> only be explained by
> a) Frames whose DMAC address didn't match the local interface's MAC.

The DMAC was the same for all frames.

> This can be tested by switching cxl0 and ncxl0 to promisc mode to see if
> that opens the flood gates.
> b) Frames mangled badly enough to be discarded.  But these should show
> as an error or drop in at least one of these:
>
> sysctl dev.cxl.<n>.stats
> sysctl -n dev.t5nex.0.misc.tp_err_stats

One question: do the sysctls for cxl.<n> and t5nex.0
also report the ncxl* stats, or do you have a separate entry
for those?

cheers
luigi


