Date:      Fri, 24 Mar 2017 16:39:11 -0700
From:      Navdeep Parhar <np@FreeBSD.org>
To:        "Caraballo-vega, Jordan A. (GSFC-6062)[COMPUTER SCIENCE CORP]" <jordancaraballo87@gmail.com>
Cc:        slw@zxy.spb.ru, freebsd-net@freebsd.org, John Jasen <jjasen@gmail.com>
Subject:   Re: bad throughput performance on multiple systems: Re: Fwd: Re: Disappointing packets-per-second performance results on a Dell PE R530
Message-ID:  <0a4e3073-bf5f-9bf8-533f-bd9ec3c0f60c@FreeBSD.org>
In-Reply-To: <dbfa3693-3306-8bf8-1e9a-b305f8007239@gmail.com>
References:  <20170312231826.GV15630@zxy.spb.ru> <74654520-b8b6-6118-2e46-902a8ea107ac@gmail.com> <CAPFoGT9k4HfDCQ7wJPDFMTrJTtDyc9uK_ma9ubneDhVSsS-jcA@mail.gmail.com> <173fffac-7ae2-786a-66c0-e9cd7ab78f44@gmail.com> <CAPFoGT-BAMpj34wtB06dxMKk+87OEOs5-qu+RLVz=aPrhX6hDA@mail.gmail.com> <CAACLuR29xQhDWATRheBaOU2vtiYp61JgDKHaXum+U32MBDLBzw@mail.gmail.com> <20170317100814.GN70430@zxy.spb.ru> <9924b2d5-4a72-579c-96c6-4dbdacc07c95@gmail.com> <CAPFoGT8C1+ZYDSFVGjCuT8v5+=izfXgHXEn5P2dqxAq6N6ypCA@mail.gmail.com> <9694e9f2-daec-924d-e9f6-7b22a634acb5@gmail.com> <20170318052837.GA21730@ox> <dbfa3693-3306-8bf8-1e9a-b305f8007239@gmail.com>

On 03/24/2017 16:07, Caraballo-vega, Jordan A. (GSFC-6062)[COMPUTER 
SCIENCE CORP] wrote:
> When we use the vcxl* interfaces we get very bad results.

You're probably not using netmap with the vcxl interfaces, and the 
number of "normal" tx and rx queues is just 2 for these interfaces.

Even if you _are_ using netmap, the hw.cxgbe.nnmtxq10g/rxq10g tunables 
don't work anymore.  Use these to control the number of queues for netmap:
hw.cxgbe.nnmtxq_vi
hw.cxgbe.nnmrxq_vi
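For example, in /boot/loader.conf (the queue counts below are only 
illustrative, size them for your traffic):
hw.cxgbe.nnmtxq_vi=8
hw.cxgbe.nnmrxq_vi=8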

You should see a line like this in dmesg for every cxl/vcxl interface, 
and it tells you exactly how many queues the driver configured:
cxl0: 4 txq, 4 rxq (NIC); 4 txq, 2 rxq (TOE)
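
A quick way to confirm this on a running box (the grep pattern here is 
just an example):
dmesg | grep -E '^v?cxl[0-9]'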

> 
>              input        (Total)           output
>     packets  errs idrops      bytes    packets  errs      bytes colls drops
>        629k  4.5k     0        66M       629k     0        66M     0     0
>        701k  5.0k     0        74M       701k     0        74M     0     0
>        668k  4.8k     0        70M       668k     0        70M     0     0
>        667k  4.8k     0        70M       667k     0        70M     0     0
>        645k  4.5k     0        68M       645k     0        68M     0     0
>        686k  4.9k     0        72M       686k     0        72M     0     0
> 
> And by using just the cxl* interfaces we were getting about
> 
>              input        (Total)           output
>     packets  errs idrops      bytes    packets  errs      bytes colls drops
>        2.8M     0  1.2M       294M       1.6M     0       171M     0     0
>        2.8M     0  1.2M       294M       1.6M     0       171M     0     0
>        2.8M     0  1.2M       294M       1.6M     0       171M     0     0
>        2.8M     0  1.2M       295M       1.6M     0       172M     0     0
>        2.8M     0  1.2M       295M       1.6M     0       171M     0     0
> 
> These are our configurations for now. Any advice or suggestion will be
> appreciated.

What I don't understand is that you have PAUSE disabled and congestion 
drops enabled, but the number of packets coming in (whether they are 
dropped eventually or not is irrelevant here) is still very low in your 
experiments.  It's almost as if the senders are backing off in the face 
of packet loss.  Are you using TCP or UDP?  Always use UDP for pps 
testing -- the senders need to be relentless.
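
For pps testing the easiest relentless sender is probably netmap's 
pkt-gen; something along these lines on the sending hosts (the 
interface, addresses, and MAC below are placeholders for your setup):
pkt-gen -i ix0 -f tx -l 60 -s 172.16.2.2 -d 172.16.1.2 -D <router-mac>
iperf3 also works if you force UDP and remove the rate cap, e.g. 
"iperf3 -c 172.16.1.2 -u -b 0 -l 64", but pkt-gen will push far more 
packets per second.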

Regards,
Navdeep

> 
> /etc/rc.conf configurations
> 
> ifconfig_cxl0="up"
> ifconfig_cxl1="up"
> ifconfig_vcxl0="inet 172.16.2.1/24 -tso -lro mtu 9000"
> ifconfig_vcxl1="inet 172.16.1.1/24 -tso -lro mtu 9000"
> gateway_enable="YES"
> 
> /boot/loader.conf configurations
> 
> # Chelsio Modules
> t4fw_cfg_load="YES"
> t5fw_cfg_load="YES"
> if_cxgbe_load="YES"
> 
> # rx and tx size
> dev.cxl.0.qsize_txq=8192
> dev.cxl.0.qsize_rxq=8192
> dev.cxl.1.qsize_txq=8192
> dev.cxl.1.qsize_rxq=8192
> 
> # drop toecaps to increase queues
> dev.t5nex.0.toecaps=0
> dev.t5nex.0.rdmacaps=0
> dev.t5nex.0.iscsicaps=0
> dev.t5nex.0.fcoecaps=0
> 
> # Controls the hardware response to congestion.  -1 disables
> # congestion feedback and is not recommended.  0 instructs the
> # hardware to backpressure its pipeline on congestion.  This
> # usually results in the port emitting PAUSE frames.  1 instructs
> # the hardware to drop frames destined for congested queues. From cxgbe
> dev.t5nex.0.cong_drop=1
> 
> # Saw these recommendations in Vicenzo's email thread
> hw.cxgbe.num_vis=2
> hw.cxgbe.fl_pktshift=0
> hw.cxgbe.toecaps_allowed=0
> hw.cxgbe.nnmtxq10g=8
> hw.cxgbe.nnmrxq10g=8
> 
> /etc/sysctl.conf configurations
> 
> # Turning off pauses
> dev.cxl.0.pause_settings=0
> dev.cxl.1.pause_settings=0
> # John Jasen suggestion - March 24, 2017
> net.isr.bindthreads=0
> net.isr.maxthreads=24
> 
> 
> On 3/18/17 1:28 AM, Navdeep Parhar wrote:
>> On Fri, Mar 17, 2017 at 11:43:32PM -0400, John Jasen wrote:
>>> On 03/17/2017 03:32 PM, Navdeep Parhar wrote:
>>>
>>>> On Fri, Mar 17, 2017 at 12:21 PM, John Jasen <jjasen@gmail.com> wrote:
>>>>> Yes.
>>>>> We were hopeful, initially, to be able to achieve higher packet
>>>>> forwarding rates through either netmap-fwd or due to enhancements based
>>>>> off https://wiki.freebsd.org/ProjectsRoutingProposal
>>>> Have you tried netmap-fwd?  I'd be interested in how that did in your tests.
>>> We have. On this particular box, (11-STABLE, netmap-fwd fresh from git)
>>> it took about 1.7m pps in, dropped 500k, and passed about 800k.
>>>
>>> I'm led to believe that vcxl interfaces may yield better results?
>> Yes, those are the ones with native netmap support.  Any netmap-based
>> application should use the vcxl interfaces.  If you ran netmap on the
>> main cxl interfaces, you were running it in emulated mode.
>>
>> Regards,
>> Navdeep
> 



