From owner-freebsd-net@freebsd.org Tue Feb 4 16:20:08 2020
Date: Tue, 4 Feb 2020 19:20:05 +0300
From: Slawa Olhovchenkov <slw@zxy.spb.ru>
To: Navdeep Parhar
Cc: freebsd-net@freebsd.org
Subject: Re: Chelsio NETMAP performance
Message-ID: <20200204162005.GC8012@zxy.spb.ru>
In-Reply-To: <6868f207-d054-3d45-b60d-eaf7115760c1@FreeBSD.org>
List-Id: Networking and TCP/IP with FreeBSD

On Mon, Feb 03, 2020 at 02:39:03PM -0800,
Navdeep Parhar wrote:
> On 2/3/20 2:23 PM, Slawa Olhovchenkov wrote:
> > On Mon, Feb 03, 2020 at 01:39:52PM -0800, Navdeep Parhar wrote:
> >
> >> On 2/3/20 12:17 PM, Slawa Olhovchenkov wrote:
> >>> I am trying to use a Chelsio T540-CR in netmap mode and see poor
> >>> performance (compared to an Intel 82599ES).
> >>
> >> What approximate FreeBSD version is this?
> >
> > 12.1-STABLE
> >
> >>> The same application can receive only about 8.9Mpps, compared to
> >>> 12.5Mpps on the Intel.
> >>>
> >>> A pmc profile shows most of the time spent in:
> >>>
> >>> 49.76% [17802] service_nm_rxq @ /boot/kernel/if_cxgbe.ko
> >>>  100.0% [17802] t4_vi_intr
> >>>   100.0% [17802] ithread_loop @ /boot/kernel/kernel
> >>>    100.0% [17802] fork_exit
> >>>
> >>> to be exact, at the line
> >>>
> >>> while ((d->rsp.u.type_gen & F_RSPD_GEN) == nm_rxq->iq_gen) {
> >>>
> >>> Is this the maximum limit for this vendor?
> >>
> >> No, a T540 should be able to sink full 10Gbps (14.88Mpps) on a single rx
> >> queue. Try adding this to your loader.conf:
> >>
> >> hw.cxgbe.toecaps_allowed="0"
> >>
> >> Then try simple netmap "pkt-gen -f rx" instead of any custom app and see
> >> how many pps it's able to sink.
> >
> > Thanks! `hw.cxgbe.toecaps_allowed="0"` allows receiving 14Mpps for my
> > application too!
> >
> > Now I get only 10% less performance compared to Intel, as I see from the
> > higher Chelsio interrupt CPU time (top shows about 30% for every
> > interrupt handler). Is this normal? Is it possible to optimize?
>
> Try changing the interrupt holdoff timer for the netmap rx queues.
>
> This shows the list of timers available (in microseconds):
> # sysctl dev.t5nex.0.holdoff_timers
>
> nm_holdoff_tmr_idx is a 0-based index into the list above. So if the
> tmr idx is 0 you are using the 0th (first) value from the list of
> timers. Try increasing nm_holdoff_tmr_idx and see if that brings down
> the interrupt rate under control.
# sysctl hw.cxgbe.nm_holdoff_tmr_idx=3/4/5

OK, the interrupt rate goes down, but the interrupt time stays about the
same (the interrupt load for the Intel card is about 0, compared to 25%
for the Chelsio). Most of the time is spent in service_nm_rxq(), in the
while() check. Is it possible to do some prefetch? A trivial
`__builtin_prefetch(64+(char*)d);` in the body of the loop doesn't
change anything. Is it possible to do a batch prefetch before the loop?