From: Slawa Olhovchenkov
To: Navdeep Parhar
Cc: freebsd-net@freebsd.org
Date: Wed, 5 Feb 2020 14:38:32 +0300
Subject: Re: Chelsio NETMAP performance
Message-ID: <20200205113832.GE8012@zxy.spb.ru>
In-Reply-To: <3a8dfebd-aa26-84ad-a03a-0271b61a89a3@FreeBSD.org>

On Tue, Feb 04, 2020 at 12:37:08PM -0800, Navdeep Parhar wrote:

> >> nm_holdoff_tmr_idx is a 0-based index into the list above.  So if the
> >> tmr idx is 0 you are using the 0th (first) value from the list of
> >> timers.  Try increasing nm_holdoff_tmr_idx and see if that brings down
> >> the interrupt rate under control.
> >>
> >> # sysctl hw.cxgbe.nm_holdoff_tmr_idx=3/4/5
> >
> > OK, interrupt rate go down, but interrupt time about same.
> > (interrupt rate for intel card about 0, compared to 25% chelsio).
>
> I think iflib runs a lot of stuff in taskqueues rather than the driver
> ithread so the CPU accounting may vary.  Use dtrace to see if

I don't think this has an impact: the worker's CPU core, with no
syscalls and only the bound worker thread and the NIC IRQ handler on
it, shows about 100% user CPU time.  Maybe some cache-miss work is
performed later, at poll(2) time, in the Intel driver, whereas the
Chelsio driver does it at interrupt time?

> netmap_rx_irq is being called by an ithread or a taskqueue to figure out
> what driver does what.

Can you explain some more?  I am not sure which dtrace probe to use or
how to evaluate the results afterwards.

> Are you also transmitting a lot out of this node or is it mostly Rx?
> There's no need to worry about Tx updates (and the interrupts they might
> generate) if this is an Rx-mostly workload.

It depends on the traffic.  This is DDoS protection; in case of a
SYN flood, Tx is about the same as Rx.  In any case Tx (as far as I can
see) is significantly cheaper than Rx, by 10x at least.  But there are
nuances when both run simultaneously.

> > Most time spent in service_nm_rxq(), in while() check.
> > Is this posible to do some prefetch?
> > Trivial `__builtin_prefetch(64+(char*)d);` in body of loop don't
> > change anything.
> >
> > Is this posible to do batch prefetch before cycle?
>
> prefetches are not possible here.  That while condition is waiting for
> the ownership bit of the rx descriptor to flip, indicating there is
> work for the driver to do.

Is there no way to do some estimation?  For example, to count the
packets pending in the Rx queue?
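
To make the question concrete, here is a minimal sketch of the kind of
estimate meant above.  It assumes a generation-style ownership bit that
the device flips on every pass over the ring.  The struct, the ring
size and the function name are hypothetical, for illustration only, and
are not the cxgbe descriptor layout or driver code.  Each probe still
reads a descriptor, so the scan itself may take the same cache misses
as the while() check.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8	/* hypothetical ring size */

/* Hypothetical rx descriptor, NOT the real cxgbe layout. */
struct rx_desc {
	uint8_t  gen;	/* ownership/generation bit written by the device */
	uint16_t len;	/* payload length, unused in this sketch */
};

/*
 * Count descriptors the device has already handed over, starting at
 * cidx and looking at most `limit` entries ahead.  `gen` is the value
 * the driver expects for descriptors written in the current pass; the
 * expected value flips each time the index wraps around the ring.
 */
static unsigned
count_pending(const struct rx_desc *ring, unsigned cidx, uint8_t gen,
    unsigned limit)
{
	unsigned n = 0;

	assert(cidx < RING_SIZE && limit <= RING_SIZE);
	while (n < limit && ring[cidx].gen == gen) {
		n++;
		if (++cidx == RING_SIZE) {
			cidx = 0;
			gen ^= 1;	/* next pass uses the flipped bit */
		}
	}
	return (n);
}
```

A bounded `limit` keeps the scan short, so the result is only a lower
bound on the number of pending packets.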