From: Navdeep Parhar
To: John Jasem
Cc: FreeBSD Net
Subject: Re: tuning routing using cxgbe and T580-CR cards?
Date: Sun, 13 Jul 2014 16:11:40 -0700

On Fri, Jul 11, 2014 at 08:58:21PM -0400, John Jasem wrote:
> On 07/11/2014 03:32 PM, Navdeep Parhar wrote:
> > On 07/11/14 10:28, John Jasem wrote:
> >> In testing two Chelsio T580-CR dual-port cards with FreeBSD
> >> 10-STABLE, I've been able to use a collection of clients to
> >> generate approximately 1.5-1.6 million TCP packets per second
> >> sustained, and routinely hit 10GB/s, both measured by
> >> netstat -d -b -w1 -W (I usually use -h for the quick read,
> >> accepting the loss of granularity).
> >
> > When forwarding, the pps rate is often more interesting, and almost
> > always the limiting factor, compared to the total amount of data
> > being passed around. 10GB/s at this pps probably means a 9000-byte
> > MTU. Try with 1500 too if possible.
>
> Yes, I am generally more interested in, and concerned with, the pps.
> Using 1500-byte packets, I've seen around 2 million pps. I'll have
> hard numbers for the list, with netstat and vmstat output, on Monday.
> Thanks!
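For context, some back-of-the-envelope line-rate math (illustrative
numbers, not measurements): a 1500-byte MTU occupies 1538 bytes on the
wire (MTU + 14B Ethernet header + 4B FCS + 8B preamble + 12B
inter-frame gap), and a minimum-size 64-byte frame occupies 84, so for
a single 40Gb/s port:

  40e9 / (1538 * 8) ~= 3.25 Mpps  (1500-byte MTU)
  40e9 / (  84 * 8) ~= 59.5 Mpps  (64-byte frames)

The ~2 Mpps you're seeing at 1500 is well below what even one port can
carry, which is why smaller packets are worth trying next.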
If possible, please try with even lower packet sizes (128B, 512B,
whatever your clients are good at). You may have to disable Nagle (the
TCP_NODELAY socket option) on your clients to get small TCP packets out
of them, or you could just switch to UDP for pps testing.

If all your incoming traffic is received on a single port, then try
setting hw.cxgbe.nrxq10g=12 in /boot/loader.conf. (You mentioned
elsewhere that this is a system with 12 real cores.)

Regards,
Navdeep

> >> a) One of the first things I did in prior testing was to turn
> >> hyperthreading off. I presume this is still prudent, as HT doesn't
> >> help with interrupt handling?
> >
> > It is always worthwhile to try your workload with and without
> > hyperthreading.
>
> When testing Mellanox cards, HT was severely detrimental. However, in
> almost every case so far, Mellanox and Chelsio have led to opposite
> conclusions (cpufreq, net.isr.*).
>
> >> c) The defaults for the cxgbe driver appear to be 8 rx queues and N
> >> tx queues, with N being the number of CPUs detected. For a system
> >> running multiple cards, routing or firewalling, does this make
> >> sense, or would balancing tx and rx be more ideal? And would
> >> reducing queues per card based on NUMBER-CPUS and NUM-CHELSIO-PORTS
> >> make sense at all?
> >
> > The defaults are nrxq = min(8, ncores) and ntxq = min(16, ncores);
> > the man page mentions this. The reason for 8 vs. 16 is that tx
> > queues are "cheaper", as they don't have to be backed by rx buffers:
> > a tx queue needs only some memory for its descriptor ring and some
> > hardware resources.
> >
> > It appears that your system has >= 16 cores. For forwarding it
> > probably makes sense to have nrxq = ntxq. If you're left with 8 or
> > fewer cores after disabling hyperthreading, you'll automatically get
> > 8 rx and 8 tx queues. Otherwise you'll have to fiddle with the
> > hw.cxgbe.nrxq10g and ntxq10g tunables (documented in the man page).
>
> I promise I did look through the man page before posting. :) This is
> actually a 12-core box with HT turned off.
>
> Mining the cxl stat entries in sysctl, it appears that the queues per
> port are reasonably well balanced, so I may be concerned over nothing.
>
> >> g) Are there other settings I should be looking at that might
> >> squeeze out a few more packets?
> >
> > The pps rates that you've observed are at least an order of
> > magnitude below the chip's hardware limits. Tuning the kernel rather
> > than the driver may be the best bang for your buck.
>
> If I am missing obvious configurations for kernel tuning in this
> regard, it would not be the first time.
>
> Thanks again!
>
> -- John Jasen (jjasen@gmail.com)
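P.S. To make the loader.conf suggestion concrete, here is a minimal
sketch for a 12-core box (untested on your exact setup; the ntxq10g
line just applies the nrxq = ntxq idea from above, and both tunables
are documented in cxgbe(4)):

  # /boot/loader.conf
  hw.cxgbe.nrxq10g=12
  hw.cxgbe.ntxq10g=12

These are loader tunables, so they only take effect on the next reboot.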