From owner-freebsd-net@freebsd.org  Tue Aug 11 22:01:43 2015
MIME-Version: 1.0
In-Reply-To: <CAH7qZftMB34PM1CqNhdg7AWhsq6YknUDgc60ASfT2Z0L1z8XCQ@mail.gmail.com>
References: <CAH7qZftMB34PM1CqNhdg7AWhsq6YknUDgc60ASfT2Z0L1z8XCQ@mail.gmail.com>
Date: Tue, 11 Aug 2015 15:01:42 -0700
Message-ID: <CAJ-Vmo=7XzE0SYfG__Y7qee9jZ1qKOOuNPY2TFPJfD2-06Mk5g@mail.gmail.com>
Subject: Re: Poor high-PPS performance of the 10G ixgbe(9) NIC/driver in
 FreeBSD 10.1
From: Adrian Chadd <adrian.chadd@gmail.com>
To: Maxim Sobolev <sobomax@freebsd.org>
Cc: FreeBSD Net <freebsd-net@freebsd.org>, freebsd@intel.com, 
 Jev Björsell <jev@sippysoft.com>
Content-Type: text/plain; charset=UTF-8

Hi,

Are you able to graph per-queue interrupt rates?

It looks like the traffic is distributed differently (the first two
queues are taking interrupts).
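
One way to get per-queue numbers is to sample `vmstat -i` twice and diff
the totals. Below is a minimal Python sketch, not a tested monitoring
tool. It assumes the queue vectors show up as "irqN: ixM:que K" lines,
which may differ between driver versions. The sample text and the helper
names (parse_queue_totals, queue_rates) are made up for illustration.

```python
import re

# Assumed line shape from `vmstat -i`: "irq264: ix0:que 0   <total>   <rate>"
QUEUE_LINE = re.compile(r"^irq\d+:\s+(ix\d+):que\s+(\d+)\s+(\d+)\s+\d+\s*$")


def parse_queue_totals(vmstat_text):
    """Return {(nic, queue): total_interrupts} for every ix queue line."""
    totals = {}
    for line in vmstat_text.splitlines():
        m = QUEUE_LINE.match(line.strip())
        if m:
            totals[(m.group(1), int(m.group(2)))] = int(m.group(3))
    return totals


def queue_rates(before, after, interval_s):
    """Interrupts per second per queue between two samples."""
    return {key: (after[key] - before[key]) / interval_s
            for key in after if key in before}


# Made-up sample text, for illustration only (not measured data).
SAMPLE_T0 = """\
interrupt                          total       rate
irq264: ix0:que 0                   1000         10
irq265: ix0:que 1                   2000         20
irq266: ix0:link                       2          0
"""
SAMPLE_T1 = """\
interrupt                          total       rate
irq264: ix0:que 0                   6000         10
irq265: ix0:que 1                   2500         20
irq266: ix0:link                       2          0
"""

rates = queue_rates(parse_queue_totals(SAMPLE_T0),
                    parse_queue_totals(SAMPLE_T1), interval_s=5)
for (nic, queue), rate in sorted(rates.items()):
    print(f"{nic} queue {queue}: {rate:.0f} irq/s")
```

In practice the two samples would come from running `vmstat -i` a few
seconds apart. Feeding the resulting rates into whatever graphing you
already have should show whether only a couple of queues carry the load.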

Does 10.1 have the flow director code disabled? I remember there was
some .. interesting behaviour with ixgbe where it'd look at traffic
and set up flow director rules to try and "balance" things. It was
buggy and programmed the hardware badly, so we disabled it in at least
-HEAD.


-adrian


On 11 August 2015 at 14:18, Maxim Sobolev <sobomax@freebsd.org> wrote:
> Hi folks,
>
> We've been trying to migrate some of our high-PPS systems to new hardware
> that has four X540-AT2 10G NICs, and observed that interrupt time goes
> through the roof once we cross around 200K PPS in and 200K out (two ports
> in LACP). The previous hardware was stable up to about 350K PPS in and
> 350K out. I believe the old one was equipped with the I350 and had an
> identical LACP configuration. The new box also has a better CPU with more
> cores (i.e. 24 cores vs. 16 cores before). The CPU itself is 2 x E5-2690 v3.
>
> After hitting this limit with the default settings, I tried tweaking the
> following settings:
>
> hw.ix.rx_process_limit="-1"
> hw.ix.tx_process_limit="-1"
> hw.ix.enable_aim="0"
> hw.ix.max_interrupt_rate="-1"
> hw.ix.rxd="4096"
> hw.ix.txd="4096"
>
> dev.ix.0.fc=0
> dev.ix.1.fc=0
> dev.ix.2.fc=0
> dev.ix.3.fc=0
>
> hw.intr_storm_threshold=0
>
> But there is little or no effect on the performance. The workload is just
> a lot of small UDP packets being relayed between a bunch of hosts. The
> symptoms are always the same: the box runs nice and cool until it hits the
> said PPS threshold, with the kernel spending just a few percent in
> interrupts, and then it jumps straight to 100% interrupt time, thereby
> scaring some traffic away due to packet loss and such, so that the load
> drops and the system goes into the "cool" state again. It looks very much
> like some contention in the driver or in the hardware. Linked are some
> monitoring screenshots displaying the issue unfolding as well as systat -vm
> screenshots from the "cool" state.
>
> http://sobomax.sippysoft.com/ScreenShot387.png <- CPU utilization right
> before the "bang event"
> http://sobomax.sippysoft.com/ScreenShot382.png <- issue itself
> http://sobomax.sippysoft.com/ScreenShot385.png <- systat -vm a few minutes
> after traffic declined somewhat
>
> We are now trying to get the customer to install a 1Gig NIC so that we can
> run it and compare performance, with the rest of the hardware and software
> being essentially the same.
>
> Any ideas on how to improve/resolve this problem are welcome. Thanks!