Date:      Wed, 25 Jan 2017 03:09:53 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Sean Bruno <sbruno@freebsd.org>
Cc:        Olivier Cochard-Labbé <olivier@freebsd.org>,  "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>,  "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>,  Matthew Macy <mmacy@nextbsd.org>
Subject:   Re: HEADS-UP: IFLIB implementations of sys/dev/e1000 em, lem, igb pending
Message-ID:  <20170125023853.Q1809@besplex.bde.org>
In-Reply-To: <b2714d3a-75f7-4959-e390-8ade11d77962@freebsd.org>
References:  <30f21c75-d3a2-edcd-1999-d5ed9f970c06@freebsd.org> <574a7ac7-4842-9518-8286-a4d89a9f7a27@freebsd.org> <CA%2Bq%2BTco-dcoU8EZnDEzgoK-v2Q2=U5GF6ASMSj0kwzd_wB5xig@mail.gmail.com> <6c6cb534-73c7-464b-8af1-7445a9c0188c@freebsd.org> <1598f29d379.ea6360351471.8752933472741761813@nextbsd.org> <CA%2Bq%2BTcpUXXPEQtdMFup6EZzyCKs9Ep%2BnS5SB%2Bfm6bSJSDs34_w@mail.gmail.com> <1598f3f8588.d20017893749.339651164872952258@nextbsd.org> <1598f42ad77.eeec05be4113.9201780237587761460@nextbsd.org> <CA%2Bq%2BTcp5LwrnXt75tNpYpAr1KWx9YpLx5kMHhPR%2BYgAs__n1eA@mail.gmail.com> <159902b73ed.10775291e21533.7488368455500235608@nextbsd.org> <CA%2Bq%2BTcpHmuOGyp5A290WmUvGTnOSse7v8gj4=R8kZ=m51-_s4A@mail.gmail.com> <18abdd64-08a6-50ca-fb6b-9c01a3d7b60c@freebsd.org> <CA%2Bq%2BTcptEN5pcScYo4j3O8OuJHEacZu9ugOz_6b2iFb-CzBFXA@mail.gmail.com> <ad7fdc31-b0dd-2105-1610-cf0f3de42245@freebsd.org> <CA%2Bq%2BTcpcZUJTmycPBF9kS-x4qcqN7V3LdHW=BQ3ttzXLyU3FWw@mail.gmail.com> <b2714d3a-75f7-4959-e390-8ade11d77962@freebsd.org>

On Tue, 24 Jan 2017, Sean Bruno wrote:

> On 01/24/17 08:27, Olivier Cochard-Labbé wrote:
>> On Tue, Jan 24, 2017 at 3:17 PM, Sean Bruno <sbruno@freebsd.org
>> <mailto:sbruno@freebsd.org>> wrote:
>>
>>     Did you increase the number of rx/tx rings to 8 and the number of
>>     descriptors to 4k in your tests or just the defaults?
>>
>> Tuning are same as described in my previous email (rxd|txd=2048, rx|tx
>> process_limit=-1, max_interrupt_rate=16000).
>> [root@apu2]~# sysctl hw.igb.
>> hw.igb.tx_process_limit: -1
>> hw.igb.rx_process_limit: -1
>> hw.igb.num_queues: 0
>> hw.igb.header_split: 0
>> hw.igb.max_interrupt_rate: 16000
>> hw.igb.enable_msix: 1
>> hw.igb.enable_aim: 1
>> hw.igb.txd: 2048
>> hw.igb.rxd: 2048
>
> Oh, I think you missed my note on these.  In order to adjust txd/rxd you
> need to tweak the iflib version of these numbers.  nrxds/ntxds should be
> adjusted upwards to your value of 2048.  nrxqs/ntxqs should be adjusted
> upwards to 8, I think, so you can test equivalent settings to the legacy
> driver.
>
> Specifically, you may want to adjust these:
>
> dev.em.0.iflib.override_nrxds: 0
> dev.em.0.iflib.override_ntxds: 0
>
> dev.em.0.iflib.override_nrxqs: 0
> dev.em.0.iflib.override_ntxqs: 0

That is painful.

My hack to increase the ifq length also no longer works:

X Index: if_em.c
X ===================================================================
X --- if_em.c	(revision 312696)
X +++ if_em.c	(working copy)
X @@ -1,3 +1,5 @@
X +int em_qlenadj = -1;
X +

-1 gives a null adjustment; 0 gives a default (very large ifq), and other
values give a non-null adjustment.

X  /*-
X   * Copyright (c) 2016 Matt Macy <mmacy@nextbsd.org>
X   * All rights reserved.
X @@ -2488,7 +2490,10 @@
X 
X  	/* Single Queue */
X          if (adapter->tx_num_queues == 1) {
X -	  if_setsendqlen(ifp, scctx->isc_ntxd[0] - 1);
X +	  if (em_qlenadj == 0)
X +	    em_qlenadj = imax(2 * tick, 0) * 15 / 10;
X +	    // lem_qlenadj = imax(2 * tick, 0) * 42 / 100;
X +	  if_setsendqlen(ifp, scctx->isc_ntxd[0] + em_qlenadj);
X  	  if_setsendqready(ifp);
X  	}
X
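
The arithmetic in the patch can be checked in userland with a small
sketch.  This is not driver code: sendq_len() and imax_int() are
hypothetical names, and the kernel's tick variable is passed in as a
parameter (the value 1000 used in the note below is only an assumed
example).

```c
#include <assert.h>

/* Larger of two ints; stands in for the kernel's imax(). */
static int
imax_int(int a, int b)
{
	return (a > b ? a : b);
}

/*
 * Send queue length that the patched code would request.
 *   ntxd:    tx descriptor count (isc_ntxd[0] in the patch)
 *   qlenadj: -1 = null adjustment, 0 = use the default, else added as-is
 *   tick:    stand-in for the kernel's tick variable (assumption: passed
 *            in so that the sketch runs outside the kernel)
 */
int
sendq_len(int ntxd, int qlenadj, int tick)
{
	assert(ntxd > 0);
	if (qlenadj == 0)
		qlenadj = imax_int(2 * tick, 0) * 15 / 10;
	return (ntxd + qlenadj);
}
```

With ntxd = 2048 and tick = 1000, an adjustment of -1 gives 2047, which
is the same as the unpatched isc_ntxd[0] - 1, and an adjustment of 0
gives 2048 + 3000 = 5048.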

I don't want larger hardware queues, but sometimes want larger software
queues.  ifq's used to give them.  The if_setsendqlen() call is still there,
but no longer gives them.

The large queues are needed for packet blasting benchmarks.  select()
doesn't work for UDP sockets, so if the queues fill up then the benchmarks
must busy-wait or sleep waiting for them to drain.  Timeout granularity
tends to prevent short sleeps from working, so the queues run dry while
sleeping unless the queues are very large.
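
A minimal sketch of the kind of send loop such a benchmark ends up with,
assuming a connected datagram socket and that a full queue is reported
as ENOBUFS from send().  The function name blast() and its parameters
are hypothetical; this is not taken from any particular benchmark.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <time.h>

/*
 * Send `count` copies of buf on a connected datagram socket.
 * When the queue is full (ENOBUFS), sleep backoff_us microseconds and
 * retry the same packet; backoff_us == 0 busy-waits instead.
 * Returns the number of packets sent, or -1 on any other error.
 */
long
blast(int fd, const void *buf, size_t len, long count, unsigned backoff_us)
{
	long sent = 0;

	assert(fd >= 0 && buf != NULL);
	while (sent < count) {
		if (send(fd, buf, len, 0) >= 0) {
			sent++;
			continue;
		}
		if (errno == EINTR)
			continue;		/* interrupted, just retry */
		if (errno != ENOBUFS)
			return (-1);
		if (backoff_us > 0) {
			struct timespec ts;

			ts.tv_sec = backoff_us / 1000000;
			ts.tv_nsec = (long)(backoff_us % 1000000) * 1000;
			nanosleep(&ts, NULL);	/* may oversleep by a tick */
		}
	}
	return (sent);
}
```

The sleep is where the timeout granularity bites: if the shortest
possible sleep is longer than the time the queue takes to drain, the
sender idles on an empty queue unless the queue is long enough to cover
the sleep.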

Bruce

Date:      Tue, 24 Jan 2017 08:41:25 -0800
From:      Adrian Chadd <adrian.chadd@gmail.com>
To:        Kevin Bowling <kevin.bowling@kev009.com>
Cc:        FreeBSD Net <freebsd-net@freebsd.org>, Scott Long <scottl@netflix.com>, "Joyner, Eric" <eric.joyner@intel.com>, Drew Gallatin <gallatin@netflix.com>, Oded Shanoon <odeds@mellanox.com>, Matthew Macy <mmacy@nextbsd.org>, hps@freebsd.org, "Cramer, Jeb J" <jeb.j.cramer@intel.com>, George Neville-Neil <gnn@freebsd.org>, arybchik@freebsd.org, shurd@freebsd.org, Navdeep Parhar <navdeep@chelsio.com>
Subject:   Re: RFC: ethctl
Message-ID:  <CAJ-Vmok=NB_PrTOuJcKLobd2W9VUEELG=r2WD5+Ey_Tgp0rd0Q@mail.gmail.com>
In-Reply-To: <CAK7dMtCSRu6L67A46FU0eYqJ9=zGdyJPvBXLkL7gKbh=SbZ6cQ@mail.gmail.com>

On 19 January 2017 at 19:58, Kevin Bowling <kevin.bowling@kev009.com> wrote:
> Greetings,
>
> I'm casting a wide net in cc, try to keep the noise minimal but we need
> some input from a variety of HW vendors.
>
> I have heard from several vendors about the need for a NIC configuration
> tool.  Chelsio ships a cxgb/cxgbetool in FreeBSD as one example.  There is
> precedent for some nod toward a basic unified tool in Linux ethtool.
>
> From your perspective,
> 1) What are the common requirements?
> 2) What are specialized requirements? For instance as a full TCP offload
> card Chelsio needs things others won't
> 3) What should it _not_ do?  Several of you have experience doing Ethernet
> driver dev on many platforms so we should attempt to avoid repeating past
> design mistakes.
>
> I expect we can achieve some level of inversion so the device specific code
> can live close to the driver and plug into the ethctl framework.  It should
> be general enough to add completely new top level commands, so vendors can
> implement HW specific features.  On the other hand, we should attempt to
> hook into common core for features every NIC provides, with a focus on
> iflib.
>
> I will fund Matt Macy to do the overall design and implementation.
>
> Regards,
> Kevin Bowling, on behalf of Limelight Networks for this effort

Hi,

ethtool isn't exactly complicated. It's just vaguely standardized.

When I was hoping (hah!) to do it, partly for wifi but partly for
ethernet, my goals were:

* generic interface for counter statistics, versus sysctl;
* "default" mib space for known counter statistics, versus the small
set we have now and then the very large vendor space that ethtool
(linux) and sysctl (freebsd) expose;
* generic interface for configuring things like RSS mapping, L2/L3
rules for a NIC based on the NIC capability and what queue(s) they map
to, versus vendor tools;
* vendor specific extensions, which hopefully (!) are implemented as
generic-ish plugins where required, or just strings that can be passed
through to the driver and registered via a command hook;
* and importantly - some kind of optional debug control, because every
driver does something different for debugging / tracing, and boy it'd
be nice if it all was like wifi (wlandebug -i <interface>
+/-(feature))
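
As a rough illustration of the "registered via a command hook" idea, a
command table might look like the sketch below.  Everything here is
hypothetical (ethctl_register(), ethctl_dispatch(), the handler
signature, and the example command are all made up for illustration);
no such API exists in FreeBSD today.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_HOOKS 16

/* A handler gets the interface name and the raw argument string. */
typedef int (*ethctl_hook_fn)(const char *ifname, const char *args);

struct ethctl_hook {
	const char	*name;
	ethctl_hook_fn	 fn;
};

static struct ethctl_hook hooks[MAX_HOOKS];
static size_t nhooks;

/* Register a command; returns 0, or -1 if the table is full or the name is taken. */
int
ethctl_register(const char *name, ethctl_hook_fn fn)
{
	assert(name != NULL && fn != NULL);
	if (nhooks == MAX_HOOKS)
		return (-1);
	for (size_t i = 0; i < nhooks; i++)
		if (strcmp(hooks[i].name, name) == 0)
			return (-1);
	hooks[nhooks].name = name;
	hooks[nhooks].fn = fn;
	nhooks++;
	return (0);
}

/* Run a command; returns the handler's result (>= 0), or -1 if unknown. */
int
ethctl_dispatch(const char *cmd, const char *ifname, const char *args)
{
	for (size_t i = 0; i < nhooks; i++)
		if (strcmp(hooks[i].name, cmd) == 0)
			return (hooks[i].fn(ifname, args));
	return (-1);
}

/* Example vendor hook: the argument string is passed through untouched. */
int
example_vendor_dump(const char *ifname, const char *args)
{
	(void)ifname;
	return ((int)strlen(args));
}
```

A driver would call ethctl_register("vendor-dump", example_vendor_dump)
once, and the tool would then hand "vendor-dump" plus the rest of the
command line to ethctl_dispatch() without needing to understand the
arguments itself.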

A lot of what's in linux ethtool is in freebsd's ifconfig, albeit not
in a reusable/runtime-extensible fashion. It'd be nice to include say,
many more vendor counters in a somewhat generic fashion, versus how we
currently do things.

I think it'd be a good starting point to have an ethtool control iflib
drivers, so any driver using iflib can benefit from what's
implemented.

2c,

-adrian


