From owner-freebsd-net@freebsd.org Sat Oct 19 16:35:28 2019
From: Michael Tuexen <michael.tuexen@lurchi.franken.de>
Subject: Re: Network anomalies after update from 11.2 STABLE to 12.1 STABLE
Date: Sat, 19 Oct 2019 18:35:20 +0200
To: Paul
Cc: freebsd-net@freebsd.org, freebsd-stable@freebsd.org
In-Reply-To: <1571499556.409350000.a1ewtyar@frv39.fwdcdn.com>

> On 19. Oct 2019, at 18:09, Paul wrote:
>
> Hi Michael,
>
> Thank you for taking the time!
>
> We use physical machines. We do not have any special `pf` rules.
> Both sides ran `pfctl -d` before testing.

Hi Paul,

OK. How are the physical machines connected to each other? What happens
when you don't use a lagg interface, but the physical ones? (Trying to
localise the problem...)
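For example (just a sketch, untested; assuming the test address can be
moved straight onto one of the ports), you could take lagg0 out of the
picture like this:

    ifconfig lagg0 destroy
    ifconfig ixl0 inet 10.10.10.92/24 up

and then rerun the same wrk test against the bare physical interface.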
Best regards
Michael

>
>
> `nginx` config is primitive, no secrets there:
>
> -------------------------------------------------------------------
> user www;
> worker_processes auto;
>
> error_log /var/log/nginx/error.log warn;
>
> events {
>     worker_connections 81920;
>     kqueue_changes 4096;
>     use kqueue;
> }
>
> http {
>     include mime.types;
>     default_type application/octet-stream;
>
>     sendfile off;
>     keepalive_timeout 65;
>     tcp_nopush on;
>     tcp_nodelay on;
>
>     # Logging
>     log_format main '$remote_addr - $remote_user [$time_local] "$request" '
>                     '$status $request_length $body_bytes_sent "$http_referer" '
>                     '"$http_user_agent" "$http_x_real_ip" "$realip_remote_addr" "$request_completion" "$request_time" '
>                     '"$request_body"';
>
>     access_log /var/log/nginx/access.log main;
>
>     server {
>         listen 80 default;
>
>         server_name localhost _;
>
>         location / {
>             return 404;
>         }
>     }
> }
> -------------------------------------------------------------------
>
> `wrk` is compiled with a default configuration. We test like this:
>
> `wrk -c 10 --header "Connection: close" -d 10 -t 1 --latency http://10.10.10.92:80/missing`
>
> Also, it seems that our issue and the one described in this thread
> are identical:
>
> https://lists.freebsd.org/pipermail/freebsd-net/2019-June/053667.html
>
> We both have Intel network cards, BTW. Ours are these:
>
> em0 at pci0:10:0:0: class=0x020000 card=0x000015d9 chip=0x10d38086 rev=0x00 hdr=0x00
>     vendor = 'Intel Corporation'
>     device = '82574L Gigabit Network Connection'
>
> ixl0 at pci0:4:0:0: class=0x020000 card=0x00078086 chip=0x15728086 rev=0x01 hdr=0x00
>     vendor = 'Intel Corporation'
>     device = 'Ethernet Controller X710 for 10GbE SFP+'
>
> ==============================
>
> Additional info:
>
> During the tests, we bonded two interfaces into a lagg:
>
> ixl0: flags=8843 metric 0 mtu 1500
>     options=c500b8
>     ether 3c:fd:fe:aa:60:20
>     media: Ethernet autoselect (10Gbase-SR)
>     status: active
>     nd6 options=29
> ixl1: flags=8843 metric 0 mtu 1500
>     options=c500b8
>     ether 3c:fd:fe:aa:60:20
>     hwaddr 3c:fd:fe:aa:60:21
>     media: Ethernet autoselect (10Gbase-SR)
>     status: active
>     nd6 options=29
>
> lagg0: flags=8843 metric 0 mtu 1500
>     options=c500b8
>     ether 3c:fd:fe:aa:60:20
>     inet 10.10.10.92 netmask 0xffff0000 broadcast 10.10.255.255
>     laggproto failover lagghash l2,l3,l4
>     laggport: ixl0 flags=5
>     laggport: ixl1 flags=0<>
>     groups: lagg
>     media: Ethernet autoselect
>     status: active
>     nd6 options=29
>
> using this config:
>
> ifconfig_ixl0="up -lro -tso -rxcsum -txcsum"   (tried different options - got the same outcome)
> ifconfig_ixl1="up -lro -tso -rxcsum -txcsum"
> ifconfig_lagg0="laggproto failover laggport ixl0 laggport ixl1 10.10.10.92/24"
>
> We randomly picked `ixl0` and restricted its number of RX/TX queues
> to 1 in /boot/loader.conf:
>
> dev.ixl.0.iflib.override_ntxqs=1
> dev.ixl.0.iflib.override_nrxqs=1
>
> leaving `ixl1` with the default number, matching the number of cores (6).
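> (As a sanity check, and assuming the iflib override tunables are also
> exported as read-only sysctls, the setting can be read back after boot:
>
>     sysctl dev.ixl.0.iflib.override_nrxqs
>
> and compared against the "Using N RX queues" lines in the boot
> messages below.)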
>
>
> ixl0: mem 0xf8800000-0xf8ffffff,0xf9808000-0xf980ffff irq 40 at device 0.0 on pci4
> ixl0: fw 5.0.40043 api 1.5 nvm 5.05 etid 80002927 oem 1.261.0
> ixl0: PF-ID[0]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
> ixl0: Using 1024 TX descriptors and 1024 RX descriptors
> ixl0: Using 1 RX queues 1 TX queues
> ixl0: Using MSI-X interrupts with 2 vectors
> ixl0: Ethernet address: 3c:fd:fe:aa:60:20
> ixl0: Allocating 1 queues for PF LAN VSI; 1 queues active
> ixl0: PCI Express Bus: Speed 8.0GT/s Width x4
> ixl0: SR-IOV ready
> ixl0: netmap queues/slots: TX 1/1024, RX 1/1024
> ixl1: mem 0xf8000000-0xf87fffff,0xf9800000-0xf9807fff irq 40 at device 0.1 on pci4
> ixl1: fw 5.0.40043 api 1.5 nvm 5.05 etid 80002927 oem 1.261.0
> ixl1: PF-ID[1]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
> ixl1: Using 1024 TX descriptors and 1024 RX descriptors
> ixl1: Using 6 RX queues 6 TX queues
> ixl1: Using MSI-X interrupts with 7 vectors
> ixl1: Ethernet address: 3c:fd:fe:aa:60:21
> ixl1: Allocating 8 queues for PF LAN VSI; 6 queues active
> ixl1: PCI Express Bus: Speed 8.0GT/s Width x4
> ixl1: SR-IOV ready
> ixl1: netmap queues/slots: TX 6/1024, RX 6/1024
>
> This allowed us to switch easily between the two configurations without
> rebooting, by simply shutting down one interface or the other:
>
> `ifconfig XXX down`
>
> When testing `ixl0`, which runs only a single queue:
>
> ixl0: Using 1 RX queues 1 TX queues
> ixl0: netmap queues/slots: TX 1/1024, RX 1/1024
>
> we got these results:
>
> `wrk -c 10 --header "Connection: close" -d 10 -t 1 --latency http://10.10.10.92:80/missing`
> Running 10s test @ http://10.10.10.92:80/missing
>   1 threads and 10 connections
>   Thread Stats   Avg      Stdev     Max   +/- Stdev
>     Latency   281.31us  297.74us  22.66ms   99.70%
>     Req/Sec    19.91k     2.79k   21.25k    97.59%
>   Latency Distribution
>      50%  266.00us
>      75%  309.00us
>      90%  374.00us
>      99%  490.00us
>   164440 requests in 10.02s, 47.52MB read
>   Socket errors: read 0, write 0, timeout 0
>   Non-2xx or 3xx responses: 164440
> Requests/sec:  16412.09
> Transfer/sec:      4.74MB
>
> When testing `ixl1`, which runs 6 queues:
>
> ixl1: Using 6 RX queues 6 TX queues
> ixl1: netmap queues/slots: TX 6/1024, RX 6/1024
>
> we got these results:
>
> `wrk -c 10 --header "Connection: close" -d 10 -t 1 --latency http://10.10.10.92:80/missing`
> Running 10s test @ http://10.10.10.92:80/missing
>   1 threads and 10 connections
>   Thread Stats   Avg      Stdev     Max   +/- Stdev
>     Latency   216.16us   71.97us  511.00us   47.56%
>     Req/Sec     4.34k      2.76k   15.44k    83.17%
>   Latency Distribution
>      50%  216.00us
>      75%  276.00us
>      90%  312.00us
>      99%  365.00us
>   43616 requests in 10.10s, 12.60MB read
>   Socket errors: connect 0, read 24, write 8, timeout 0
>   Non-2xx or 3xx responses: 43616
> Requests/sec:   4318.26
> Transfer/sec:      1.25MB
>
> Do note that multiple queues not only cause the read/write errors,
> they also dramatically decrease network performance.
>
> Using `sysctl -w net.inet.tcp.ts_offset_per_conn=0` didn't help at all.
>
> Best regards,
> -Paul
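>
> P.S. To see how the load spreads over the queues while `wrk` is
> running, a quick (driver-version-dependent) check of the per-queue
> interrupt counters may help:
>
>     vmstat -i | grep ixl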