Date:      Sat, 24 Sep 2022 17:30:16 -0700
From:      John Fieber <jrf@ursamaris.org>
To:        mike tancsa <mike@sentex.net>
Cc:        "Pieper, Jeffrey E" <jeffrey.e.pieper@intel.com>, Jim King <jim@jimking.net>, "stable@freebsd.org" <stable@freebsd.org>, "kbowling@FreeBSD.org" <kbowling@FreeBSD.org>
Subject:   Re: igc problems with heavy traffic (update)
Message-ID:  <7DA72BB5-F4F1-4AF8-AD1C-CF68908CF723@ursamaris.org>
In-Reply-To: <c1b4194f-ea8e-26ef-d923-d4344ca239c1@sentex.net>
References:  <fc256428-3ff1-68ba-cfcc-a00ca427e85b@jimking.net> <59b9cec0-d8c2-ce72-b5e9-99d1a1e807f8@sentex.net> <e714cd76-0aaa-3ea0-3c31-5e61badffa18@sentex.net> <86995d10-af63-d053-972e-dd233029f3bf@jimking.net> <3d874f65-8ce2-8f06-f19a-14cd550166e3@sentex.net> <a8192d60-2970-edb5-ce1a-c17ea875bf07@jimking.net> <fd1e825b-c306-64b1-f9ef-fec0344a9c95@sentex.net> <a4ddc96a-3dd5-4fee-8003-05f228d10858@jimking.net> <MW4PR11MB5890493674ADD1757BB47075D0659@MW4PR11MB5890.namprd11.prod.outlook.com> <a9935ba0-9cb2-5a41-ca73-b6962fef5e4d@sentex.net> <879b9239-2b9a-f0ae-4173-4a226c84cd85@sentex.net> <c1b4194f-ea8e-26ef-d923-d4344ca239c1@sentex.net>

> On Sep 14, 2022, at 8:03 AM, mike tancsa <mike@sentex.net> wrote:
>
> OK, an update hence the top post. I got a new pair of boxes which use
> a different Jasper Lake chipset and have i226-V vs the i225 of the
> previous box.
>
> dev.igc.0.%parent: pci2
> dev.igc.0.%pnpinfo: vendor=0x8086 device=0x125c subvendor=0x8086 subdevice=0x0000 class=0x020000
> dev.igc.0.%location: slot=0 function=0 dbsf=pci0:2:0:0 handle=\_SB_.PC00.RP05.PXSX
> dev.igc.0.%driver: igc
> dev.igc.0.%desc: Intel(R) Ethernet Controller I226-V
> dev.igc.%parent:
>
> With a default RELENG_13, out of the box with no tweaks, I am NOT able
> to cause the transmitting NIC to bounce with heavy traffic. I used the
> same test script (a constant stream of iperf3 alternating in direction)
> maxing out the NIC's bandwidth and all seems fine running the test for
> some 18hrs.  Maybe something different about the i225 version of this
> NIC that needs some different driver defaults?
>
>     ---Mike
>

I also see this behavior with 13.1-RELEASE-p2 on:

CPU: Intel(R) Celeron(R) J4125 CPU @ 2.00GHz (1996.80-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x706a8  Family=0x6  Model=0x7a  Stepping=8

NIC (x4):

dev.igc.0.%parent: pci1
dev.igc.0.%pnpinfo: vendor=0x8086 device=0x15f3 subvendor=0x8086 subdevice=0x0000 class=0x020000
dev.igc.0.%location: slot=0 function=0 dbsf=pci0:1:0:0 handle=\_SB_.PCI0.RP03.PXSX
dev.igc.0.%driver: igc
dev.igc.0.%desc: Intel(R) Ethernet Controller I225-V
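
For anyone comparing boxes, the %pnpinfo line is enough to tell the two
parts in this thread apart. A minimal Python sketch, assuming only the
two device IDs that appear in the sysctl dumps here (0x15f3 for the
I225-V and 0x125c for the I226-V); the table and the function names are
illustrative and not an exhaustive list of igc devices:

```python
import re

# Device IDs copied from the two sysctl dumps in this thread.
# Assumption: this table is illustrative only, not a full igc device list.
KNOWN_IGC_DEVICES = {
    0x15F3: "I225-V",
    0x125C: "I226-V",
}


def parse_pnpinfo(line):
    """Parse a 'dev.igc.N.%pnpinfo: key=0x... ...' line into a dict of ints."""
    _, _, fields = line.partition(": ")
    pairs = re.findall(r"(\w+)=(0x[0-9a-fA-F]+)", fields)
    return {key: int(value, 16) for key, value in pairs}


def describe(line):
    """Return the controller name for a %pnpinfo line, or 'unknown'."""
    info = parse_pnpinfo(line)
    return KNOWN_IGC_DEVICES.get(info.get("device"), "unknown")
```

Feeding it the output of `sysctl dev.igc.0.%pnpinfo` should be enough
to check which controller a given port really has.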

Twiddling EEE doesn't seem to affect it. Disabling flow control helps a
bit, but not by a meaningful amount.

Tests were done through a TP-Link TL-SG3210XHP-M2 switch, with the other
party being 13.1-RELEASE-p2 on a 10Gb DAC connection (ixl driver).

For comparison, I booted a variety of systems in bhyve (with PCI
passthrough of a NIC). These all showed the same problem as the host,
with the interface bouncing multiple times within a 5-minute iperf3
test:

- FreeBSD-13.1-STABLE-amd64-20220923
- OPNsense-22.7
- pfSense-CE 2.7-DEVELOPMENT-latest

These, however, offer unflappable performance:

- FreeBSD-14.0-CURRENT-amd64-20220923
- vyos-1.4 (for reference, what I mostly use on this hardware, via
  bhyve)
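
The test script itself was never posted in this thread, so the following
is only a rough Python sketch of the kind of loop described (iperf3 runs
alternating in direction). The server address, the 300-second run length
(matching the 5-minute test above) and the function names are
assumptions. The only iperf3 options used are -c, -t and -R:

```python
import subprocess


def iperf3_command(server, seconds, reverse):
    """Build one iperf3 client invocation; -R makes the server the sender."""
    cmd = ["iperf3", "-c", server, "-t", str(seconds)]
    if reverse:
        cmd.append("-R")
    return cmd


def alternating_commands(server, seconds=300, runs=4):
    """Return `runs` client commands, flipping direction on every run."""
    return [iperf3_command(server, seconds, reverse=bool(i % 2))
            for i in range(runs)]


def run_all(server, seconds=300, runs=4):
    """Run the commands back to back; requires iperf3 -s on the far end."""
    for cmd in alternating_commands(server, seconds, runs):
        # check=False: keep going even if a run aborts when the link bounces.
        subprocess.run(cmd, check=False)
```

Calling run_all() with the address of a box running `iperf3 -s` keeps
the NIC busy in both directions in turn.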

-john


>
> On 8/12/2022 11:04 AM, mike tancsa wrote:
>>
>> On 8/10/2022 3:53 PM, mike tancsa wrote:
>>> On 8/10/2022 1:47 PM, Pieper, Jeffrey E wrote:
>>>>
>>>> You could try disabling EEE (Energy Efficient Ethernet). Something
>>>> like: sysctl dev.igc.0.eee_control=0.
>>>
>>>
>>> It does not seem to make a difference. If I have the FC as default,
>>> I get the link bounce on the 2.5G xover (cat 6 cable) maybe 2-3 min
>>> into running iperf3 tests.  However, if I disable all flow control
>>>
>>> dev.igc.0.fc=0
>>> dev.igc.1.fc=0
>>> dev.igc.2.fc=0
>>> dev.igc.3.fc=0
>>>
>>> It *seems* to be less frequent but still happens.  I ordered a 2.5G
>>> switch so I can try and at least see which side is dropping the link.
>>> Should have it Friday to continue testing
>>>
>>
>> OK, I repeated the tests with a 2.5G unmanaged switch in between the
>> two units rather than xover. It looks like it's the server that is
>> sending the majority of the packets that drops the link, not the
>> receiver.
>>
>> One other test I did was to up hw.igc.max_interrupt_rate=13000 from
>> the default of 8000. That seems to make the problem MUCH more acute.
>>
>> Here is the before and after of the link drop.
>>
>>  dev.igc.1.wake: 0
>>  dev.igc.1.interrupts.rx_desc_min_thresh: 0
>> -dev.igc.1.interrupts.asserts: 65
>> +dev.igc.1.interrupts.asserts: 4879479
>>  dev.igc.1.mac_stats.tso_txd: 0
>> -dev.igc.1.mac_stats.tx_frames_1024_1522: 3
>> -dev.igc.1.mac_stats.tx_frames_512_1023: 1
>> -dev.igc.1.mac_stats.tx_frames_256_511: 2
>> -dev.igc.1.mac_stats.tx_frames_128_255: 15
>> -dev.igc.1.mac_stats.tx_frames_65_127: 2
>> +dev.igc.1.mac_stats.tx_frames_1024_1522: 12973065
>> +dev.igc.1.mac_stats.tx_frames_512_1023: 58
>> +dev.igc.1.mac_stats.tx_frames_256_511: 107
>> +dev.igc.1.mac_stats.tx_frames_128_255: 1215725
>> +dev.igc.1.mac_stats.tx_frames_65_127: 192
>>  dev.igc.1.mac_stats.tx_frames_64: 1
>>  dev.igc.1.mac_stats.mcast_pkts_txd: 0
>>  dev.igc.1.mac_stats.bcast_pkts_txd: 1
>> -dev.igc.1.mac_stats.good_pkts_txd: 24
>> -dev.igc.1.mac_stats.total_pkts_txd: 24
>> -dev.igc.1.mac_stats.good_octets_txd: 7674
>> -dev.igc.1.mac_stats.good_octets_recvd: 6492
>> -dev.igc.1.mac_stats.rx_frames_1024_1522: 2
>> -dev.igc.1.mac_stats.rx_frames_512_1023: 1
>> -dev.igc.1.mac_stats.rx_frames_256_511: 2
>> -dev.igc.1.mac_stats.rx_frames_128_255: 15
>> -dev.igc.1.mac_stats.rx_frames_65_127: 2
>> +dev.igc.1.mac_stats.good_pkts_txd: 14189148
>> +dev.igc.1.mac_stats.total_pkts_txd: 14189148
>> +dev.igc.1.mac_stats.good_octets_txd: 19450753554
>> +dev.igc.1.mac_stats.good_octets_recvd: 14933399426
>> +dev.igc.1.mac_stats.rx_frames_1024_1522: 9823228
>> +dev.igc.1.mac_stats.rx_frames_512_1023: 3
>> +dev.igc.1.mac_stats.rx_frames_256_511: 62
>> +dev.igc.1.mac_stats.rx_frames_128_255: 2365665
>> +dev.igc.1.mac_stats.rx_frames_65_127: 213
>>  dev.igc.1.mac_stats.rx_frames_64: 1
>>  dev.igc.1.mac_stats.mcast_pkts_recvd: 0
>>  dev.igc.1.mac_stats.bcast_pkts_recvd: 0
>> -dev.igc.1.mac_stats.good_pkts_recvd: 23
>> -dev.igc.1.mac_stats.total_pkts_recvd: 23
>> +dev.igc.1.mac_stats.good_pkts_recvd: 12189172
>> +dev.igc.1.mac_stats.total_pkts_recvd: 12189172
>>  dev.igc.1.mac_stats.xoff_txd: 0
>>  dev.igc.1.mac_stats.xoff_recvd: 0
>>  dev.igc.1.mac_stats.xon_txd: 0
>>  dev.igc.1.mac_stats.single_coll: 0
>>  dev.igc.1.mac_stats.excess_coll: 0
>>  dev.igc.1.queue_rx_3.rx_irq: 0
>> -dev.igc.1.queue_rx_3.rxd_tail: 21
>> -dev.igc.1.queue_rx_3.rxd_head: 22
>> +dev.igc.1.queue_rx_3.rxd_tail: 498
>> +dev.igc.1.queue_rx_3.rxd_head: 499
>>  dev.igc.1.queue_rx_2.rx_irq: 0
>>  dev.igc.1.queue_rx_2.rxd_tail: 128
>>  dev.igc.1.queue_rx_2.rxd_head: 0
>>  dev.igc.1.queue_rx_0.rxd_tail: 0
>>  dev.igc.1.queue_rx_0.rxd_head: 1
>>  dev.igc.1.queue_tx_3.tx_irq: 0
>> -dev.igc.1.queue_tx_3.txd_tail: 0
>> -dev.igc.1.queue_tx_3.txd_head: 0
>> +dev.igc.1.queue_tx_3.txd_tail: 746
>> +dev.igc.1.queue_tx_3.txd_head: 746
>>  dev.igc.1.queue_tx_2.tx_irq: 0
>> -dev.igc.1.queue_tx_2.txd_tail: 0
>> -dev.igc.1.queue_tx_2.txd_head: 0
>> +dev.igc.1.queue_tx_2.txd_tail: 186
>> +dev.igc.1.queue_tx_2.txd_head: 186
>>  dev.igc.1.queue_tx_1.tx_irq: 0
>> -dev.igc.1.queue_tx_1.txd_tail: 0
>> -dev.igc.1.queue_tx_1.txd_head: 0
>> +dev.igc.1.queue_tx_1.txd_tail: 520
>> +dev.igc.1.queue_tx_1.txd_head: 520
>>  dev.igc.1.queue_tx_0.tx_irq: 0
>> -dev.igc.1.queue_tx_0.txd_tail: 45
>> -dev.igc.1.queue_tx_0.txd_head: 45
>> +dev.igc.1.queue_tx_0.txd_tail: 777
>> +dev.igc.1.queue_tx_0.txd_head: 777
>>  dev.igc.1.fc_low_water: 32752
>>  dev.igc.1.fc_high_water: 32768
>>  dev.igc.1.rx_control: 71335938
>>  dev.igc.1.device_control: 404489793
>>  dev.igc.1.watchdog_timeouts: 0
>>  dev.igc.1.rx_overruns: 0
>> -dev.igc.1.link_irq: 2
>> +dev.igc.1.link_irq: 4
>>  dev.igc.1.dropped: 0
>>  dev.igc.1.eee_control: 0
>>  dev.igc.1.itr: 488
>>  dev.igc.1.nvm: -1
>>  dev.igc.1.iflib.rxq3.rxq_fl0.buf_size: 2048
>>  dev.igc.1.iflib.rxq3.rxq_fl0.credits: 1023
>> -dev.igc.1.iflib.rxq3.rxq_fl0.cidx: 22
>> -dev.igc.1.iflib.rxq3.rxq_fl0.pidx: 21
>> +dev.igc.1.iflib.rxq3.rxq_fl0.cidx: 499
>> +dev.igc.1.iflib.rxq3.rxq_fl0.pidx: 498
>>  dev.igc.1.iflib.rxq3.cpu: 3
>>  dev.igc.1.iflib.rxq2.rxq_fl0.buf_size: 2048
>>  dev.igc.1.iflib.rxq2.rxq_fl0.credits: 128
>>  dev.igc.1.iflib.txq3.r_abdications: 0
>>  dev.igc.1.iflib.txq3.r_restarts: 0
>>  dev.igc.1.iflib.txq3.r_stalls: 0
>> -dev.igc.1.iflib.txq3.r_starts: 0
>> +dev.igc.1.iflib.txq3.r_starts: 6175093
>>  dev.igc.1.iflib.txq3.r_drops: 0
>> -dev.igc.1.iflib.txq3.r_enqueues: 0
>> -dev.igc.1.iflib.txq3.ring_state: pidx_head: 0000 pidx_tail: 0000 cidx: 0000 state: IDLE
>> -dev.igc.1.iflib.txq3.txq_cleaned: 0
>> -dev.igc.1.iflib.txq3.txq_processed: 0
>> -dev.igc.1.iflib.txq3.txq_in_use: 0
>> -dev.igc.1.iflib.txq3.txq_cidx_processed: 0
>> -dev.igc.1.iflib.txq3.txq_cidx: 0
>> -dev.igc.1.iflib.txq3.txq_pidx: 0
>> +dev.igc.1.iflib.txq3.r_enqueues: 6175093
>> +dev.igc.1.iflib.txq3.ring_state: pidx_head: 0373 pidx_tail: 0373 cidx: 0373 state: IDLE
>> +dev.igc.1.iflib.txq3.txq_cleaned: 12350144
>> +dev.igc.1.iflib.txq3.txq_processed: 12350184
>> +dev.igc.1.iflib.txq3.txq_in_use: 42
>> +dev.igc.1.iflib.txq3.txq_cidx_processed: 744
>> +dev.igc.1.iflib.txq3.txq_cidx: 704
>> +dev.igc.1.iflib.txq3.txq_pidx: 746
>>  dev.igc.1.iflib.txq3.no_tx_dma_setup: 0
>>  dev.igc.1.iflib.txq3.txd_encap_efbig: 0
>>  dev.igc.1.iflib.txq3.tx_map_failed: 0
>>  dev.igc.1.iflib.txq2.r_abdications: 0
>>  dev.igc.1.iflib.txq2.r_restarts: 0
>>  dev.igc.1.iflib.txq2.r_stalls: 0
>> -dev.igc.1.iflib.txq2.r_starts: 0
>> +dev.igc.1.iflib.txq2.r_starts: 3421789
>>  dev.igc.1.iflib.txq2.r_drops: 0
>> -dev.igc.1.iflib.txq2.r_enqueues: 0
>> -dev.igc.1.iflib.txq2.ring_state: pidx_head: 0000 pidx_tail: 0000 cidx: 0000 state: IDLE
>> -dev.igc.1.iflib.txq2.txq_cleaned: 0
>> -dev.igc.1.iflib.txq2.txq_processed: 0
>> -dev.igc.1.iflib.txq2.txq_in_use: 0
>> -dev.igc.1.iflib.txq2.txq_cidx_processed: 0
>> -dev.igc.1.iflib.txq2.txq_cidx: 0
>> -dev.igc.1.iflib.txq2.txq_pidx: 0
>> +dev.igc.1.iflib.txq2.r_enqueues: 3421789
>> +dev.igc.1.iflib.txq2.ring_state: pidx_head: 1629 pidx_tail: 1629 cidx: 1629 state: IDLE
>> +dev.igc.1.iflib.txq2.txq_cleaned: 6843536
>> +dev.igc.1.iflib.txq2.txq_processed: 6843576
>> +dev.igc.1.iflib.txq2.txq_in_use: 42
>> +dev.igc.1.iflib.txq2.txq_cidx_processed: 184
>> +dev.igc.1.iflib.txq2.txq_cidx: 144
>> +dev.igc.1.iflib.txq2.txq_pidx: 186
>>  dev.igc.1.iflib.txq2.no_tx_dma_setup: 0
>>  dev.igc.1.iflib.txq2.txd_encap_efbig: 0
>>  dev.igc.1.iflib.txq2.tx_map_failed: 0
>>  dev.igc.1.iflib.txq1.r_abdications: 0
>>  dev.igc.1.iflib.txq1.r_restarts: 0
>>  dev.igc.1.iflib.txq1.r_stalls: 0
>> -dev.igc.1.iflib.txq1.r_starts: 0
>> +dev.igc.1.iflib.txq1.r_starts: 2734852
>>  dev.igc.1.iflib.txq1.r_drops: 0
>> -dev.igc.1.iflib.txq1.r_enqueues: 0
>> -dev.igc.1.iflib.txq1.ring_state: pidx_head: 0000 pidx_tail: 0000 cidx: 0000 state: IDLE
>> -dev.igc.1.iflib.txq1.txq_cleaned: 0
>> -dev.igc.1.iflib.txq1.txq_processed: 0
>> -dev.igc.1.iflib.txq1.txq_in_use: 0
>> -dev.igc.1.iflib.txq1.txq_cidx_processed: 0
>> -dev.igc.1.iflib.txq1.txq_cidx: 0
>> -dev.igc.1.iflib.txq1.txq_pidx: 0
>> +dev.igc.1.iflib.txq1.r_enqueues: 2734852
>> +dev.igc.1.iflib.txq1.ring_state: pidx_head: 0772 pidx_tail: 0772 cidx: 0772 state: IDLE
>> +dev.igc.1.iflib.txq1.txq_cleaned: 5469662
>> +dev.igc.1.iflib.txq1.txq_processed: 5469702
>> +dev.igc.1.iflib.txq1.txq_in_use: 42
>> +dev.igc.1.iflib.txq1.txq_cidx_processed: 518
>> +dev.igc.1.iflib.txq1.txq_cidx: 478
>> +dev.igc.1.iflib.txq1.txq_pidx: 520
>>  dev.igc.1.iflib.txq1.no_tx_dma_setup: 0
>>  dev.igc.1.iflib.txq1.txd_encap_efbig: 0
>>  dev.igc.1.iflib.txq1.tx_map_failed: 0
>>  dev.igc.1.iflib.txq0.r_abdications: 0
>>  dev.igc.1.iflib.txq0.r_restarts: 0
>>  dev.igc.1.iflib.txq0.r_stalls: 0
>> -dev.igc.1.iflib.txq0.r_starts: 24
>> +dev.igc.1.iflib.txq0.r_starts: 1857414
>>  dev.igc.1.iflib.txq0.r_drops: 0
>> -dev.igc.1.iflib.txq0.r_enqueues: 24
>> -dev.igc.1.iflib.txq0.ring_state: pidx_head: 0024 pidx_tail: 0024 cidx: 0024 state: IDLE
>> -dev.igc.1.iflib.txq0.txq_cleaned: 3
>> -dev.igc.1.iflib.txq0.txq_processed: 43
>> +dev.igc.1.iflib.txq0.r_enqueues: 1857414
>> +dev.igc.1.iflib.txq0.ring_state: pidx_head: 1926 pidx_tail: 1926 cidx: 1926 state: IDLE
>> +dev.igc.1.iflib.txq0.txq_cleaned: 3714783
>> +dev.igc.1.iflib.txq0.txq_processed: 3714823
>>  dev.igc.1.iflib.txq0.txq_in_use: 42
>> -dev.igc.1.iflib.txq0.txq_cidx_processed: 43
>> -dev.igc.1.iflib.txq0.txq_cidx: 3
>> -dev.igc.1.iflib.txq0.txq_pidx: 45
>> +dev.igc.1.iflib.txq0.txq_cidx_processed: 775
>> +dev.igc.1.iflib.txq0.txq_cidx: 735
>> +dev.igc.1.iflib.txq0.txq_pidx: 777
>>  dev.igc.1.iflib.txq0.no_tx_dma_setup: 0
>>  dev.igc.1.iflib.txq0.txd_encap_efbig: 0
>>  dev.igc.1.iflib.txq0.tx_map_failed: 0
>>  dev.igc.1.%desc: Intel(R) Ethernet Controller I225-V
>>
>> Interface is RUNNING and ACTIVE
>> igc1: TX Queue 0 ------
>> igc1: hw tdh = 777, hw tdt = 777
>> igc1: TX Queue 1 ------
>> igc1: hw tdh = 520, hw tdt = 520
>> igc1: TX Queue 2 ------
>> igc1: hw tdh = 186, hw tdt = 186
>> igc1: TX Queue 3 ------
>> igc1: hw tdh = 746, hw tdt = 746
>> igc1: RX Queue 0 ------
>> igc1: hw rdh = 1, hw rdt = 0
>> igc1: RX Queue 1 ------
>> igc1: hw rdh = 0, hw rdt = 128
>> igc1: RX Queue 2 ------
>> igc1: hw rdh = 0, hw rdt = 128
>> igc1: RX Queue 3 ------
>> igc1: hw rdh = 499, hw rdt = 498
>>
>>
>>
>
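
A before/after comparison like the one quoted above can also be
produced from two saved sysctl dumps. A minimal Python sketch, assuming
each dump is the plain "name: value" text printed by `sysctl dev.igc.1`;
the function names are made up for illustration:

```python
def parse_sysctl(text):
    """Map each 'name: value' line of sysctl output to its value string."""
    values = {}
    for line in text.splitlines():
        # Split on the first ': ' only, so values such as the ring_state
        # string (which contain further colons) stay intact.
        name, sep, value = line.partition(": ")
        if sep:
            values[name.strip()] = value.strip()
    return values


def changed(before, after):
    """Return {name: (old, new)} for every OID whose value differs."""
    old, new = parse_sysctl(before), parse_sysctl(after)
    return {name: (old.get(name), new[name])
            for name in new if old.get(name) != new[name]}
```

With the two dumps from the quoted message, the entry for
dev.igc.1.link_irq would come out as ("2", "4"), which is the counter
that moves when the link drops.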



