Date:      Wed, 15 Jan 2020 09:55:41 -0500
From:      John Jasen <jjasen@gmail.com>
To:        FreeBSD Net <freebsd-net@freebsd.org>
Subject:   unexplained latency, interrupt spikes and loss of throughput on FreeBSD router/firewall system
Message-ID:  <CAACLuR0AYBSPajzmp9%2BaAK%2B02M6_pnai3b9s7jDbtXLvd1fGNw@mail.gmail.com>

Executive summary:

Periodically, network interrupt load spikes on one of our firewalls.
Latency quickly climbs to the point that the network is unresponsive,
sessions time out, and bandwidth plummets.

We do not see any increase in Ethernet pause frames, drops, errors, or
similar counters on the system.

Usually, the quickest fix is to fail over to the backup firewall. After the
failover, the backup firewall behaves normally and interrupt load drops on
the afflicted firewall.

I'm stumped. Networking says it's these systems. I believe it's something on
the other side.

Any ideas?

Background information:
FreeBSD 11.3-RELEASE-p3
hw.machine: amd64
hw.model: Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz
hw.ncpu: 24
hw.machine_arch: amd64
Firewall: pf
failover: CARP
network cards: seen with Chelsio T5-580 and T6 series cards.
other networking information: VLANs are in use. Occasional LAGG usage as
well.

When this occurs, some of the interrupts dedicated to cxgbe queues spike
to 100%. Latency climbs to the point that TCP timeouts start kicking in,
and users start complaining. Bandwidth drops from 2-3 Gb/s to roughly
100-200 Mb/s.

netstat shows no increase in errors or dropped packets. sysctl shows no
increase in pause frames.
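
To show which interrupt sources spike, two snapshots of `vmstat -i` taken a known number of seconds apart can be turned into per-source rates. The following is only a rough sketch, assuming the usual three-column layout (name, total, rate); the function names and the interrupt name in the usage note are hypothetical and not taken from the affected systems.

```python
def parse_vmstat_i(text):
    """Map interrupt name -> cumulative count from `vmstat -i` style output."""
    counts = {}
    for line in text.splitlines():
        # Assumed layout: "<name> <total> <rate>"; the name may contain spaces.
        parts = line.rsplit(None, 2)
        if len(parts) != 3 or not parts[1].isdigit():
            continue  # header, blank, or unrecognized line
        name = parts[0].strip()
        if name == "Total":
            continue  # skip the summary row
        counts[name] = int(parts[1])
    return counts


def interrupt_rates(before, after, seconds):
    """Interrupts per second for each source present in both snapshots."""
    b = parse_vmstat_i(before)
    a = parse_vmstat_i(after)
    return {name: (a[name] - b[name]) / seconds for name in a if name in b}
```

For example, if a made-up source "irq264: t5nex0:0a0" reads 1000 in the first snapshot and 6000 in a second one taken 10 seconds later, interrupt_rates reports 500.0 interrupts per second for it. Comparing these rates between normal operation and an incident would show whether the spike is confined to a few queues.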

I'm happy to provide further information.


