Date:      Wed, 15 Jan 2020 20:42:50 -0500
From:      John Jasen <jjasen@gmail.com>
To:        Navdeep Parhar <nparhar@gmail.com>, FreeBSD Net <freebsd-net@freebsd.org>
Subject:   Re: unexplained latency, interrupt spikes and loss of throughput on FreeBSD router/firewall system
Message-ID:  <CAACLuR26nRHgwjutWVNy_vy8jdk=QNWvBaxRDDqGg3RUrdyUxg@mail.gmail.com>
In-Reply-To: <aadc3f1b-e4b9-7413-369f-e084619d36ce@gmail.com>
References:  <CAACLuR0AYBSPajzmp9%2BaAK%2B02M6_pnai3b9s7jDbtXLvd1fGNw@mail.gmail.com> <aadc3f1b-e4b9-7413-369f-e084619d36ce@gmail.com>

On Wed, Jan 15, 2020 at 5:24 PM Navdeep Parhar <nparhar@gmail.com> wrote:

> On 1/15/20 6:55 AM, John Jasen wrote:
> > Executive summary:
> >
> > Periodically, load will spike on network interrupts on one of our
> > firewalls. Latency will quickly climb to the point that things are
> > unresponsive, sessions will timeout, and bandwidth will plummet.
>
> Is this with 9000 MTU?  Can you please post "netstat -m" from this
> system?


25683/15822/41505 mbufs in use (current/cache/total)
8190/8340/16530/2038296 mbuf clusters in use (current/cache/total/max)
8190/8255 mbuf+clusters out of packet secondary zone in use (current/cache)
2576/293/2869/1019147 4k (page size) jumbo clusters in use (current/cache/total/max)
540546/1917/542463/10000000 9k jumbo clusters in use (current/cache/total/max)
0/0/0/169857 16k jumbo clusters in use (current/cache/total/max)
4898018K/39060K/4937079K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/53561/0 requests for jumbo clusters denied (4k/9k/16k)
0 sendfile syscalls
0 sendfile syscalls completed without I/O request
0 requests for I/O initiated by sendfile
0 pages read by sendfile as part of a request
0 pages were valid at time of a sendfile request
0 pages were requested for read ahead by applications
0 pages were read ahead by sendfile
0 times sendfile encountered an already busy page
0 requests for sfbufs denied
0 requests for sfbufs delayed
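
The only non-zero denial counter in the output above is for 9k jumbo clusters (53561). For anyone who wants to track that number over time, here is a minimal Python sketch that pulls the denied jumbo cluster counts out of "netstat -m" text. The function name jumbo_denials is made up for illustration, and the sketch assumes the exact line format shown above.

```python
import re

# Matches lines such as "0/53561/0 requests for jumbo clusters denied (4k/9k/16k)".
# Assumption: the line format is exactly the one in the output pasted above.
DENIED_RE = re.compile(
    r"^(\d+)/(\d+)/(\d+) requests for jumbo clusters denied \(4k/9k/16k\)$"
)


def jumbo_denials(netstat_m_output: str) -> dict:
    """Return denied jumbo cluster request counts keyed by cluster size."""
    for line in netstat_m_output.splitlines():
        match = DENIED_RE.match(line.strip())
        if match:
            return dict(zip(("4k", "9k", "16k"), map(int, match.groups())))
    raise ValueError("no 'jumbo clusters denied' line found")


# Two lines copied from the output above.
sample = """0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/53561/0 requests for jumbo clusters denied (4k/9k/16k)"""

print(jumbo_denials(sample))  # {'4k': 0, '9k': 53561, '16k': 0}
```

Feeding it the real output on a schedule and storing the result would show whether the 9k denials grow during an incident.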


> Assuming this is 9000 MTU, try setting this in
> /boot/loader.conf and reboot:
>
> hw.cxgbe.largest_rx_cluster=4096
>

We're already there.
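
For checking the same setting across several firewalls, a rough Python sketch that reads a tunable out of loader.conf-style text might look like the following. The helper name loader_tunable is hypothetical, and the parsing is deliberately naive: it assumes one name=value pair per line, optional double quotes around the value, and no '#' inside values.

```python
def loader_tunable(conf_text: str, name: str):
    """Return the last value assigned to `name` in loader.conf-style text, or None."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == name:
            value = val.strip().strip('"')  # later assignments override earlier ones
    return value


# Illustrative file contents, not copied from the system discussed here.
sample = 'hw.cxgbe.largest_rx_cluster="4096"\n'

print(loader_tunable(sample, "hw.cxgbe.largest_rx_cluster"))  # 4096
```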


>
> > We do not see increases in ethernet pause frames, drops, errors, or
> > anything else like that from the system.
>
> This part is strange.  The incoming frames are either being dropped
> (errors or overflows) or getting throttled via pause frames.  I'd have
> expected "netstat -dI <ifnet>" to show errors or drops or "sysctl dev.cc
> dev.cxl | grep pause" to show some activity.  Can you please double check?
>

After a prior event on a firewall cluster, I started pushing the pause
frame counters and the netstat drop/error counters to Elasticsearch. They
did not change during this incident. I checked again just now, and they are
still a flat line.
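
To make "flat line" concrete, the check amounts to comparing two samples of the same counters and listing any that moved. Below is a minimal Python sketch of that comparison. The function name and the counter names in the example are invented for illustration; they are not the actual sysctl or netstat field names, nor what we store in Elasticsearch.

```python
def changed_counters(previous: dict, current: dict) -> dict:
    """Return {name: delta} for every counter whose value differs between two samples."""
    return {
        name: current[name] - previous[name]
        for name in current
        if name in previous and current[name] != previous[name]
    }


# Hypothetical counter names and values, for illustration only.
before = {"rx_pause": 120, "tx_pause": 7, "input_drops": 0}
after = {"rx_pause": 120, "tx_pause": 7, "input_drops": 0}

print(changed_counters(before, after))  # {}
```

An empty result for samples taken before and during an incident corresponds to the flat line described above.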



>
> Regards,
> Navdeep
>

Thanks!




