Date: Wed, 15 Jan 2020 20:42:50 -0500
From: John Jasen <jjasen@gmail.com>
To: Navdeep Parhar <nparhar@gmail.com>, FreeBSD Net <freebsd-net@freebsd.org>
Subject: Re: unexplained latency, interrupt spikes and loss of throughput on FreeBSD router/firewall system
Message-ID: <CAACLuR26nRHgwjutWVNy_vy8jdk=QNWvBaxRDDqGg3RUrdyUxg@mail.gmail.com>
In-Reply-To: <aadc3f1b-e4b9-7413-369f-e084619d36ce@gmail.com>
References: <CAACLuR0AYBSPajzmp9%2BaAK%2B02M6_pnai3b9s7jDbtXLvd1fGNw@mail.gmail.com> <aadc3f1b-e4b9-7413-369f-e084619d36ce@gmail.com>

On Wed, Jan 15, 2020 at 5:24 PM Navdeep Parhar <nparhar@gmail.com> wrote:

> On 1/15/20 6:55 AM, John Jasen wrote:
> > Executive summary:
> >
> > Periodically, load will spike on network interrupts on one of our
> > firewalls. Latency will quickly climb to the point that things are
> > unresponsive, sessions will timeout, and bandwidth will plummet.
>
> Is this with 9000 MTU? Can you please post "netstat -m" from this
> system?

25683/15822/41505 mbufs in use (current/cache/total)
8190/8340/16530/2038296 mbuf clusters in use (current/cache/total/max)
8190/8255 mbuf+clusters out of packet secondary zone in use (current/cache)
2576/293/2869/1019147 4k (page size) jumbo clusters in use (current/cache/total/max)
540546/1917/542463/10000000 9k jumbo clusters in use (current/cache/total/max)
0/0/0/169857 16k jumbo clusters in use (current/cache/total/max)
4898018K/39060K/4937079K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/53561/0 requests for jumbo clusters denied (4k/9k/16k)
0 sendfile syscalls
0 sendfile syscalls completed without I/O request
0 requests for I/O initiated by sendfile
0 pages read by sendfile as part of a request
0 pages were valid at time of a sendfile request
0 pages were requested for read ahead by applications
0 pages were read ahead by sendfile
0 times sendfile encountered an already busy page
0 requests for sfbufs denied
0 requests for sfbufs delayed

> Assuming this is 9000 MTU, try setting this in
> /boot/loader.conf and reboot:
>
> hw.cxgbe.largest_rx_cluster=4096
>

We're already there.

> > We do not see increases in ethernet pause frames, drops, errors, or
> > anything else like that from the system.
>
> This part is strange. The incoming frames are either being dropped
> (errors or overflows) or getting throttled via pause frames. I'd have
> expected "netstat -dI <ifnet>" to show errors or drops or "sysctl dev.cc
> dev.cxl | grep pause" to show some activity. Can you please double check?
>

After a prior event on a firewall cluster, I started pushing pause frames
and netstat drops/errors to elasticsearch. They remained constant and
unchanging during this incident.

I checked again, and they're still a flat line.

>
> Regards,
> Navdeep
>

Thanks!
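
[Editorial sketch, not part of the original message.] The only nonzero failure counter in the "netstat -m" output above is the 9k entry on the "requests for jumbo clusters denied" line. A minimal Python sketch of how such counters might be pulled out of the text for graphing, alongside the pause-frame and drop counters mentioned above, follows. The function name `denied_counters` and the `"resource:label"` key format are assumptions made for illustration; the regular expression is only known to fit the line layout shown in this message.

```python
import re

# Matches lines shaped like:
#   "0/53561/0 requests for jumbo clusters denied (4k/9k/16k)"
# Lines without a parenthesised label list (e.g. the sfbufs line) are skipped.
DENIED_RE = re.compile(
    r"^\s*(\d+(?:/\d+)*) requests for (.+?) denied \(([^)]*)\)\s*$"
)


def denied_counters(netstat_m_output):
    """Return {"<resource>:<label>": count} for each nonzero 'denied' counter."""
    nonzero = {}
    for line in netstat_m_output.splitlines():
        match = DENIED_RE.match(line)
        if not match:
            continue
        counts = [int(c) for c in match.group(1).split("/")]
        resource = match.group(2)
        labels = match.group(3).split("/")
        # Pair each slash-separated count with its label from the parentheses.
        for label, count in zip(labels, counts):
            if count:
                nonzero[f"{resource}:{label}"] = count
    return nonzero


# Lines copied from the output posted in this message.
SAMPLE = """\
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/53561/0 requests for jumbo clusters denied (4k/9k/16k)
0 requests for sfbufs denied
"""

if __name__ == "__main__":
    print(denied_counters(SAMPLE))
```

In practice the input would be the captured text of "netstat -m" rather than the embedded sample; how that text is collected and shipped to a metrics store is left out here.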