Date: Sat, 18 Jan 2025 19:46:47 -0600 (CST)
From: Timothy Pearson
To: freebsd-net
Message-ID: <715443646.6436953.1737251207686.JavaMail.zimbra@raptorengineeringinc.com>
In-Reply-To: <1820780643.6432954.1737249897808.JavaMail.zimbra@raptorengineeringinc.com>
References: <2079636793.6405429.1737238589082.JavaMail.zimbra@raptorengineeringinc.com>
 <1820780643.6432954.1737249897808.JavaMail.zimbra@raptorengineeringinc.com>
Subject: Re: FreeBSD 13: IPSec netisr overload causes unrelated packet loss
List-Id: Networking and TCP/IP with FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-net

Forcibly disabling RSS with the IPSec deferred input patch seems to have fixed
the issue. Given the wide-ranging deleterious effects with RSS on versus a bit
of lost theoretical maximum IPsec bandwidth with it off, we'll take the
bandwidth hit for the moment. ;)

Are there any significant concerns with running the patch for deferred IPSec
input? From my analysis of the code, I think the absolute worst case might be
a reordered packet or two, but that's always possible with IPSec over UDP
transport AFAIK.

----- Original Message -----
> From: "Timothy Pearson"
> To: "freebsd-net"
> Sent: Saturday, January 18, 2025 7:24:57 PM
> Subject: Re: FreeBSD 13: IPSec netisr overload causes unrelated packet loss

> Quick update -- tried the IPSec deferred input patch [1], no change.
>
> A few tunables I forgot to include as well (set roughly as sketched below):
> net.route.netisr_maxqlen: 256
> net.isr.numthreads: 32
> net.isr.maxprot: 16
> net.isr.defaultqlimit: 256
> net.isr.maxqlimit: 10240
> net.isr.bindthreads: 1
> net.isr.maxthreads: 32
> net.isr.dispatch: direct
>
> [1] https://www.mail-archive.com/freebsd-net@freebsd.org/msg64742.html
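>
> For completeness, these get set roughly as follows (a sketch rather than a
> verbatim config; as far as I know the thread/queue knobs are boot-time
> tunables, while the dispatch policy can also be changed at runtime):
>
>   # /boot/loader.conf (boot-time tunables)
>   net.isr.maxthreads="32"      # netisr worker threads (32 here, per the list above)
>   net.isr.bindthreads="1"      # pin each netisr thread to a CPU
>   net.isr.defaultqlimit="256"
>
>   # /etc/sysctl.conf, or sysctl(8) at runtime
>   net.isr.dispatch=direct
>
>   # inspect netisr policy, queue lengths and per-protocol drop counters
>   netstat -Q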
>
> ----- Original Message -----
>> From: "Timothy Pearson"
>> To: "freebsd-net"
>> Sent: Saturday, January 18, 2025 4:16:29 PM
>> Subject: FreeBSD 13: IPSec netisr overload causes unrelated packet loss
>>
>> Hi all,
>>
>> I've been pulling my hair out over a rather interesting problem that I've
>> traced to an interaction between IPSec and the rest of the network stack.
>> I'm not sure whether this is a bug or whether there's a tunable I'm missing
>> somewhere, so here goes...
>>
>> We have a pf-based multi-CPU firewall running FreeBSD 13.x with multiple
>> subnets directly attached, one per NIC, as well as multiple IPSec tunnels
>> to remote sites, alongside a UDP multicast proxy system (this becomes
>> important later). For the most part the setup works very well; however, we
>> have discovered through extensive trial and error / debugging that we can
>> induce major packet loss on the firewall host itself simply by flooding the
>> system with small IPSec packets (high PPS, low bandwidth).
>>
>> The aforementioned (custom) multicast UDP proxy is an excellent canary for
>> the problem, as it checks for and reports any dropped packets in the
>> receive data stream. Normally there are no dropped packets, even with
>> saturated links on any of the local interfaces or when *sending* high
>> packet rates over IPsec. As soon as high packet rates are *received* over
>> IPsec, the following happens:
>>
>> 1.) netisr on one core only goes to 100% interrupt load
>> 2.) net.inet.ip.intr_queue_drops starts incrementing rapidly
>> 3.) The multicast receiver, which only receives traffic from one of the
>> *local* interfaces (not any of the IPsec tunnels), begins to see packet
>> loss despite more than adequate buffers being in place and no buffer
>> overflows in the UDP stack / application buffering. The packets are simply
>> never received by the kernel UDP stack.
>> 4.) Other applications (e.g. NTP) start to see sporadic packet loss as
>> well, again on local traffic that does not go over IPsec.
>>
>> As soon as the IPSec receive traffic is lowered enough to get the netisr
>> interrupt load below 100% on the one CPU core, everything recovers and
>> functions normally. Note that this has to be done by lowering the IPSec
>> transmit rate on the remote system; I have not discovered any way to
>> "protect" the receiver from this kind of overload.
>>
>> While I would expect packet loss in an overloaded IPSec link scenario like
>> this simply because the decryption cannot keep up, I would also expect that
>> loss to be confined to the IPSec tunnel. It should not spider out into the
>> rest of the system and start affecting all of the other applications and
>> routing/firewalling on the box -- this is what made it miserable to debug,
>> as the IPSec link originally hit the PPS limits described above only
>> sporadically, during overnight batch processing. Now that I know what's
>> going on, I can provoke it easily with iperf3 in UDP mode. On the boxes we
>> are using, the limit seems to be around 50 kPPS before we hit 100% netisr
>> CPU load -- this limit is *much* lower with async crypto turned off.
>>
>> Important tunables already set:
>>
>> net.inet.ipsec.async_crypto=1 (turning this off just makes the symptoms
>> appear at lower PPS rates)
>> net.isr.dispatch=direct (deferred or hybrid does nothing to change the
>> symptoms)
>> net.inet.ip.intr_queue_maxlen=4096
>>
>> Thoughts are welcome... if there's any way to stop the "spread" of the
>> loss, I'm all ears. It seems that somehow the IPSec traffic (perhaps by
>> nature of its lengthy decryption process) is able to grab an unfair share
>> of netisr queue 0, and that interferes with the other traffic. If there
>> were a way to move the IPSec decryption to another netisr queue, that might
>> fix the problem, but I don't see any tunables to do so.
>>
>
> Thanks!
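
P.S. For reference, provoking and observing the overload boils down to roughly
the following (the peer address, rate and datagram size are placeholders; the
key is a high rate of small packets through the tunnel, not raw bandwidth):

  # on the remote IPsec peer: flood the tunnel with small UDP datagrams
  iperf3 -c 10.0.0.1 -u -l 200 -b 100M -t 60

  # on the firewall, while the flood is running:
  top -SH                                # one netisr thread pegged at ~100%
  netstat -Q                             # netisr queue configuration and drops
  sysctl net.inet.ip.intr_queue_drops    # the counter from item 2.) above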