Date: Sat, 18 Jan 2025 19:24:57 -0600 (CST)
From: Timothy Pearson <tpearson@raptorengineering.com>
To: freebsd-net
Subject: Re: FreeBSD 13: IPSec netisr overload causes unrelated packet loss
Quick update -- tried the IPSec deferred update patch [1]; no change.

A few tunables I forgot to include as well:

net.route.netisr_maxqlen: 256
net.isr.numthreads: 32
net.isr.maxprot: 16
net.isr.defaultqlimit: 256
net.isr.maxqlimit: 10240
net.isr.bindthreads: 1
net.isr.maxthreads: 32
net.isr.dispatch: direct

[1] https://www.mail-archive.com/freebsd-net@freebsd.org/msg64742.html
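In case anyone wants to watch the overload develop on their own box, here's roughly what I'm running while the flood is active (a minimal sketch; the one-second poll interval is arbitrary):

#!/bin/sh
# Poll the IP input-queue drop counter alongside the netisr workstream
# statistics. net.inet.ip.intr_queue_drops is the counter mentioned in
# the original report; netstat -Q dumps per-CPU netisr queue
# configuration, queue lengths, and drop counts on FreeBSD.
while :; do
    date
    sysctl net.inet.ip.intr_queue_drops
    netstat -Q
    sleep 1
done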
----- Original Message -----
> From: "Timothy Pearson"
> To: "freebsd-net"
> Sent: Saturday, January 18, 2025 4:16:29 PM
> Subject: FreeBSD 13: IPSec netisr overload causes unrelated packet loss

> Hi all,
>
> I've been pulling my hair out over a rather interesting problem that I've
> traced to an interaction between IPSec and the rest of the network stack.
> I'm not sure if this is a bug or if there's a tunable I'm missing
> somewhere, so here goes...
>
> We have a pf-based multi-CPU firewall running FreeBSD 13.x with multiple
> subnets directly attached, one per NIC, as well as multiple IPSec tunnels
> to remote sites alongside a UDP multicast proxy system (this becomes
> important later). For the most part the setup works very well; however,
> we have discovered through extensive trial and error / debugging that we
> can induce major packet loss on the firewall host itself simply by
> flooding the system with small IPSec packets (high PPS, low bandwidth).
>
> The aforementioned (custom) multicast UDP proxy is an excellent canary
> for the problem, as it checks for and reports any dropped packets in the
> receive data stream. Normally there are no dropped packets, even with
> saturated links on any of the local interfaces or when *sending* high
> packet rates over IPsec. As soon as high packet rates are *received*
> over IPsec, the following happens:
>
> 1.) netisr goes to 100% interrupt load on one core only
> 2.) net.inet.ip.intr_queue_drops starts incrementing rapidly
> 3.) The multicast receiver, which only receives traffic from one of the
> *local* interfaces (not any of the IPsec tunnels), begins to see packet
> loss despite more than adequate buffers in place and no buffer overflows
> in the UDP stack / application buffering. The packets are simply never
> received by the kernel UDP stack.
> 4.) Other applications (e.g. NTP) start to see sporadic packet loss as
> well, again on local traffic, not over IPsec.
>
> As soon as the IPSec receive traffic is lowered enough to get the netisr
> interrupt load below 100% on the one CPU core, everything recovers and
> functions normally. Note this has to be done by lowering the IPSec
> transmit rate on the remote system; I have not discovered any way to
> "protect" the receiver from this kind of overload.
>
> While I would expect packet loss in an overloaded IPSec link scenario
> like this just due to the decryption not keeping up, I would also expect
> that loss to be confined to the IPSec tunnel. It should not spider out
> into the rest of the system and start affecting all of the other
> applications and routing/firewalling on the box -- this is what was
> miserable to debug, as the IPSec link originally hit the PPS limits
> described above only sporadically, during overnight batch processing.
> Now that I know what's going on, I can provoke it easily with iperf3 in
> UDP mode. On the boxes we are using, the limit seems to be around
> 50 kPPS before we hit 100% netisr CPU load -- this limit is *much* lower
> with async crypto turned off.
>
> Important tunables already set:
>
> net.inet.ipsec.async_crypto=1 (turning this off just makes the symptoms
> appear at lower PPS rates)
> net.isr.dispatch=direct (deferred or hybrid does nothing to change the
> symptoms)
> net.inet.ip.intr_queue_maxlen=4096
>
> Thoughts are welcome... if there's any way to stop the "spread" of the
> loss, I'm all ears. It seems that somehow the IPSec traffic (perhaps by
> nature of its lengthy decryption process) is able to grab an unfair share
> of netisr queue 0, and that interferes with the other traffic. If there
> were a way to move the IPSec decryption to another netisr queue, that
> might fix the problem, but I don't see any tunables to do so.
>
> Thanks!
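For anyone wanting to reproduce this, the trigger is just a small-packet UDP flood pushed through the tunnel. A sketch of the iperf3 invocation I'm using; the address is a placeholder and the rate is sized from the ~50 kPPS figure above:

# On a host behind the far end of the IPsec tunnel:
iperf3 -s

# On a host behind the near end, sending across the tunnel:
# 30 Mbit/s of 64-byte datagrams is roughly 30e6 / (64 * 8) ~= 58 kPPS,
# comfortably past the ~50 kPPS point where one netisr core saturates here.
iperf3 -c 192.0.2.10 -u -b 30M -l 64 -t 60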
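And for completeness, where those tunables live on 13.x as I understand it -- the net.isr thread knobs are boot-time tunables, while the rest can be changed at runtime:

# /boot/loader.conf -- read at boot; thread count/binding can't be
# changed on a running system
net.isr.maxthreads="32"
net.isr.bindthreads="1"

# /etc/sysctl.conf, or via sysctl(8) at runtime
net.isr.dispatch=direct
net.inet.ip.intr_queue_maxlen=4096
net.inet.ipsec.async_crypto=1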