Date: Fri, 24 Apr 2026 22:30:36 +0000
From: Colin Percival <cperciva@FreeBSD.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
Subject: git: 0f7b8f79f67b - main - ena: Budget rx descriptors, not packets
Message-ID: <69ebef0c.428ac.36a81d7c@gitrepo.freebsd.org>
The branch main has been updated by cperciva:

URL: https://cgit.FreeBSD.org/src/commit/?id=0f7b8f79f67b25cb0727c7b7d604eb1eec91fef1

commit 0f7b8f79f67b25cb0727c7b7d604eb1eec91fef1
Author:     Colin Percival <cperciva@FreeBSD.org>
AuthorDate: 2026-04-17 17:40:00 +0000
Commit:     Colin Percival <cperciva@FreeBSD.org>
CommitDate: 2026-04-24 22:30:13 +0000

    ena: Budget rx descriptors, not packets

    We had ENA_RX_BUDGET = 256 in order to allow up to 256 received
    packets to be processed before we do other cleanups (handling tx
    packets and, critically, refilling the rx buffer ring).  Since the
    ring holds 1024 buffers by default, this was fine for normal packets:
    We refill the ring when it falls below 7/8 full, and even with a large
    burst of incoming packets allowing it to fall by another 1/4 before we
    consider refilling the ring still leaves it at 7/8 - 1/4 = 5/8 full.

    With jumbos, the story is different: A 9k jumbo (as is used by default
    within the EC2 network) consumes 3 descriptors, so a single rx cleanup
    pass can consume 3/4 of the default-sized rx ring; if the rx buffer
    ring wasn't completely full before a packet burst arrives, this puts
    us perilously close to running out of rx buffers.

    This precise failure mode has been observed on some EC2 instance types
    within a Cluster Placement Group, resulting in the nominal 10 Gbps
    single-flow throughput between instances dropping to ~100 Mbps as a
    result of repeated rx overruns causing packet loss and ultimately
    retransmission timeouts.

    To correct this, switch from processing up to ENA_RX_BUDGET (256)
    packets to processing up to ENA_RX_DESC_BUDGET (256) descriptors (or
    slightly more, if we hit the limit in the middle of a packet).  This
    ensures that, even with jumbos, we refill the ring before processing
    most of a ring worth of descriptors, and returns the throughput to
    expected levels.

    Note that theoretically up to ENA_PKT_MAX_BUFS (19) descriptors can be
    used for a single packet, in which case even 54 packets would exhaust
    the default rx buffer ring; it's not clear if this ever occurs in
    practice, but this fix will address that case as well.

    Reviewed by:    akiyano
    Sponsored by:   Amazon
    MFC after:      6 days
    Differential Revision:  https://reviews.freebsd.org/D56479
---
 sys/dev/ena/ena.h          |  4 ++--
 sys/dev/ena/ena_datapath.c | 13 ++++++++++---
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/sys/dev/ena/ena.h b/sys/dev/ena/ena.h
index f67c7002327d..b2156437f847 100644
--- a/sys/dev/ena/ena.h
+++ b/sys/dev/ena/ena.h
@@ -99,8 +99,8 @@
  * of TCP retransmissions.
  */
 #define ENA_TX_BUDGET	128
-/* RX cleanup budget. -1 stands for infinity. */
-#define ENA_RX_BUDGET	256
+/* RX cleanup budget, in descriptors. -1 stands for infinity. */
+#define ENA_RX_DESC_BUDGET	256
 /*
  * How many times we can repeat cleanup in the io irq handling routine if the
  * RX or TX budget was depleted.
diff --git a/sys/dev/ena/ena_datapath.c b/sys/dev/ena/ena_datapath.c
index 57148d8ef81f..91e3e3b6e4cd 100644
--- a/sys/dev/ena/ena_datapath.c
+++ b/sys/dev/ena/ena_datapath.c
@@ -571,7 +571,7 @@ ena_rx_cleanup(struct ena_ring *rx_ring)
 	uint32_t do_if_input = 0;
 	unsigned int qid;
 	int rc, i;
-	int budget = ENA_RX_BUDGET;
+	int budget = (ENA_RX_DESC_BUDGET == -1) ? INT_MAX : ENA_RX_DESC_BUDGET;
 #ifdef DEV_NETMAP
 	int done;
 #endif /* DEV_NETMAP */
@@ -680,7 +680,14 @@ ena_rx_cleanup(struct ena_ring *rx_ring)
 		counter_u64_add_protected(rx_ring->rx_stats.cnt, 1);
 		counter_u64_add_protected(adapter->hw_stats.rx_packets, 1);
 		counter_exit();
-	} while (--budget);
+
+		/*
+		 * Adjust our budget; note that we count descriptors, not
+		 * packets, since we need to ensure we don't run out of rx
+		 * buffers when receiving jumbos.
+		 */
+		budget -= ena_rx_ctx.descs;
+	} while (budget > 0);
 
 	rx_ring->next_to_clean = next_to_clean;
 
@@ -695,7 +702,7 @@ ena_rx_cleanup(struct ena_ring *rx_ring)
 
 	tcp_lro_flush_all(&rx_ring->lro);
 
-	return (budget == 0);
+	return (budget <= 0);
 }
 
 static void
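
The budget arithmetic in the commit message can be checked with a small standalone model. The C sketch below is not driver code: the helper names, RING_SIZE, and BUDGET are illustrative assumptions based on the figures quoted in the message (a 1024-buffer ring and a budget of 256), and it assumes an unlimited backlog of pending packets that each use the same number of descriptors.

```c
#include <assert.h>

#define RING_SIZE	1024	/* default rx ring size quoted in the commit message */
#define BUDGET		256	/* value of ENA_RX_BUDGET / ENA_RX_DESC_BUDGET */

/*
 * Hypothetical model of the old loop: the budget counts packets, so one
 * cleanup pass consumes BUDGET * descs_per_pkt descriptors.
 */
int
descs_per_pass_pkt_budget(int descs_per_pkt)
{
	int budget = BUDGET, used = 0;

	assert(descs_per_pkt > 0);
	do {
		used += descs_per_pkt;
	} while (--budget);
	return (used);
}

/*
 * Hypothetical model of the new loop: the budget counts descriptors, so a
 * pass stops once at least BUDGET descriptors are consumed, overshooting
 * only by the remainder of the final packet.
 */
int
descs_per_pass_desc_budget(int descs_per_pkt)
{
	int budget = BUDGET, used = 0;

	assert(descs_per_pkt > 0);
	do {
		used += descs_per_pkt;
		budget -= descs_per_pkt;
	} while (budget > 0);
	return (used);
}

/* Smallest packet count whose descriptors cover the whole ring. */
int
pkts_to_exhaust_ring(int descs_per_pkt)
{
	assert(descs_per_pkt > 0);
	return ((RING_SIZE + descs_per_pkt - 1) / descs_per_pkt);
}
```

Under these assumptions, descs_per_pass_pkt_budget(3) is 768, which is the 3/4 of the ring mentioned for 9k jumbos, while descs_per_pass_desc_budget(3) is 258 and descs_per_pass_desc_budget(19) is 266, so a pass stays close to 1/4 of the ring regardless of packet size. pkts_to_exhaust_ring(19) is 54, matching the worst case cited for ENA_PKT_MAX_BUFS.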
