Date: Wed, 12 Mar 2025 09:19:08 GMT
Message-Id: <202503120919.52C9J8V3082685@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org
From: Wei Hu
Subject: git: 5c97b7c296ac - stable/14 - mana: refill the rx mbuf in batch
List-Id: Commits to the stable branches of the FreeBSD src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-branches
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: whu
X-Git-Repository: src
X-Git-Refname: refs/heads/stable/14
X-Git-Reftype: branch
X-Git-Commit: 5c97b7c296ac46d88bd4b9076c920c0b2e3d6dc2
Auto-Submitted: auto-generated

The branch stable/14 has been updated by whu:

URL: https://cgit.FreeBSD.org/src/commit/?id=5c97b7c296ac46d88bd4b9076c920c0b2e3d6dc2

commit 5c97b7c296ac46d88bd4b9076c920c0b2e3d6dc2
Author:     Wei Hu
AuthorDate: 2025-02-27 08:02:46 +0000
Commit:     Wei Hu
CommitDate: 2025-03-12 09:09:32 +0000

    mana: refill the rx mbuf in batch

    Set the default refill threshold to be one quarter of the rx queue
    length. Users can change this value with hw.mana.rx_refill_thresh
    in loader.conf. This change improves rx completion handling,
    saving 10% to 15% of overall time.

    Tested by:      whu
    MFC after:      2 weeks
    Sponsored by:   Microsoft

    (cherry picked from commit 9b8701b81f14f0fa0787425eb9761b765d5faab0)
---
 sys/dev/mana/mana.h        |  10 ++++
 sys/dev/mana/mana_en.c     | 127 ++++++++++++++++++++++++++++++++++-----------
 sys/dev/mana/mana_sysctl.c |   7 +++
 3 files changed, 114 insertions(+), 30 deletions(-)

diff --git a/sys/dev/mana/mana.h b/sys/dev/mana/mana.h
index a805aa047b9d..a037eb3f05c7 100644
--- a/sys/dev/mana/mana.h
+++ b/sys/dev/mana/mana.h
@@ -149,6 +149,7 @@ struct mana_stats {
 	counter_u64_t			collapse_err;		/* tx */
 	counter_u64_t			dma_mapping_err;	/* rx, tx */
 	counter_u64_t			mbuf_alloc_fail;	/* rx */
+	counter_u64_t			partial_refill;		/* rx */
 	counter_u64_t			alt_chg;		/* tx */
 	counter_u64_t			alt_reset;		/* tx */
 	counter_u64_t			cqe_err;		/* tx */
@@ -441,6 +442,8 @@ struct mana_rxq {
 	uint32_t			num_rx_buf;
 
 	uint32_t			buf_index;
+	uint32_t			next_to_refill;
+	uint32_t			refill_thresh;
 
 	uint64_t			lro_tried;
 	uint64_t			lro_failed;
@@ -711,6 +714,13 @@ struct mana_cfg_rx_steer_resp {
 
 #define MANA_SHORT_VPORT_OFFSET_MAX	((1U << 8) - 1)
 
+#define MANA_IDX_NEXT(idx, size)	(((idx) + 1) & ((size) - 1))
+#define MANA_GET_SPACE(start_idx, end_idx, size)			\
+	(((end_idx) >= (start_idx)) ?					\
+	((end_idx) - (start_idx)) : ((size) - (start_idx) + (end_idx)))
+
+#define MANA_RX_REFILL_THRESH		256
+
 struct mana_tx_package {
 	struct gdma_wqe_request		wqe_req;
 	struct gdma_sge			sgl_array[MAX_MBUF_FRAGS];
diff --git a/sys/dev/mana/mana_en.c b/sys/dev/mana/mana_en.c
index 4734b34a9f3b..1df5419e6c64 100644
--- a/sys/dev/mana/mana_en.c
+++ b/sys/dev/mana/mana_en.c
@@ -69,6 +69,7 @@ static int mana_down(struct mana_port_context *apc);
 
 extern unsigned int mana_tx_req_size;
 extern unsigned int mana_rx_req_size;
+extern unsigned int mana_rx_refill_threshold;
 
 static void
 mana_rss_key_fill(void *k, size_t size)
@@ -638,8 +639,7 @@ mana_xmit(struct mana_txq *txq)
 			continue;
 		}
 
-		next_to_use =
-		    (next_to_use + 1) % tx_queue_size;
+		next_to_use = MANA_IDX_NEXT(next_to_use, tx_queue_size);
 
 		(void)atomic_inc_return(&txq->pending_sends);
 
@@ -1527,7 +1527,7 @@ mana_poll_tx_cq(struct mana_cq *cq)
 		mb();
 
 		next_to_complete =
-		    (next_to_complete + 1) % tx_queue_size;
+		    MANA_IDX_NEXT(next_to_complete, tx_queue_size);
 
 		pkt_transmitted++;
 	}
@@ -1592,18 +1592,11 @@ mana_poll_tx_cq(struct mana_cq *cq)
 }
 
 static void
-mana_post_pkt_rxq(struct mana_rxq *rxq)
+mana_post_pkt_rxq(struct mana_rxq *rxq,
+    struct mana_recv_buf_oob *recv_buf_oob)
 {
-	struct mana_recv_buf_oob *recv_buf_oob;
-	uint32_t curr_index;
 	int err;
 
-	curr_index = rxq->buf_index++;
-	if (rxq->buf_index == rxq->num_rx_buf)
-		rxq->buf_index = 0;
-
-	recv_buf_oob = &rxq->rx_oobs[curr_index];
-
 	err = mana_gd_post_work_request(rxq->gdma_rq, &recv_buf_oob->wqe_req,
 	    &recv_buf_oob->wqe_inf);
 	if (err) {
@@ -1722,6 +1715,68 @@ mana_rx_mbuf(struct mbuf *mbuf, struct mana_rxcomp_oob *cqe,
 	counter_exit();
 }
 
+static int
+mana_refill_rx_mbufs(struct mana_port_context *apc,
+    struct mana_rxq *rxq, uint32_t num)
+{
+	struct mana_recv_buf_oob *rxbuf_oob;
+	uint32_t next_to_refill;
+	uint32_t i;
+	int err;
+
+	next_to_refill = rxq->next_to_refill;
+
+	for (i = 0; i < num; i++) {
+		if (next_to_refill == rxq->buf_index) {
+			mana_warn(NULL, "refilling index reached current, "
+			    "aborted! rxq %u, oob idx %u\n",
+			    rxq->rxq_idx, next_to_refill);
+			break;
+		}
+
+		rxbuf_oob = &rxq->rx_oobs[next_to_refill];
+
+		if (likely(rxbuf_oob->mbuf == NULL)) {
+			err = mana_load_rx_mbuf(apc, rxq, rxbuf_oob, true);
+		} else {
+			mana_warn(NULL, "mbuf not null when refilling, "
+			    "rxq %u, oob idx %u, reusing\n",
+			    rxq->rxq_idx, next_to_refill);
+			err = mana_load_rx_mbuf(apc, rxq, rxbuf_oob, false);
+		}
+
+		if (unlikely(err != 0)) {
+			mana_dbg(NULL,
+			    "failed to load rx mbuf, err = %d, rxq = %u\n",
+			    err, rxq->rxq_idx);
+			counter_u64_add(rxq->stats.mbuf_alloc_fail, 1);
+			break;
+		}
+
+		mana_post_pkt_rxq(rxq, rxbuf_oob);
+
+		next_to_refill = MANA_IDX_NEXT(next_to_refill,
+		    rxq->num_rx_buf);
+	}
+
+	if (likely(i != 0)) {
+		struct gdma_context *gc =
+		    rxq->gdma_rq->gdma_dev->gdma_context;
+
+		mana_gd_wq_ring_doorbell(gc, rxq->gdma_rq);
+	}
+
+	if (unlikely(i < num)) {
+		counter_u64_add(rxq->stats.partial_refill, 1);
+		mana_dbg(NULL,
+		    "refilled rxq %u with only %u mbufs (%u requested)\n",
+		    rxq->rxq_idx, i, num);
+	}
+
+	rxq->next_to_refill = next_to_refill;
+	return (i);
+}
+
 static void
 mana_process_rx_cqe(struct mana_rxq *rxq, struct mana_cq *cq,
     struct gdma_comp *cqe)
@@ -1731,8 +1786,8 @@ mana_process_rx_cqe(struct mana_rxq *rxq, struct mana_cq *cq,
 	if_t ndev = rxq->ndev;
 	struct mana_port_context *apc;
 	struct mbuf *old_mbuf;
+	uint32_t refill_required;
 	uint32_t curr, pktlen;
-	int err;
 
 	switch (oob->cqe_hdr.cqe_type) {
 	case CQE_RX_OKAY:
@@ -1785,29 +1840,24 @@ mana_process_rx_cqe(struct mana_rxq *rxq, struct mana_cq *cq,
 
 	/* Unload DMA map for the old mbuf */
 	mana_unload_rx_mbuf(apc, rxq, rxbuf_oob, false);
-
-	/* Load a new mbuf to replace the old one */
-	err = mana_load_rx_mbuf(apc, rxq, rxbuf_oob, true);
-	if (err) {
-		mana_dbg(NULL,
-		    "failed to load rx mbuf, err = %d, packet dropped.\n",
-		    err);
-		counter_u64_add(rxq->stats.mbuf_alloc_fail, 1);
-		/*
-		 * Failed to load new mbuf, rxbuf_oob->mbuf is still
-		 * pointing to the old one. Drop the packet.
-		 */
-		old_mbuf = NULL;
-		/* Reload the existing mbuf */
-		mana_load_rx_mbuf(apc, rxq, rxbuf_oob, false);
-	}
+	/* Clear the mbuf pointer to avoid reuse */
+	rxbuf_oob->mbuf = NULL;
 
 	mana_rx_mbuf(old_mbuf, oob, rxq);
 
 drop:
 	mana_move_wq_tail(rxq->gdma_rq, rxbuf_oob->wqe_inf.wqe_size_in_bu);
 
-	mana_post_pkt_rxq(rxq);
+	rxq->buf_index = MANA_IDX_NEXT(rxq->buf_index, rxq->num_rx_buf);
+
+	/* Check if refill is needed */
+	refill_required = MANA_GET_SPACE(rxq->next_to_refill,
+	    rxq->buf_index, rxq->num_rx_buf);
+
+	if (refill_required >= rxq->refill_thresh) {
+		/* Refill empty rx_oobs with new mbufs */
+		mana_refill_rx_mbufs(apc, rxq, refill_required);
+	}
 }
 
 static void
@@ -2349,6 +2399,23 @@ mana_create_rxq(struct mana_port_context *apc, uint32_t rxq_idx,
 	mana_dbg(NULL, "Setting rxq %d datasize %d\n",
 	    rxq_idx, rxq->datasize);
 
+	/*
+	 * Two steps to set the mbuf refill_thresh.
+	 * 1) If mana_rx_refill_threshold is set, honor it.
+	 *    Set to default value otherwise.
+	 * 2) Select the smaller of 1) above and 1/4 of the
+	 *    rx buffer size.
+	 */
+	if (mana_rx_refill_threshold != 0)
+		rxq->refill_thresh = mana_rx_refill_threshold;
+	else
+		rxq->refill_thresh = MANA_RX_REFILL_THRESH;
+	rxq->refill_thresh = min_t(uint32_t,
+	    rxq->num_rx_buf / 4, rxq->refill_thresh);
+
+	mana_dbg(NULL, "Setting rxq %d refill thresh %u\n",
+	    rxq_idx, rxq->refill_thresh);
+
 	rxq->rxobj = INVALID_MANA_HANDLE;
 
 	err = mana_alloc_rx_wqe(apc, rxq, &rq_size, &cq_size);
diff --git a/sys/dev/mana/mana_sysctl.c b/sys/dev/mana/mana_sysctl.c
index acb3628f09bc..c2916f9004cd 100644
--- a/sys/dev/mana/mana_sysctl.c
+++ b/sys/dev/mana/mana_sysctl.c
@@ -36,6 +36,7 @@ int mana_log_level = MANA_ALERT | MANA_WARNING | MANA_INFO;
 
 unsigned int mana_tx_req_size;
 unsigned int mana_rx_req_size;
+unsigned int mana_rx_refill_threshold;
 
 SYSCTL_NODE(_hw, OID_AUTO, mana, CTLFLAG_RD | CTLFLAG_MPSAFE, 0,
     "MANA driver parameters");
@@ -44,6 +45,9 @@ SYSCTL_UINT(_hw_mana, OID_AUTO, tx_req_size, CTLFLAG_RWTUN,
     &mana_tx_req_size, 0, "requested number of unit of tx queue");
 SYSCTL_UINT(_hw_mana, OID_AUTO, rx_req_size, CTLFLAG_RWTUN,
     &mana_rx_req_size, 0, "requested number of unit of rx queue");
+SYSCTL_UINT(_hw_mana, OID_AUTO, rx_refill_thresh, CTLFLAG_RWTUN,
+    &mana_rx_refill_threshold, 0,
+    "number of rx slots before starting the refill");
 
 /*
  * Logging level for changing verbosity of the output
@@ -329,6 +333,9 @@ mana_sysctl_add_queues(struct mana_port_context *apc)
 		SYSCTL_ADD_COUNTER_U64(ctx, rx_list, OID_AUTO,
 		    "mbuf_alloc_fail", CTLFLAG_RD,
 		    &rx_stats->mbuf_alloc_fail, "Failed mbuf allocs");
+		SYSCTL_ADD_COUNTER_U64(ctx, rx_list, OID_AUTO,
+		    "partial_refill", CTLFLAG_RD,
+		    &rx_stats->partial_refill, "Partially refilled mbuf");
 		SYSCTL_ADD_COUNTER_U64(ctx, rx_list, OID_AUTO,
 		    "dma_mapping_err", CTLFLAG_RD,
 		    &rx_stats->dma_mapping_err, "DMA mapping errors");
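
The ring arithmetic and threshold selection in the diff above can be tried outside the kernel. The following is a minimal userland sketch, not driver code: the three macros are copied from the mana.h hunk, while struct toy_rxq, toy_refill_thresh() and toy_rx_complete() are hypothetical stand-ins invented here for illustration. Note that MANA_IDX_NEXT masks with size - 1, so it only matches the old "% size" arithmetic when the queue length is a power of two; the sketch asserts that as an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the mana.h hunk in this commit. */
#define MANA_IDX_NEXT(idx, size)	(((idx) + 1) & ((size) - 1))
#define MANA_GET_SPACE(start_idx, end_idx, size)			\
	(((end_idx) >= (start_idx)) ?					\
	((end_idx) - (start_idx)) : ((size) - (start_idx) + (end_idx)))
#define MANA_RX_REFILL_THRESH		256

/* Hypothetical stand-in for the ring cursors added to struct mana_rxq. */
struct toy_rxq {
	uint32_t num_rx_buf;		/* ring size, assumed power of two */
	uint32_t buf_index;		/* next slot to be completed */
	uint32_t next_to_refill;	/* oldest slot still lacking an mbuf */
	uint32_t refill_thresh;		/* refill once this many are empty */
};

/*
 * Mirrors the two-step selection in mana_create_rxq(): use the tunable
 * if it is nonzero, else the default, then cap at a quarter of the ring.
 */
static uint32_t
toy_refill_thresh(uint32_t tunable, uint32_t num_rx_buf)
{
	uint32_t thresh = (tunable != 0) ? tunable : MANA_RX_REFILL_THRESH;
	uint32_t quarter = num_rx_buf / 4;

	return (thresh < quarter ? thresh : quarter);
}

/*
 * Account for one rx completion. Returns the number of slots refilled,
 * or 0 while the count of empty slots is still below the threshold.
 * Every refill is assumed to succeed; the driver can stop early.
 */
static uint32_t
toy_rx_complete(struct toy_rxq *q)
{
	uint32_t pending;

	/* The mask in MANA_IDX_NEXT needs a power-of-two ring size. */
	assert(q->num_rx_buf != 0 &&
	    (q->num_rx_buf & (q->num_rx_buf - 1)) == 0);

	q->buf_index = MANA_IDX_NEXT(q->buf_index, q->num_rx_buf);
	pending = MANA_GET_SPACE(q->next_to_refill, q->buf_index,
	    q->num_rx_buf);
	if (pending < q->refill_thresh)
		return (0);

	/* The driver loads mbufs here and rings the doorbell once. */
	q->next_to_refill = q->buf_index;
	return (pending);
}
```

With a 16-slot ring and a threshold of 4, the first three calls to toy_rx_complete() return 0 and the fourth returns 4, which is the batching behaviour the commit message describes. To override the default at boot, the tunable named in the commit message would be set in loader.conf, for example hw.mana.rx_refill_thresh="128" (the value is only an illustration).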