From: Navdeep Parhar
Date: Mon, 29 Sep 2025 15:22:29 GMT
Message-Id: <202509291522.58TFMTuf011836@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
Subject: git: 19d9a9b15178 - main - cxgbe: Move the STAG and PBL memory pool arenas to the base driver
List-Id: Commit messages for the main branch of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main
X-Git-Committer: np
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 19d9a9b15178ed7cfe3f463f43e28cce13fc4f94

The branch main has been updated by np:

URL: https://cgit.FreeBSD.org/src/commit/?id=19d9a9b15178ed7cfe3f463f43e28cce13fc4f94

commit 19d9a9b15178ed7cfe3f463f43e28cce13fc4f94
Author:     John Baldwin
AuthorDate: 2025-09-29 14:55:16 +0000
Commit:     Navdeep Parhar
CommitDate: 2025-09-29 15:19:11 +0000

    cxgbe: Move the STAG and PBL memory pool arenas to the base driver

    Both RDMA (iw_cxgbe) and NVMe offloads use TPT table entries to map
    transaction tags in incoming PDUs to buffers in host memory
    permitting direct placement of received data into host memory
    buffers avoiding copies (iSCSI offload uses a different scheme for
    mapping tags to host memory).

    Move the vmem arenas for the supporting card memory regions from
    iw_cxgbe to the main driver so they can be shared with the NVMe
    offload driver.

    In addition, add some helper routines for constructing work
    requests to update TPT table entries.

    MFC after:      3 days
    Sponsored by:   Chelsio Communications
---
 sys/conf/files                      |   2 +
 sys/dev/cxgbe/adapter.h             |  24 +++++
 sys/dev/cxgbe/iw_cxgbe/device.c     |  20 ++--
 sys/dev/cxgbe/iw_cxgbe/iw_cxgbe.h   |   5 +-
 sys/dev/cxgbe/iw_cxgbe/mem.c        | 110 ++++----------------
 sys/dev/cxgbe/iw_cxgbe/resource.c   |  38 +------
 sys/dev/cxgbe/iw_cxgbe/t4.h         |   1 -
 sys/dev/cxgbe/t4_main.c             |   2 +
 sys/dev/cxgbe/t4_tpt.c              | 193 ++++++++++++++++++++++++++++++++++++
 sys/dev/cxgbe/tom/t4_tom.h          |   6 ++
 sys/modules/cxgbe/if_cxgbe/Makefile |   1 +
 11 files changed, 261 insertions(+), 141 deletions(-)

diff --git a/sys/conf/files b/sys/conf/files
index 63bf5c3fd724..6da1f7e97973 100644
--- a/sys/conf/files
+++ b/sys/conf/files
@@ -1393,6 +1393,8 @@ dev/cxgbe/t4_smt.c		optional cxgbe pci \
 	compile-with "${NORMAL_C} -I$S/dev/cxgbe"
 dev/cxgbe/t4_l2t.c		optional cxgbe pci \
 	compile-with "${NORMAL_C} -I$S/dev/cxgbe"
+dev/cxgbe/t4_tpt.c		optional cxgbe pci \
+	compile-with "${NORMAL_C} -I$S/dev/cxgbe"
 dev/cxgbe/t4_tracer.c		optional cxgbe pci \
 	compile-with "${NORMAL_C} -I$S/dev/cxgbe"
 dev/cxgbe/t4_vf.c		optional cxgbev pci \
diff --git a/sys/dev/cxgbe/adapter.h b/sys/dev/cxgbe/adapter.h
index e3906f8058a7..ac8cdddd41e5 100644
--- a/sys/dev/cxgbe/adapter.h
+++ b/sys/dev/cxgbe/adapter.h
@@ -971,6 +971,9 @@ struct adapter {
 	vmem_t *key_map;
 	struct tls_tunables tlst;
 
+	vmem_t *pbl_arena;
+	vmem_t *stag_arena;
+
 	uint8_t doorbells;
 	int offload_map;	/* port_id's with IFCAP_TOE enabled */
 	int bt_map;		/* hw_port's that are BASE-T */
@@ -1549,6 +1552,27 @@ int t4_hashfilter_tcb_rpl(struct sge_iq *, const struct rss_header *, struct mbu
 int t4_del_hashfilter_rpl(struct sge_iq *, const struct rss_header *, struct mbuf *);
 void free_hftid_hash(struct tid_info *);
 
+/* t4_tpt.c */
+#define T4_STAG_UNSET 0xffffffff
+#define T4_WRITE_MEM_DMA_LEN						\
+	roundup2(sizeof(struct ulp_mem_io) + sizeof(struct ulptx_sgl), 16)
+#define T4_ULPTX_MIN_IO 32
+#define T4_MAX_INLINE_SIZE 96
+#define T4_WRITE_MEM_INLINE_LEN(len)					\
+	roundup2(sizeof(struct ulp_mem_io) + sizeof(struct ulptx_idata) + \
+	    roundup((len), T4_ULPTX_MIN_IO), 16)
+
+uint32_t t4_pblpool_alloc(struct adapter *, int);
+void t4_pblpool_free(struct adapter *, uint32_t, int);
+uint32_t t4_stag_alloc(struct adapter *, int);
+void t4_stag_free(struct adapter *, uint32_t, int);
+void t4_init_tpt(struct adapter *);
+void t4_free_tpt(struct adapter *);
+void t4_write_mem_dma_wr(struct adapter *, void *, int, int, uint32_t,
+    uint32_t, vm_paddr_t, uint64_t);
+void t4_write_mem_inline_wr(struct adapter *, void *, int, int, uint32_t,
+    uint32_t, void *, uint64_t);
+
 static inline struct wrqe *
 alloc_wrqe(int wr_len, struct sge_wrq *wrq)
 {
diff --git a/sys/dev/cxgbe/iw_cxgbe/device.c b/sys/dev/cxgbe/iw_cxgbe/device.c
index 3c4d269f6c69..4610f91e96ac 100644
--- a/sys/dev/cxgbe/iw_cxgbe/device.c
+++ b/sys/dev/cxgbe/iw_cxgbe/device.c
@@ -132,26 +132,21 @@ c4iw_rdev_open(struct c4iw_rdev *rdev)
 	rdev->stats.rqt.total = sc->vres.rq.size;
 	rdev->stats.qid.total = sc->vres.qp.size;
 
-	rc = c4iw_init_resource(rdev, c4iw_num_stags(rdev), T4_MAX_NUM_PD);
+	rc = c4iw_init_resource(rdev, T4_MAX_NUM_PD);
 	if (rc) {
 		device_printf(sc->dev, "error %d initializing resources\n", rc);
 		goto err1;
 	}
-	rc = c4iw_pblpool_create(rdev);
-	if (rc) {
-		device_printf(sc->dev, "error %d initializing pbl pool\n", rc);
-		goto err2;
-	}
 	rc = c4iw_rqtpool_create(rdev);
 	if (rc) {
 		device_printf(sc->dev, "error %d initializing rqt pool\n", rc);
-		goto err3;
+		goto err2;
 	}
 	rdev->status_page = (struct t4_dev_status_page *)
 				__get_free_page(GFP_KERNEL);
 	if (!rdev->status_page) {
 		rc = -ENOMEM;
-		goto err4;
+		goto err3;
 	}
 	rdev->status_page->qp_start = sc->vres.qp.start;
 	rdev->status_page->qp_size = sc->vres.qp.size;
@@ -168,15 +163,13 @@ c4iw_rdev_open(struct c4iw_rdev *rdev)
 	rdev->free_workq = create_singlethread_workqueue("iw_cxgb4_free");
 	if (!rdev->free_workq) {
 		rc = -ENOMEM;
-		goto err5;
+		goto err4;
 	}
 	return (0);
-err5:
-	free_page((unsigned long)rdev->status_page);
 err4:
-	c4iw_rqtpool_destroy(rdev);
+	free_page((unsigned long)rdev->status_page);
 err3:
-	c4iw_pblpool_destroy(rdev);
+	c4iw_rqtpool_destroy(rdev);
 err2:
 	c4iw_destroy_resource(&rdev->resource);
 err1:
@@ -186,7 +179,6 @@ err1:
 static void c4iw_rdev_close(struct c4iw_rdev *rdev)
 {
 	free_page((unsigned long)rdev->status_page);
-	c4iw_pblpool_destroy(rdev);
 	c4iw_rqtpool_destroy(rdev);
 	c4iw_destroy_resource(&rdev->resource);
 }
diff --git a/sys/dev/cxgbe/iw_cxgbe/iw_cxgbe.h b/sys/dev/cxgbe/iw_cxgbe/iw_cxgbe.h
index ca2595b65b02..47ce10562c66 100644
--- a/sys/dev/cxgbe/iw_cxgbe/iw_cxgbe.h
+++ b/sys/dev/cxgbe/iw_cxgbe/iw_cxgbe.h
@@ -99,7 +99,6 @@ struct c4iw_id_table {
 };
 
 struct c4iw_resource {
-	struct c4iw_id_table tpt_table;
 	struct c4iw_id_table qid_table;
 	struct c4iw_id_table pdid_table;
 };
@@ -904,11 +903,9 @@ int c4iw_ep_redirect(void *ctx, struct dst_entry *old, struct dst_entry *new,
 		     struct l2t_entry *l2t);
 u32 c4iw_get_resource(struct c4iw_id_table *id_table);
 void c4iw_put_resource(struct c4iw_id_table *id_table, u32 entry);
-int c4iw_init_resource(struct c4iw_rdev *rdev, u32 nr_tpt, u32 nr_pdid);
+int c4iw_init_resource(struct c4iw_rdev *rdev, u32 nr_pdid);
 int c4iw_init_ctrl_qp(struct c4iw_rdev *rdev);
-int c4iw_pblpool_create(struct c4iw_rdev *rdev);
 int c4iw_rqtpool_create(struct c4iw_rdev *rdev);
-void c4iw_pblpool_destroy(struct c4iw_rdev *rdev);
 void c4iw_rqtpool_destroy(struct c4iw_rdev *rdev);
 void c4iw_destroy_resource(struct c4iw_resource *rscp);
 int c4iw_destroy_ctrl_qp(struct c4iw_rdev *rdev);
diff --git a/sys/dev/cxgbe/iw_cxgbe/mem.c b/sys/dev/cxgbe/iw_cxgbe/mem.c
index 9e879bde6169..ae0aa0edc17a 100644
--- a/sys/dev/cxgbe/iw_cxgbe/mem.c
+++ b/sys/dev/cxgbe/iw_cxgbe/mem.c
@@ -56,49 +56,23 @@ mr_exceeds_hw_limits(struct c4iw_dev *dev, u64 length)
 
 static int
 _c4iw_write_mem_dma_aligned(struct c4iw_rdev *rdev, u32 addr, u32 len,
-			    void *data, int wait)
+			    dma_addr_t data, int wait)
 {
 	struct adapter *sc = rdev->adap;
-	struct ulp_mem_io *ulpmc;
-	struct ulptx_sgl *sgl;
 	u8 wr_len;
 	int ret = 0;
 	struct c4iw_wr_wait wr_wait;
 	struct wrqe *wr;
 
-	addr &= 0x7FFFFFF;
-
 	if (wait)
 		c4iw_init_wr_wait(&wr_wait);
-	wr_len = roundup(sizeof *ulpmc + sizeof *sgl, 16);
+	wr_len = T4_WRITE_MEM_DMA_LEN;
 
 	wr = alloc_wrqe(wr_len, &sc->sge.ctrlq[0]);
 	if (wr == NULL)
 		return -ENOMEM;
-	ulpmc = wrtod(wr);
-
-	memset(ulpmc, 0, wr_len);
-	INIT_ULPTX_WR(ulpmc, wr_len, 0, 0);
-	ulpmc->wr.wr_hi = cpu_to_be32(V_FW_WR_OP(FW_ULPTX_WR) |
-				      (wait ? F_FW_WR_COMPL : 0));
-	ulpmc->wr.wr_lo = wait ? (u64)(unsigned long)&wr_wait : 0;
-	ulpmc->wr.wr_mid = cpu_to_be32(V_FW_WR_LEN16(DIV_ROUND_UP(wr_len, 16)));
-	ulpmc->cmd = cpu_to_be32(V_ULPTX_CMD(ULP_TX_MEM_WRITE) |
-				 V_T5_ULP_MEMIO_ORDER(1) |
-			V_T5_ULP_MEMIO_FID(sc->sge.ofld_rxq[0].iq.abs_id));
-	if (chip_id(sc) >= CHELSIO_T7)
-		ulpmc->dlen = cpu_to_be32(V_T7_ULP_MEMIO_DATA_LEN(len>>5));
-	else
-		ulpmc->dlen = cpu_to_be32(V_ULP_MEMIO_DATA_LEN(len>>5));
-	ulpmc->len16 = cpu_to_be32(DIV_ROUND_UP(wr_len-sizeof(ulpmc->wr), 16));
-	ulpmc->lock_addr = cpu_to_be32(V_ULP_MEMIO_ADDR(addr));
-
-	sgl = (struct ulptx_sgl *)(ulpmc + 1);
-	sgl->cmd_nsge = cpu_to_be32(V_ULPTX_CMD(ULP_TX_SC_DSGL) |
-				    V_ULPTX_NSGE(1));
-	sgl->len0 = cpu_to_be32(len);
-	sgl->addr0 = cpu_to_be64((u64)data);
-
+	t4_write_mem_dma_wr(sc, wrtod(wr), wr_len, 0, addr, len, data,
+	    wait ? (u64)(unsigned long)&wr_wait : 0);
 	t4_wrq_tx(sc, wr);
 
 	if (wait)
@@ -111,74 +85,32 @@ static int
 _c4iw_write_mem_inline(struct c4iw_rdev *rdev, u32 addr, u32 len, void *data)
 {
 	struct adapter *sc = rdev->adap;
-	struct ulp_mem_io *ulpmc;
-	struct ulptx_idata *ulpsc;
-	u8 wr_len, *to_dp, *from_dp;
+	u8 wr_len, *from_dp;
 	int copy_len, num_wqe, i, ret = 0;
 	struct c4iw_wr_wait wr_wait;
 	struct wrqe *wr;
-	u32 cmd;
-
-	cmd = cpu_to_be32(V_ULPTX_CMD(ULP_TX_MEM_WRITE));
-	cmd |= cpu_to_be32(F_T5_ULP_MEMIO_IMM);
-
-	addr &= 0x7FFFFFF;
 
 	CTR3(KTR_IW_CXGBE, "%s addr 0x%x len %u", __func__, addr, len);
-	num_wqe = DIV_ROUND_UP(len, C4IW_MAX_INLINE_SIZE);
 	c4iw_init_wr_wait(&wr_wait);
+	num_wqe = DIV_ROUND_UP(len, T4_MAX_INLINE_SIZE);
+	from_dp = data;
 	for (i = 0; i < num_wqe; i++) {
-
-		copy_len = min(len, C4IW_MAX_INLINE_SIZE);
-		wr_len = roundup(sizeof *ulpmc + sizeof *ulpsc +
-				 roundup(copy_len, T4_ULPTX_MIN_IO), 16);
+		copy_len = min(len, T4_MAX_INLINE_SIZE);
+		wr_len = T4_WRITE_MEM_INLINE_LEN(copy_len);
 
 		wr = alloc_wrqe(wr_len, &sc->sge.ctrlq[0]);
 		if (wr == NULL)
 			return -ENOMEM;
-		ulpmc = wrtod(wr);
-
-		memset(ulpmc, 0, wr_len);
-		INIT_ULPTX_WR(ulpmc, wr_len, 0, 0);
-
-		if (i == (num_wqe-1)) {
-			ulpmc->wr.wr_hi = cpu_to_be32(V_FW_WR_OP(FW_ULPTX_WR) |
-						      F_FW_WR_COMPL);
-			ulpmc->wr.wr_lo =
-				(__force __be64)(unsigned long) &wr_wait;
-		} else
-			ulpmc->wr.wr_hi = cpu_to_be32(V_FW_WR_OP(FW_ULPTX_WR));
-		ulpmc->wr.wr_mid = cpu_to_be32(
-				V_FW_WR_LEN16(DIV_ROUND_UP(wr_len, 16)));
-
-		ulpmc->cmd = cmd;
-		if (chip_id(sc) >= CHELSIO_T7)
-			ulpmc->dlen = cpu_to_be32(V_T7_ULP_MEMIO_DATA_LEN(
-			    DIV_ROUND_UP(copy_len, T4_ULPTX_MIN_IO)));
-		else
-			ulpmc->dlen = cpu_to_be32(V_ULP_MEMIO_DATA_LEN(
-			    DIV_ROUND_UP(copy_len, T4_ULPTX_MIN_IO)));
-		ulpmc->len16 = cpu_to_be32(DIV_ROUND_UP(wr_len-sizeof(ulpmc->wr),
-						      16));
-		ulpmc->lock_addr = cpu_to_be32(V_ULP_MEMIO_ADDR(addr + i * 3));
-
-		ulpsc = (struct ulptx_idata *)(ulpmc + 1);
-		ulpsc->cmd_more = cpu_to_be32(V_ULPTX_CMD(ULP_TX_SC_IMM));
-		ulpsc->len = cpu_to_be32(roundup(copy_len, T4_ULPTX_MIN_IO));
-
-		to_dp = (u8 *)(ulpsc + 1);
-		from_dp = (u8 *)data + i * C4IW_MAX_INLINE_SIZE;
-		if (data)
-			memcpy(to_dp, from_dp, copy_len);
-		else
-			memset(to_dp, 0, copy_len);
-		if (copy_len % T4_ULPTX_MIN_IO)
-			memset(to_dp + copy_len, 0, T4_ULPTX_MIN_IO -
-			       (copy_len % T4_ULPTX_MIN_IO));
+		t4_write_mem_inline_wr(sc, wrtod(wr), wr_len, 0, addr, copy_len,
+		    from_dp, i == (num_wqe - 1) ?
+		    (__force __be64)(unsigned long) &wr_wait : 0);
 		t4_wrq_tx(sc, wr);
 
-		len -= C4IW_MAX_INLINE_SIZE;
-	}
+		if (from_dp != NULL)
+			from_dp += T4_MAX_INLINE_SIZE;
+		addr += T4_MAX_INLINE_SIZE >> 5;
+		len -= T4_MAX_INLINE_SIZE;
+	}
 	ret = c4iw_wait_for_reply(rdev, &wr_wait, 0, 0, NULL, __func__);
 	return ret;
 }
@@ -208,7 +140,7 @@ _c4iw_write_mem_dma(struct c4iw_rdev *rdev, u32 addr, u32 len, void *data)
 			dmalen = T4_ULPTX_MAX_DMA;
 		remain -= dmalen;
 		ret = _c4iw_write_mem_dma_aligned(rdev, addr, dmalen,
-						  (void *)daddr, !remain);
+						  daddr, !remain);
 		if (ret)
 			goto out;
 		addr += dmalen >> 5;
@@ -270,8 +202,8 @@ static int write_tpt_entry(struct c4iw_rdev *rdev, u32 reset_tpt_entry,
 	stag_idx = (*stag) >> 8;
 
 	if ((!reset_tpt_entry) && (*stag == T4_STAG_UNSET)) {
-		stag_idx = c4iw_get_resource(&rdev->resource.tpt_table);
-		if (!stag_idx) {
+		stag_idx = t4_stag_alloc(rdev->adap, 1);
+		if (stag_idx == T4_STAG_UNSET) {
 			mutex_lock(&rdev->stats.lock);
 			rdev->stats.stag.fail++;
 			mutex_unlock(&rdev->stats.lock);
@@ -316,7 +248,7 @@ static int write_tpt_entry(struct c4iw_rdev *rdev, u32 reset_tpt_entry,
 				sizeof(tpt), &tpt);
 
 	if (reset_tpt_entry) {
-		c4iw_put_resource(&rdev->resource.tpt_table, stag_idx);
+		t4_stag_free(rdev->adap, stag_idx, 1);
 		mutex_lock(&rdev->stats.lock);
 		rdev->stats.stag.cur -= 32;
 		mutex_unlock(&rdev->stats.lock);
diff --git a/sys/dev/cxgbe/iw_cxgbe/resource.c b/sys/dev/cxgbe/iw_cxgbe/resource.c
index 644ea0c631bf..cd20f1eafdd6 100644
--- a/sys/dev/cxgbe/iw_cxgbe/resource.c
+++ b/sys/dev/cxgbe/iw_cxgbe/resource.c
@@ -59,13 +59,9 @@ static int c4iw_init_qid_table(struct c4iw_rdev *rdev)
 }
 
 /* nr_* must be power of 2 */
-int c4iw_init_resource(struct c4iw_rdev *rdev, u32 nr_tpt, u32 nr_pdid)
+int c4iw_init_resource(struct c4iw_rdev *rdev, u32 nr_pdid)
 {
 	int err = 0;
-	err = c4iw_id_table_alloc(&rdev->resource.tpt_table, 0, nr_tpt, 1,
-					C4IW_ID_TABLE_F_RANDOM);
-	if (err)
-		goto tpt_err;
 	err = c4iw_init_qid_table(rdev);
 	if (err)
 		goto qid_err;
@@ -77,8 +73,6 @@ int c4iw_init_resource(struct c4iw_rdev *rdev, u32 nr_tpt, u32 nr_pdid)
  pdid_err:
 	c4iw_id_table_free(&rdev->resource.qid_table);
  qid_err:
-	c4iw_id_table_free(&rdev->resource.tpt_table);
- tpt_err:
 	return -ENOMEM;
 }
 
@@ -243,7 +237,6 @@ void c4iw_put_qpid(struct c4iw_rdev *rdev, u32 qid,
 
 void c4iw_destroy_resource(struct c4iw_resource *rscp)
 {
-	c4iw_id_table_free(&rscp->tpt_table);
 	c4iw_id_table_free(&rscp->qid_table);
 	c4iw_id_table_free(&rscp->pdid_table);
 }
@@ -254,12 +247,9 @@ void c4iw_destroy_resource(struct c4iw_resource *rscp)
 
 u32 c4iw_pblpool_alloc(struct c4iw_rdev *rdev, int size)
 {
-	unsigned long addr;
+	u32 addr;
 
-	vmem_xalloc(rdev->pbl_arena, roundup(size, (1 << MIN_PBL_SHIFT)),
-			4, 0, 0, VMEM_ADDR_MIN, VMEM_ADDR_MAX,
-			M_FIRSTFIT|M_NOWAIT, &addr);
-	CTR3(KTR_IW_CXGBE, "%s addr 0x%x size %d", __func__, (u32)addr, size);
+	addr = t4_pblpool_alloc(rdev->adap, size);
 	mutex_lock(&rdev->stats.lock);
 	if (addr) {
 		rdev->stats.pbl.cur += roundup(size, 1 << MIN_PBL_SHIFT);
@@ -268,33 +258,15 @@ u32 c4iw_pblpool_alloc(struct c4iw_rdev *rdev, int size)
 	} else
 		rdev->stats.pbl.fail++;
 	mutex_unlock(&rdev->stats.lock);
-	return (u32)addr;
+	return addr;
 }
 
 void c4iw_pblpool_free(struct c4iw_rdev *rdev, u32 addr, int size)
 {
-	CTR3(KTR_IW_CXGBE, "%s addr 0x%x size %d", __func__, addr, size);
 	mutex_lock(&rdev->stats.lock);
 	rdev->stats.pbl.cur -= roundup(size, 1 << MIN_PBL_SHIFT);
 	mutex_unlock(&rdev->stats.lock);
-	vmem_xfree(rdev->pbl_arena, addr, roundup(size,(1 << MIN_PBL_SHIFT)));
-}
-
-int c4iw_pblpool_create(struct c4iw_rdev *rdev)
-{
-	rdev->pbl_arena = vmem_create("PBL_MEM_POOL",
-					rdev->adap->vres.pbl.start,
-					rdev->adap->vres.pbl.size,
-					1, 0, M_FIRSTFIT| M_NOWAIT);
-	if (!rdev->pbl_arena)
-		return -ENOMEM;
-
-	return 0;
-}
-
-void c4iw_pblpool_destroy(struct c4iw_rdev *rdev)
-{
-	vmem_destroy(rdev->pbl_arena);
+	t4_pblpool_free(rdev->adap, addr, size);
 }
 
 /* RQT Memory Manager. */
diff --git a/sys/dev/cxgbe/iw_cxgbe/t4.h b/sys/dev/cxgbe/iw_cxgbe/t4.h
index 48f85cf7965b..ffb610420640 100644
--- a/sys/dev/cxgbe/iw_cxgbe/t4.h
+++ b/sys/dev/cxgbe/iw_cxgbe/t4.h
@@ -64,7 +64,6 @@
 #define T4_MAX_NUM_PD 65536
 #define T4_MAX_MR_SIZE (~0ULL)
 #define T4_PAGESIZE_MASK 0xffffffff000 /* 4KB-8TB */
-#define T4_STAG_UNSET 0xffffffff
 #define T4_FW_MAJ 0
 #define A_PCIE_MA_SYNC 0x30b4
 
diff --git a/sys/dev/cxgbe/t4_main.c b/sys/dev/cxgbe/t4_main.c
index e9cb56562abe..388c5c104d19 100644
--- a/sys/dev/cxgbe/t4_main.c
+++ b/sys/dev/cxgbe/t4_main.c
@@ -1655,6 +1655,7 @@ t4_attach(device_t dev)
 	if (sc->vres.key.size != 0)
 		sc->key_map = vmem_create("T4TLS key map", sc->vres.key.start,
 		    sc->vres.key.size, 32, 0, M_FIRSTFIT | M_WAITOK);
+	t4_init_tpt(sc);
 
 	/*
 	 * Second pass over the ports. This time we know the number of rx and
@@ -1956,6 +1957,7 @@ t4_detach_common(device_t dev)
 #endif
 	if (sc->key_map)
 		vmem_destroy(sc->key_map);
+	t4_free_tpt(sc);
 #ifdef INET6
 	t4_destroy_clip_table(sc);
 #endif
diff --git a/sys/dev/cxgbe/t4_tpt.c b/sys/dev/cxgbe/t4_tpt.c
new file mode 100644
index 000000000000..d18eabb026f1
--- /dev/null
+++ b/sys/dev/cxgbe/t4_tpt.c
@@ -0,0 +1,193 @@
+/*-
+ * SPDX-License-Identifier: BSD-2-Clause
+ *
+ * Copyright (c) 2023 Chelsio Communications, Inc.
+ * Written by: John Baldwin
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include "common/common.h"
+
+/*
+ * Support routines to manage TPT entries used for both RDMA and NVMe
+ * offloads. This includes allocating STAG indices and managing the
+ * PBL pool.
+ */
+
+#define T4_ULPTX_MIN_IO 32
+#define T4_MAX_INLINE_SIZE 96
+#define T4_ULPTX_MAX_DMA 1024
+
+/* PBL and STAG Memory Managers. */
+
+#define MIN_PBL_SHIFT 5 /* 32B == min PBL size (4 entries) */
+
+uint32_t
+t4_pblpool_alloc(struct adapter *sc, int size)
+{
+	vmem_addr_t addr;
+
+	if (vmem_xalloc(sc->pbl_arena, roundup(size, (1 << MIN_PBL_SHIFT)),
+	    4, 0, 0, VMEM_ADDR_MIN, VMEM_ADDR_MAX, M_FIRSTFIT | M_NOWAIT,
+	    &addr) != 0)
+		return (0);
+#ifdef VERBOSE_TRACES
+	CTR(KTR_CXGBE, "%s: addr 0x%lx size %d", __func__, addr, size);
+#endif
+	return (addr);
+}
+
+void
+t4_pblpool_free(struct adapter *sc, uint32_t addr, int size)
+{
+#ifdef VERBOSE_TRACES
+	CTR(KTR_CXGBE, "%s: addr 0x%x size %d", __func__, addr, size);
+#endif
+	vmem_xfree(sc->pbl_arena, addr, roundup(size, (1 << MIN_PBL_SHIFT)));
+}
+
+uint32_t
+t4_stag_alloc(struct adapter *sc, int size)
+{
+	vmem_addr_t stag_idx;
+
+	if (vmem_alloc(sc->stag_arena, size, M_FIRSTFIT | M_NOWAIT,
+	    &stag_idx) != 0)
+		return (T4_STAG_UNSET);
+#ifdef VERBOSE_TRACES
+	CTR(KTR_CXGBE, "%s: idx 0x%lx size %d", __func__, stag_idx, size);
+#endif
+	return (stag_idx);
+}
+
+void
+t4_stag_free(struct adapter *sc, uint32_t stag_idx, int size)
+{
+#ifdef VERBOSE_TRACES
+	CTR(KTR_CXGBE, "%s: idx 0x%x size %d", __func__, stag_idx, size);
+#endif
+	vmem_free(sc->stag_arena, stag_idx, size);
+}
+
+void
+t4_init_tpt(struct adapter *sc)
+{
+	if (sc->vres.pbl.size != 0)
+		sc->pbl_arena = vmem_create("PBL_MEM_POOL", sc->vres.pbl.start,
+		    sc->vres.pbl.size, 1, 0, M_FIRSTFIT | M_WAITOK);
+	if (sc->vres.stag.size != 0)
+		sc->stag_arena = vmem_create("STAG", 1,
+		    sc->vres.stag.size >> 5, 1, 0, M_FIRSTFIT | M_WAITOK);
+}
+
+void
+t4_free_tpt(struct adapter *sc)
+{
+	if (sc->pbl_arena != NULL)
+		vmem_destroy(sc->pbl_arena);
+	if (sc->stag_arena != NULL)
+		vmem_destroy(sc->stag_arena);
+}
+
+/*
+ * TPT support routines. TPT entries are stored in the STAG adapter
+ * memory region and are written to via ULP_TX_MEM_WRITE commands in
+ * FW_ULPTX_WR work requests.
+ */
+
+void
+t4_write_mem_dma_wr(struct adapter *sc, void *wr, int wr_len, int tid,
+    uint32_t addr, uint32_t len, vm_paddr_t data, uint64_t cookie)
+{
+	struct ulp_mem_io *ulpmc;
+	struct ulptx_sgl *sgl;
+
+	MPASS(wr_len == T4_WRITE_MEM_DMA_LEN);
+
+	addr &= 0x7FFFFFF;
+
+	memset(wr, 0, wr_len);
+	ulpmc = wr;
+	INIT_ULPTX_WR(ulpmc, wr_len, 0, tid);
+	if (cookie != 0) {
+		ulpmc->wr.wr_hi |= htobe32(F_FW_WR_COMPL);
+		ulpmc->wr.wr_lo = cookie;
+	}
+	ulpmc->cmd = htobe32(V_ULPTX_CMD(ULP_TX_MEM_WRITE) |
+	    V_T5_ULP_MEMIO_ORDER(1) |
+	    V_T5_ULP_MEMIO_FID(sc->sge.ofld_rxq[0].iq.abs_id));
+	if (chip_id(sc) >= CHELSIO_T7)
+		ulpmc->dlen = htobe32(V_T7_ULP_MEMIO_DATA_LEN(len >> 5));
+	else
+		ulpmc->dlen = htobe32(V_ULP_MEMIO_DATA_LEN(len >> 5));
+	ulpmc->len16 = htobe32((tid << 8) |
+	    DIV_ROUND_UP(wr_len - sizeof(ulpmc->wr), 16));
+	ulpmc->lock_addr = htobe32(V_ULP_MEMIO_ADDR(addr));
+
+	sgl = (struct ulptx_sgl *)(ulpmc + 1);
+	sgl->cmd_nsge = htobe32(V_ULPTX_CMD(ULP_TX_SC_DSGL) | V_ULPTX_NSGE(1));
+	sgl->len0 = htobe32(len);
+	sgl->addr0 = htobe64(data);
+}
+
+void
+t4_write_mem_inline_wr(struct adapter *sc, void *wr, int wr_len, int tid,
+    uint32_t addr, uint32_t len, void *data, uint64_t cookie)
+{
+	struct ulp_mem_io *ulpmc;
+	struct ulptx_idata *ulpsc;
+
+	MPASS(len > 0 && len <= T4_MAX_INLINE_SIZE);
+	MPASS(wr_len == T4_WRITE_MEM_INLINE_LEN(len));
+
+	addr &= 0x7FFFFFF;
+
+	memset(wr, 0, wr_len);
+	ulpmc = wr;
+	INIT_ULPTX_WR(ulpmc, wr_len, 0, tid);
+
+	if (cookie != 0) {
+		ulpmc->wr.wr_hi |= htobe32(F_FW_WR_COMPL);
+		ulpmc->wr.wr_lo = cookie;
+	}
+
+	ulpmc->cmd = htobe32(V_ULPTX_CMD(ULP_TX_MEM_WRITE) |
+	    F_T5_ULP_MEMIO_IMM);
+
+	if (chip_id(sc) >= CHELSIO_T7)
+		ulpmc->dlen = htobe32(V_T7_ULP_MEMIO_DATA_LEN(
+		    DIV_ROUND_UP(len, T4_ULPTX_MIN_IO)));
+	else
+		ulpmc->dlen = htobe32(V_ULP_MEMIO_DATA_LEN(
+		    DIV_ROUND_UP(len, T4_ULPTX_MIN_IO)));
+	ulpmc->len16 = htobe32((tid << 8) |
+	    DIV_ROUND_UP(wr_len - sizeof(ulpmc->wr), 16));
+	ulpmc->lock_addr = htobe32(V_ULP_MEMIO_ADDR(addr));
+
+	ulpsc = (struct ulptx_idata *)(ulpmc + 1);
+	ulpsc->cmd_more = htobe32(V_ULPTX_CMD(ULP_TX_SC_IMM));
+	ulpsc->len = htobe32(roundup(len, T4_ULPTX_MIN_IO));
+
+	if (data != NULL)
+		memcpy(ulpsc + 1, data, len);
+}
diff --git a/sys/dev/cxgbe/tom/t4_tom.h b/sys/dev/cxgbe/tom/t4_tom.h
index b8aba17c07bb..c8c2d432b8f1 100644
--- a/sys/dev/cxgbe/tom/t4_tom.h
+++ b/sys/dev/cxgbe/tom/t4_tom.h
@@ -586,4 +586,10 @@ int tls_tx_key(struct toepcb *);
 void tls_uninit_toep(struct toepcb *);
 int tls_alloc_ktls(struct toepcb *, struct ktls_session *, int);
 
+/* t4_tpt.c */
+uint32_t t4_pblpool_alloc(struct adapter *, int);
+void t4_pblpool_free(struct adapter *, uint32_t, int);
+int t4_pblpool_create(struct adapter *);
+void t4_pblpool_destroy(struct adapter *);
+
 #endif
diff --git a/sys/modules/cxgbe/if_cxgbe/Makefile b/sys/modules/cxgbe/if_cxgbe/Makefile
index 981c3466c452..c8ba9df381a4 100644
--- a/sys/modules/cxgbe/if_cxgbe/Makefile
+++ b/sys/modules/cxgbe/if_cxgbe/Makefile
@@ -31,6 +31,7 @@ SRCS+= t4_netmap.c
 SRCS+=	t4_sched.c
 SRCS+=	t4_sge.c
 SRCS+=	t4_smt.c
+SRCS+=	t4_tpt.c
 SRCS+=	t4_tracer.c
 SRCS+=	cudbg_common.c
 SRCS+=	cudbg_flash_utils.c
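
Reader's note, not part of the commit: the inline-write path in the diff above splits a buffer into pieces of at most T4_MAX_INLINE_SIZE bytes, sizes each work request with T4_WRITE_MEM_INLINE_LEN, and advances the destination address in 32-byte units. The following is a minimal userspace sketch of that arithmetic only. The names `plan_inline_writes`, `inline_wr_len`, `round_up`, `struct chunk` and the two `HYP_*` header sizes are hypothetical stand-ins and are not taken from the driver; the real header sizes are `sizeof(struct ulp_mem_io)` and `sizeof(struct ulptx_idata)`, which this diff does not show. Unlike the driver loop, the sketch subtracts the bytes actually consumed rather than a fixed 96.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the diff above. */
#define ULPTX_MIN_IO	32	/* T4_ULPTX_MIN_IO */
#define MAX_INLINE_SIZE	96	/* T4_MAX_INLINE_SIZE */

/*
 * ASSUMPTION: placeholder header sizes chosen for illustration.  The
 * driver uses sizeof(struct ulp_mem_io) and sizeof(struct ulptx_idata).
 */
#define HYP_ULP_MEM_IO_SIZE	32
#define HYP_ULPTX_IDATA_SIZE	8

/* Round x up to the next multiple of y (y > 0). */
size_t
round_up(size_t x, size_t y)
{
	return ((x + y - 1) / y * y);
}

/* Same shape as T4_WRITE_MEM_INLINE_LEN(len), with the assumed sizes. */
size_t
inline_wr_len(size_t len)
{
	assert(len > 0 && len <= MAX_INLINE_SIZE);
	return (round_up(HYP_ULP_MEM_IO_SIZE + HYP_ULPTX_IDATA_SIZE +
	    round_up(len, ULPTX_MIN_IO), 16));
}

/* One planned work request (hypothetical bookkeeping structure). */
struct chunk {
	uint32_t addr;		/* destination, in 32-byte units */
	uint32_t len;		/* payload bytes carried by this request */
	size_t wr_len;		/* total work request length in bytes */
	int last;		/* nonzero if a completion would be requested */
};

/*
 * Fill 'out' with up to 'max' planned requests covering 'len' bytes
 * starting at 'addr'.  Returns the number of entries written.
 */
int
plan_inline_writes(uint32_t addr, uint32_t len, struct chunk *out, int max)
{
	int n = 0;

	while (len > 0 && n < max) {
		uint32_t copy_len = len < MAX_INLINE_SIZE ? len :
		    MAX_INLINE_SIZE;

		out[n].addr = addr;
		out[n].len = copy_len;
		out[n].wr_len = inline_wr_len(copy_len);
		out[n].last = (len == copy_len);
		n++;

		addr += MAX_INLINE_SIZE >> 5;	/* 96 bytes == 3 units */
		len -= copy_len;
	}
	return (n);
}
```

Under these assumed sizes, a 200-byte write starting at unit address 0x100 would be planned as three requests of 96, 96 and 8 payload bytes, with only the final one flagged for completion, which corresponds to the driver passing a non-zero cookie only when `i == (num_wqe - 1)`.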