From owner-svn-src-stable-11@freebsd.org Thu Aug 2 08:49:37 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1CF3C105F530; Thu, 2 Aug 2018 08:49:37 +0000 (UTC) (envelope-from hselasky@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C66A27DC3A; Thu, 2 Aug 2018 08:49:36 +0000 (UTC) (envelope-from hselasky@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id A789C75DC; Thu, 2 Aug 2018 08:49:36 +0000 (UTC) (envelope-from hselasky@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w728naXM056812; Thu, 2 Aug 2018 08:49:36 GMT (envelope-from hselasky@FreeBSD.org) Received: (from hselasky@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w728naAe056810; Thu, 2 Aug 2018 08:49:36 GMT (envelope-from hselasky@FreeBSD.org) Message-Id: <201808020849.w728naAe056810@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: hselasky set sender to hselasky@FreeBSD.org using -f From: Hans Petter Selasky Date: Thu, 2 Aug 2018 08:49:36 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r337110 - stable/11/sys/dev/mlx5/mlx5_en X-SVN-Group: stable-11 X-SVN-Commit-Author: hselasky X-SVN-Commit-Paths: stable/11/sys/dev/mlx5/mlx5_en X-SVN-Commit-Revision: 337110 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Aug 2018 08:49:37 -0000 Author: hselasky Date: Thu Aug 2 08:49:35 2018 New Revision: 337110 URL: https://svnweb.freebsd.org/changeset/base/337110 Log: MFC r336407: Handle jumbo frames without requiring big clusters in mlx5en(4). The scatter list is formed by the chunks of MCLBYTES each, and larger than default packets are returned to the stack as the mbuf chain. Submitted by: kib@ Sponsored by: Mellanox Technologies Modified: stable/11/sys/dev/mlx5/mlx5_en/en.h stable/11/sys/dev/mlx5/mlx5_en/mlx5_en_main.c stable/11/sys/dev/mlx5/mlx5_en/mlx5_en_rx.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/dev/mlx5/mlx5_en/en.h ============================================================================== --- stable/11/sys/dev/mlx5/mlx5_en/en.h Thu Aug 2 08:48:27 2018 (r337109) +++ stable/11/sys/dev/mlx5/mlx5_en/en.h Thu Aug 2 08:49:35 2018 (r337110) @@ -80,8 +80,19 @@ #define MLX5E_PARAMS_DEFAULT_LOG_RQ_SIZE 0xa #define MLX5E_PARAMS_MAXIMUM_LOG_RQ_SIZE 0xe -/* freeBSD HW LRO is limited by 16KB - the size of max mbuf */ +#define MLX5E_MAX_RX_SEGS 7 + +#ifndef MLX5E_MAX_RX_BYTES +#define MLX5E_MAX_RX_BYTES MCLBYTES +#endif + +#if (MLX5E_MAX_RX_SEGS == 1) +/* FreeBSD HW LRO is limited by 16KB - the size of max mbuf */ #define MLX5E_PARAMS_DEFAULT_LRO_WQE_SZ MJUM16BYTES +#else +#define MLX5E_PARAMS_DEFAULT_LRO_WQE_SZ \ + MIN(65535, MLX5E_MAX_RX_SEGS * MLX5E_MAX_RX_BYTES) +#endif #define MLX5E_PARAMS_DEFAULT_RX_CQ_MODERATION_USEC 0x10 #define MLX5E_PARAMS_DEFAULT_RX_CQ_MODERATION_USEC_FROM_CQE 0x3 #define MLX5E_PARAMS_DEFAULT_RX_CQ_MODERATION_PKTS 0x20 @@ -532,6 +543,7 @@ struct mlx5e_rq { struct mtx mtx; bus_dma_tag_t dma_tag; u32 wqe_sz; + u32 nsegs; struct mlx5e_rq_mbuf *mbuf; struct ifnet *ifp; struct mlx5e_rq_stats stats; @@ -782,8 +794,11 @@ struct mlx5e_tx_wqe { struct mlx5e_rx_wqe { struct mlx5_wqe_srq_next_seg next; - struct mlx5_wqe_data_seg data; + struct mlx5_wqe_data_seg data[]; }; + +/* the size of the structure above must be power of two */ +CTASSERT(powerof2(sizeof(struct mlx5e_rx_wqe))); struct mlx5e_eeprom { int lock_bit; Modified: stable/11/sys/dev/mlx5/mlx5_en/mlx5_en_main.c ============================================================================== --- stable/11/sys/dev/mlx5/mlx5_en/mlx5_en_main.c Thu Aug 2 08:48:27 2018 (r337109) +++ stable/11/sys/dev/mlx5/mlx5_en/mlx5_en_main.c Thu Aug 2 08:49:35 2018 (r337110) @@ -33,9 +33,12 @@ #ifndef ETH_DRIVER_VERSION #define ETH_DRIVER_VERSION "3.4.1" #endif + char mlx5e_version[] = "Mellanox Ethernet driver" " (" ETH_DRIVER_VERSION ")"; +static int mlx5e_get_wqe_sz(struct mlx5e_priv *priv, u32 *wqe_sz, u32 *nsegs); + struct mlx5e_channel_param { struct mlx5e_rq_param rq; struct mlx5e_sq_param sq; @@ -705,7 +708,12 @@ mlx5e_create_rq(struct mlx5e_channel *c, int wq_sz; int err; int i; + u32 nsegs, wqe_sz; + err = mlx5e_get_wqe_sz(priv, &wqe_sz, &nsegs); + if (err != 0) + goto done; + /* Create DMA descriptor TAG */ if ((err = -bus_dma_tag_create( bus_get_dma_tag(mdev->pdev->dev.bsddev), @@ -714,9 +722,9 @@ mlx5e_create_rq(struct mlx5e_channel *c, BUS_SPACE_MAXADDR, /* lowaddr */ BUS_SPACE_MAXADDR, /* highaddr */ NULL, NULL, /* filter, filterarg */ - MJUM16BYTES, /* maxsize */ - 1, /* nsegments */ - MJUM16BYTES, /* maxsegsize */ + nsegs * MLX5E_MAX_RX_BYTES, /* maxsize */ + nsegs, /* nsegments */ + nsegs * MLX5E_MAX_RX_BYTES, /* maxsegsize */ 0, /* flags */ NULL, NULL, /* lockfunc, lockfuncarg */ &rq->dma_tag))) @@ -729,23 +737,9 @@ mlx5e_create_rq(struct mlx5e_channel *c, rq->wq.db = &rq->wq.db[MLX5_RCV_DBR]; - if (priv->params.hw_lro_en) { - rq->wqe_sz = priv->params.lro_wqe_sz; - } else { - rq->wqe_sz = MLX5E_SW2MB_MTU(priv->ifp->if_mtu); - } - if (rq->wqe_sz > MJUM16BYTES) { - err = -ENOMEM; + err = mlx5e_get_wqe_sz(priv, &rq->wqe_sz, &rq->nsegs); + if (err != 0) goto err_rq_wq_destroy; - } else if (rq->wqe_sz > MJUM9BYTES) { - rq->wqe_sz = MJUM16BYTES; - } else if (rq->wqe_sz > MJUMPAGESIZE) { - rq->wqe_sz = MJUM9BYTES; - } else if (rq->wqe_sz > MCLBYTES) { - rq->wqe_sz = MJUMPAGESIZE; - } else { - rq->wqe_sz = MCLBYTES; - } wq_sz = mlx5_wq_ll_get_size(&rq->wq); @@ -756,7 +750,11 @@ mlx5e_create_rq(struct mlx5e_channel *c, rq->mbuf = malloc(wq_sz * sizeof(rq->mbuf[0]), M_MLX5EN, M_WAITOK | M_ZERO); for (i = 0; i != wq_sz; i++) { struct mlx5e_rx_wqe *wqe = mlx5_wq_ll_get_wqe(&rq->wq, i); +#if (MLX5E_MAX_RX_SEGS == 1) uint32_t byte_count = rq->wqe_sz - MLX5E_NET_IP_ALIGN; +#else + int j; +#endif err = -bus_dmamap_create(rq->dma_tag, 0, &rq->mbuf[i].dma_map); if (err != 0) { @@ -764,8 +762,15 @@ mlx5e_create_rq(struct mlx5e_channel *c, bus_dmamap_destroy(rq->dma_tag, rq->mbuf[i].dma_map); goto err_rq_mbuf_free; } - wqe->data.lkey = c->mkey_be; - wqe->data.byte_count = cpu_to_be32(byte_count | MLX5_HW_START_PADDING); + + /* set value for constant fields */ +#if (MLX5E_MAX_RX_SEGS == 1) + wqe->data[0].lkey = c->mkey_be; + wqe->data[0].byte_count = cpu_to_be32(byte_count | MLX5_HW_START_PADDING); +#else + for (j = 0; j < rq->nsegs; j++) + wqe->data[j].lkey = c->mkey_be; +#endif } rq->ifp = c->ifp; @@ -1704,16 +1709,51 @@ mlx5e_close_channel_wait(struct mlx5e_channel *volatil free(c, M_MLX5EN); } +static int +mlx5e_get_wqe_sz(struct mlx5e_priv *priv, u32 *wqe_sz, u32 *nsegs) +{ + u32 r, n; + + r = priv->params.hw_lro_en ? priv->params.lro_wqe_sz : + MLX5E_SW2MB_MTU(priv->ifp->if_mtu); + if (r > MJUM16BYTES) + return (-ENOMEM); + + if (r > MJUM9BYTES) + r = MJUM16BYTES; + else if (r > MJUMPAGESIZE) + r = MJUM9BYTES; + else if (r > MCLBYTES) + r = MJUMPAGESIZE; + else + r = MCLBYTES; + + /* + * n + 1 must be a power of two, because stride size must be. + * Stride size is 16 * (n + 1), as the first segment is + * control. + */ + for (n = howmany(r, MLX5E_MAX_RX_BYTES); !powerof2(n + 1); n++) + ; + + *wqe_sz = r; + *nsegs = n; + return (0); +} + static void mlx5e_build_rq_param(struct mlx5e_priv *priv, struct mlx5e_rq_param *param) { void *rqc = param->rqc; void *wq = MLX5_ADDR_OF(rqc, rqc, wq); + u32 wqe_sz, nsegs; + mlx5e_get_wqe_sz(priv, &wqe_sz, &nsegs); MLX5_SET(wq, wq, wq_type, MLX5_WQ_TYPE_LINKED_LIST); MLX5_SET(wq, wq, end_padding_mode, MLX5_WQ_END_PAD_MODE_ALIGN); - MLX5_SET(wq, wq, log_wq_stride, ilog2(sizeof(struct mlx5e_rx_wqe))); + MLX5_SET(wq, wq, log_wq_stride, ilog2(sizeof(struct mlx5e_rx_wqe) + + nsegs * sizeof(struct mlx5_wqe_data_seg))); MLX5_SET(wq, wq, log_wq_sz, priv->params.log_rq_size); MLX5_SET(wq, wq, pd, priv->pdn); Modified: stable/11/sys/dev/mlx5/mlx5_en/mlx5_en_rx.c ============================================================================== --- stable/11/sys/dev/mlx5/mlx5_en/mlx5_en_rx.c Thu Aug 2 08:48:27 2018 (r337109) +++ stable/11/sys/dev/mlx5/mlx5_en/mlx5_en_rx.c Thu Aug 2 08:49:35 2018 (r337110) @@ -32,21 +32,47 @@ static inline int mlx5e_alloc_rx_wqe(struct mlx5e_rq *rq, struct mlx5e_rx_wqe *wqe, u16 ix) { - bus_dma_segment_t segs[1]; + bus_dma_segment_t segs[rq->nsegs]; struct mbuf *mb; int nsegs; int err; - +#if (MLX5E_MAX_RX_SEGS != 1) + struct mbuf *mb_head; + int i; +#endif if (rq->mbuf[ix].mbuf != NULL) return (0); +#if (MLX5E_MAX_RX_SEGS == 1) mb = m_getjcl(M_NOWAIT, MT_DATA, M_PKTHDR, rq->wqe_sz); if (unlikely(!mb)) return (-ENOMEM); - /* set initial mbuf length */ mb->m_pkthdr.len = mb->m_len = rq->wqe_sz; +#else + mb_head = mb = m_getjcl(M_NOWAIT, MT_DATA, M_PKTHDR, + MLX5E_MAX_RX_BYTES); + if (unlikely(mb == NULL)) + return (-ENOMEM); + mb->m_len = MLX5E_MAX_RX_BYTES; + mb->m_pkthdr.len = MLX5E_MAX_RX_BYTES; + + for (i = 1; i < rq->nsegs; i++) { + if (mb_head->m_pkthdr.len >= rq->wqe_sz) + break; + mb = mb->m_next = m_getjcl(M_NOWAIT, MT_DATA, 0, + MLX5E_MAX_RX_BYTES); + if (unlikely(mb == NULL)) { + m_freem(mb_head); + return (-ENOMEM); + } + mb->m_len = MLX5E_MAX_RX_BYTES; + mb_head->m_pkthdr.len += MLX5E_MAX_RX_BYTES; + } + /* rewind to first mbuf in chain */ + mb = mb_head; +#endif /* get IP header aligned */ m_adj(mb, MLX5E_NET_IP_ALIGN); @@ -54,12 +80,26 @@ mlx5e_alloc_rx_wqe(struct mlx5e_rq *rq, mb, segs, &nsegs, BUS_DMA_NOWAIT); if (err != 0) goto err_free_mbuf; - if (unlikely(nsegs != 1)) { + if (unlikely(nsegs == 0)) { bus_dmamap_unload(rq->dma_tag, rq->mbuf[ix].dma_map); err = -ENOMEM; goto err_free_mbuf; } - wqe->data.addr = cpu_to_be64(segs[0].ds_addr); +#if (MLX5E_MAX_RX_SEGS == 1) + wqe->data[0].addr = cpu_to_be64(segs[0].ds_addr); +#else + wqe->data[0].addr = cpu_to_be64(segs[0].ds_addr); + wqe->data[0].byte_count = cpu_to_be32(segs[0].ds_len | + MLX5_HW_START_PADDING); + for (i = 1; i != nsegs; i++) { + wqe->data[i].addr = cpu_to_be64(segs[i].ds_addr); + wqe->data[i].byte_count = cpu_to_be32(segs[i].ds_len); + } + for (; i < rq->nsegs; i++) { + wqe->data[i].addr = 0; + wqe->data[i].byte_count = 0; + } +#endif rq->mbuf[ix].mbuf = mb; rq->mbuf[ix].data = mb->m_data; @@ -185,6 +225,9 @@ mlx5e_build_rx_mbuf(struct mlx5_cqe64 *cqe, u32 cqe_bcnt) { struct ifnet *ifp = rq->ifp; +#if (MLX5E_MAX_RX_SEGS != 1) + struct mbuf *mb_head; +#endif int lro_num_seg; /* HW LRO session aggregated packets counter */ lro_num_seg = be32_to_cpu(cqe->srqn) >> 24; @@ -194,7 +237,26 @@ mlx5e_build_rx_mbuf(struct mlx5_cqe64 *cqe, rq->stats.lro_bytes += cqe_bcnt; } +#if (MLX5E_MAX_RX_SEGS == 1) mb->m_pkthdr.len = mb->m_len = cqe_bcnt; +#else + mb->m_pkthdr.len = cqe_bcnt; + for (mb_head = mb; mb != NULL; mb = mb->m_next) { + if (mb->m_len > cqe_bcnt) + mb->m_len = cqe_bcnt; + cqe_bcnt -= mb->m_len; + if (likely(cqe_bcnt == 0)) { + if (likely(mb->m_next != NULL)) { + /* trim off empty mbufs */ + m_freem(mb->m_next); + mb->m_next = NULL; + } + break; + } + } + /* rewind to first mbuf in chain */ + mb = mb_head; +#endif /* check if a Toeplitz hash was computed */ if (cqe->rss_hash_type != 0) { mb->m_pkthdr.flowid = be32_to_cpu(cqe->rss_hash_result); @@ -357,6 +419,10 @@ mlx5e_poll_rx_cq(struct mlx5e_rq *rq, int budget) } if ((MHLEN - MLX5E_NET_IP_ALIGN) >= byte_cnt && (mb = m_gethdr(M_NOWAIT, MT_DATA)) != NULL) { +#if (MLX5E_MAX_RX_SEGS != 1) + /* set maximum mbuf length */ + mb->m_len = MHLEN - MLX5E_NET_IP_ALIGN; +#endif /* get IP header aligned */ mb->m_data += MLX5E_NET_IP_ALIGN;