From: Hans Petter Selasky
Date: Wed, 12 Dec 2018 13:16:39 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-12@freebsd.org
Subject: svn commit: r341985 - stable/12/sys/dev/mlx5/mlx5_en
Message-Id: <201812121316.wBCDGdOV011295@repo.freebsd.org>

Author: hselasky
Date: Wed Dec 12 13:16:39 2018
New Revision: 341985
URL: https://svnweb.freebsd.org/changeset/base/341985

Log:
  MFC r341586:
  mlx5en: Implement backpressure indication.

  The backpressure indication is implemented using an unlimited rate type of
  mbuf send tag. Once the upper layers, typically the socket layer, have
  obtained such a tag, they can query the destination driver queue for the
  current amount of space available in the send queue.

  A single mbuf send tag may be referenced multiple times, so a refcount has
  been added to the mlx5e_priv structure to track its usage. Because the send
  tag resides in the mlx5e_channel structure, there is no need to wait for
  refcounts to reach zero until the mlx5en(4) driver is detached. The channels
  structure is persistent during the lifetime of the mlx5en(4) driver it
  belongs to and can therefore be accessed without any need for
  synchronization.

  The mlx5e_snd_tag structure was extended to contain a type field, because
  two different tag types now end up in the driver and need to be
  distinguished.
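To illustrate the queue query described in the log, here is a minimal userland sketch in C of how the fill level of a send ring can be scaled to a fixed range. It is modelled on the mlx5e_sq_queue_level() helper added by this change but is not the driver code. The names queue_level() and QUEUE_LEVEL_MAX are hypothetical, and the value 65536 is an assumed stand-in for IF_SND_QUEUE_LEVEL_MAX, whose real definition lives in the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-in for the kernel's IF_SND_QUEUE_LEVEL_MAX. */
#define QUEUE_LEVEL_MAX 65536u

/*
 * Scale the number of outstanding ring entries to [0, QUEUE_LEVEL_MAX].
 * pc and cc are free-running 16-bit producer and consumer counters;
 * size_m1 is the ring size minus one (the ring size is a power of two).
 */
static uint32_t
queue_level(uint16_t pc, uint16_t cc, uint16_t size_m1)
{
	uint32_t used;

	/* size_m1 must be non-zero and of the form 2^n - 1 */
	assert(size_m1 != 0 && (size_m1 & (size_m1 + 1)) == 0);

	/* the 16-bit subtraction wraps, so counter overflow is harmless */
	used = (uint32_t)size_m1 & (uint32_t)(uint16_t)(pc - cc);

	return ((used * QUEUE_LEVEL_MAX) / size_m1);
}
```

With a 1024-entry ring (size_m1 = 1023), 512 outstanding entries give queue_level(512, 0, 1023) == 32800, which is roughly half of the assumed maximum. An empty ring reports 0 and a ring with 1023 outstanding entries reports QUEUE_LEVEL_MAX.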
  Sponsored by:	Mellanox Technologies

Modified:
  stable/12/sys/dev/mlx5/mlx5_en/en.h
  stable/12/sys/dev/mlx5/mlx5_en/en_rl.h
  stable/12/sys/dev/mlx5/mlx5_en/mlx5_en_main.c
  stable/12/sys/dev/mlx5/mlx5_en/mlx5_en_rl.c
  stable/12/sys/dev/mlx5/mlx5_en/mlx5_en_tx.c
Directory Properties:
  stable/12/   (props changed)

Modified: stable/12/sys/dev/mlx5/mlx5_en/en.h
==============================================================================
--- stable/12/sys/dev/mlx5/mlx5_en/en.h	Wed Dec 12 13:14:41 2018	(r341984)
+++ stable/12/sys/dev/mlx5/mlx5_en/en.h	Wed Dec 12 13:16:39 2018	(r341985)
@@ -580,6 +580,11 @@ enum {
 	MLX5E_SQ_FULL
 };
 
+struct mlx5e_snd_tag {
+	struct m_snd_tag m_snd_tag;	/* send tag */
+	u32	type;	/* tag type */
+};
+
 struct mlx5e_sq {
 	/* data path */
 	struct	mtx lock;
@@ -640,11 +645,27 @@ mlx5e_sq_has_room_for(struct mlx5e_sq *sq, u16 n)
 	return ((sq->wq.sz_m1 & (cc - pc)) >= n || cc == pc);
 }
 
+static inline u32
+mlx5e_sq_queue_level(struct mlx5e_sq *sq)
+{
+	u16 cc;
+	u16 pc;
+
+	if (sq == NULL)
+		return (0);
+
+	cc = sq->cc;
+	pc = sq->pc;
+
+	return (((sq->wq.sz_m1 & (pc - cc)) *
+	    IF_SND_QUEUE_LEVEL_MAX) / sq->wq.sz_m1);
+}
+
 struct mlx5e_channel {
 	/* data path */
 	struct mlx5e_rq rq;
+	struct mlx5e_snd_tag tag;
 	struct mlx5e_sq sq[MLX5E_MAX_TX_NUM_TC];
-	struct ifnet *ifp;
 	u32	mkey_be;
 	u8	num_tc;
 
@@ -770,6 +791,7 @@ struct mlx5e_priv {
 	u32	pdn;
 	u32	tdn;
 	struct mlx5_core_mr mr;
+	volatile unsigned int channel_refs;
 
 	u32	tisn[MLX5E_MAX_TX_NUM_TC];
 	u32	rqtn;
@@ -907,6 +929,24 @@ mlx5e_cq_arm(struct mlx5e_cq *cq, spinlock_t *dblock)
 
 	mcq = &cq->mcq;
 	mlx5_cq_arm(mcq, MLX5_CQ_DB_REQ_NOT, mcq->uar->map, dblock, cq->wq.cc);
+}
+
+static inline void
+mlx5e_ref_channel(struct mlx5e_priv *priv)
+{
+
+	KASSERT(priv->channel_refs < INT_MAX,
+	    ("Channel refs will overflow"));
+	atomic_fetchadd_int(&priv->channel_refs, 1);
+}
+
+static inline void
+mlx5e_unref_channel(struct mlx5e_priv *priv)
+{
+
+	KASSERT(priv->channel_refs > 0,
+	    ("Channel refs is not greater than zero"));
+	atomic_fetchadd_int(&priv->channel_refs, -1);
 }
 
 extern const struct ethtool_ops mlx5e_ethtool_ops;

Modified: stable/12/sys/dev/mlx5/mlx5_en/en_rl.h
==============================================================================
--- stable/12/sys/dev/mlx5/mlx5_en/en_rl.h	Wed Dec 12 13:14:41 2018	(r341984)
+++ stable/12/sys/dev/mlx5/mlx5_en/en_rl.h	Wed Dec 12 13:16:39 2018	(r341985)
@@ -129,7 +129,7 @@ struct mlx5e_rl_channel_param {
 };
 
 struct mlx5e_rl_channel {
-	struct m_snd_tag m_snd_tag;
+	struct mlx5e_snd_tag tag;
 	STAILQ_ENTRY(mlx5e_rl_channel) entry;
 	struct mlx5e_sq * volatile sq;
 	struct mlx5e_rl_worker *worker;

Modified: stable/12/sys/dev/mlx5/mlx5_en/mlx5_en_main.c
==============================================================================
--- stable/12/sys/dev/mlx5/mlx5_en/mlx5_en_main.c	Wed Dec 12 13:14:41 2018	(r341984)
+++ stable/12/sys/dev/mlx5/mlx5_en/mlx5_en_main.c	Wed Dec 12 13:16:39 2018	(r341985)
@@ -884,7 +884,7 @@ mlx5e_create_rq(struct mlx5e_channel *c,
 
 	wq_sz = mlx5_wq_ll_get_size(&rq->wq);
 
-	err = -tcp_lro_init_args(&rq->lro, c->ifp, TCP_LRO_ENTRIES, wq_sz);
+	err = -tcp_lro_init_args(&rq->lro, c->tag.m_snd_tag.ifp, TCP_LRO_ENTRIES, wq_sz);
 	if (err)
 		goto err_rq_wq_destroy;
 
@@ -914,7 +914,7 @@ mlx5e_create_rq(struct mlx5e_channel *c,
 #endif
 	}
 
-	rq->ifp = c->ifp;
+	rq->ifp = c->tag.m_snd_tag.ifp;
 	rq->channel = c;
 	rq->ix = c->ix;
 
@@ -1771,7 +1771,9 @@ mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
 
 	c->priv = priv;
 	c->ix = ix;
-	c->ifp = priv->ifp;
+	/* setup send tag */
+	c->tag.m_snd_tag.ifp = priv->ifp;
+	c->tag.type = IF_SND_TAG_TYPE_UNLIMITED;
 	c->mkey_be = cpu_to_be32(priv->mr.key);
 	c->num_tc = priv->num_tc;
 
@@ -2004,7 +2006,6 @@ mlx5e_open_channels(struct mlx5e_priv *priv)
 		if (err)
 			goto err_close_channels;
 	}
-
 	return (0);
 
 err_close_channels:
@@ -3518,6 +3519,141 @@ mlx5e_setup_pauseframes(struct mlx5e_priv *priv)
 	PRIV_UNLOCK(priv);
 }
 
+static int
+mlx5e_ul_snd_tag_alloc(struct ifnet *ifp,
+    union if_snd_tag_alloc_params *params,
+    struct m_snd_tag **ppmt)
+{
+	struct mlx5e_priv *priv;
+	struct mlx5e_channel *pch;
+
+	priv = ifp->if_softc;
+
+	if (unlikely(priv->gone || params->hdr.flowtype == M_HASHTYPE_NONE)) {
+		return (EOPNOTSUPP);
+	} else {
+		/* keep this code synced with mlx5e_select_queue() */
+		u32 ch = priv->params.num_channels;
+#ifdef RSS
+		u32 temp;
+
+		if (rss_hash2bucket(params->hdr.flowid,
+		    params->hdr.flowtype, &temp) == 0)
+			ch = temp % ch;
+		else
+#endif
+			ch = (params->hdr.flowid % 128) % ch;
+
+		/*
+		 * NOTE: The channels array is only freed at detach
+		 * and it safe to return a pointer to the send tag
+		 * inside the channels structure as long as we
+		 * reference the priv.
+		 */
+		pch = priv->channel + ch;
+
+		/* check if send queue is not running */
+		if (unlikely(pch->sq[0].running == 0))
+			return (ENXIO);
+		mlx5e_ref_channel(priv);
+		*ppmt = &pch->tag.m_snd_tag;
+		return (0);
+	}
+}
+
+static int
+mlx5e_ul_snd_tag_query(struct m_snd_tag *pmt, union if_snd_tag_query_params *params)
+{
+	struct mlx5e_channel *pch =
+	    container_of(pmt, struct mlx5e_channel, tag.m_snd_tag);
+
+	params->unlimited.max_rate = -1ULL;
+	params->unlimited.queue_level = mlx5e_sq_queue_level(&pch->sq[0]);
+	return (0);
+}
+
+static void
+mlx5e_ul_snd_tag_free(struct m_snd_tag *pmt)
+{
+	struct mlx5e_channel *pch =
+	    container_of(pmt, struct mlx5e_channel, tag.m_snd_tag);
+
+	mlx5e_unref_channel(pch->priv);
+}
+
+static int
+mlx5e_snd_tag_alloc(struct ifnet *ifp,
+    union if_snd_tag_alloc_params *params,
+    struct m_snd_tag **ppmt)
+{
+
+	switch (params->hdr.type) {
+#ifdef RATELIMIT
+	case IF_SND_TAG_TYPE_RATE_LIMIT:
+		return (mlx5e_rl_snd_tag_alloc(ifp, params, ppmt));
+#endif
+	case IF_SND_TAG_TYPE_UNLIMITED:
+		return (mlx5e_ul_snd_tag_alloc(ifp, params, ppmt));
+	default:
+		return (EOPNOTSUPP);
+	}
+}
+
+static int
+mlx5e_snd_tag_modify(struct m_snd_tag *pmt, union if_snd_tag_modify_params *params)
+{
+	struct mlx5e_snd_tag *tag =
+	    container_of(pmt, struct mlx5e_snd_tag, m_snd_tag);
+
+	switch (tag->type) {
+#ifdef RATELIMIT
+	case IF_SND_TAG_TYPE_RATE_LIMIT:
+		return (mlx5e_rl_snd_tag_modify(pmt, params));
+#endif
+	case IF_SND_TAG_TYPE_UNLIMITED:
+	default:
+		return (EOPNOTSUPP);
+	}
+}
+
+static int
+mlx5e_snd_tag_query(struct m_snd_tag *pmt, union if_snd_tag_query_params *params)
+{
+	struct mlx5e_snd_tag *tag =
+	    container_of(pmt, struct mlx5e_snd_tag, m_snd_tag);
+
+	switch (tag->type) {
+#ifdef RATELIMIT
+	case IF_SND_TAG_TYPE_RATE_LIMIT:
+		return (mlx5e_rl_snd_tag_query(pmt, params));
+#endif
+	case IF_SND_TAG_TYPE_UNLIMITED:
+		return (mlx5e_ul_snd_tag_query(pmt, params));
+	default:
+		return (EOPNOTSUPP);
+	}
+}
+
+static void
+mlx5e_snd_tag_free(struct m_snd_tag *pmt)
+{
+	struct mlx5e_snd_tag *tag =
+	    container_of(pmt, struct mlx5e_snd_tag, m_snd_tag);
+
+	switch (tag->type) {
+#ifdef RATELIMIT
+	case IF_SND_TAG_TYPE_RATE_LIMIT:
+		mlx5e_rl_snd_tag_free(pmt);
+		break;
+#endif
+	case IF_SND_TAG_TYPE_UNLIMITED:
+		mlx5e_ul_snd_tag_free(pmt);
+		break;
+	default:
+		break;
+	}
+}
+
 static void *
 mlx5e_create_ifp(struct mlx5_core_dev *mdev)
 {
@@ -3571,13 +3707,11 @@ mlx5e_create_ifp(struct mlx5_core_dev *mdev)
 	ifp->if_capabilities |= IFCAP_LRO;
 	ifp->if_capabilities |= IFCAP_TSO | IFCAP_VLAN_HWTSO;
 	ifp->if_capabilities |= IFCAP_HWSTATS | IFCAP_HWRXTSTMP;
-#ifdef RATELIMIT
 	ifp->if_capabilities |= IFCAP_TXRTLMT;
-	ifp->if_snd_tag_alloc = mlx5e_rl_snd_tag_alloc;
-	ifp->if_snd_tag_free = mlx5e_rl_snd_tag_free;
-	ifp->if_snd_tag_modify = mlx5e_rl_snd_tag_modify;
-	ifp->if_snd_tag_query = mlx5e_rl_snd_tag_query;
-#endif
+	ifp->if_snd_tag_alloc = mlx5e_snd_tag_alloc;
+	ifp->if_snd_tag_free = mlx5e_snd_tag_free;
+	ifp->if_snd_tag_modify = mlx5e_snd_tag_modify;
+	ifp->if_snd_tag_query = mlx5e_snd_tag_query;
 
 	/* set TSO limits so that we don't have to drop TX packets */
 	ifp->if_hw_tsomax = MLX5E_MAX_TX_PAYLOAD_SIZE - (ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN);
@@ -3831,6 +3965,13 @@ mlx5e_destroy_ifp(struct mlx5_core_dev *mdev, void *vp
 	PRIV_LOCK(priv);
 	mlx5e_close_locked(ifp);
 	PRIV_UNLOCK(priv);
+
+	/* wait for all unlimited send tags to go away */
+	while (priv->channel_refs != 0) {
+		if_printf(priv->ifp, "Waiting for all unlimited connections "
+		    "to terminate\n");
+		pause("W", hz);
+	}
 
 	/* unregister device */
 	ifmedia_removeall(&priv->media);

Modified: stable/12/sys/dev/mlx5/mlx5_en/mlx5_en_rl.c
==============================================================================
--- stable/12/sys/dev/mlx5/mlx5_en/mlx5_en_rl.c	Wed Dec 12 13:14:41 2018	(r341984)
+++ stable/12/sys/dev/mlx5/mlx5_en/mlx5_en_rl.c	Wed Dec 12 13:16:39 2018	(r341985)
@@ -841,7 +841,8 @@ mlx5e_rl_init(struct mlx5e_priv *priv)
 		for (i = 0; i < rl->param.tx_channels_per_worker_def; i++) {
 			struct mlx5e_rl_channel *channel = rlw->channels + i;
 			channel->worker = rlw;
-			channel->m_snd_tag.ifp = priv->ifp;
+			channel->tag.m_snd_tag.ifp = priv->ifp;
+			channel->tag.type = IF_SND_TAG_TYPE_RATE_LIMIT;
 			STAILQ_INSERT_TAIL(&rlw->index_list_head, channel, entry);
 		}
 		MLX5E_RL_WORKER_UNLOCK(rlw);
@@ -1038,17 +1039,21 @@ mlx5e_rl_modify(struct mlx5e_rl_worker *rlw, struct ml
 }
 
 static int
-mlx5e_rl_query(struct mlx5e_rl_worker *rlw, struct mlx5e_rl_channel *channel, uint64_t *prate)
+mlx5e_rl_query(struct mlx5e_rl_worker *rlw, struct mlx5e_rl_channel *channel,
+    union if_snd_tag_query_params *params)
 {
 	int retval;
 
 	MLX5E_RL_WORKER_LOCK(rlw);
 	switch (channel->state) {
 	case MLX5E_RL_ST_USED:
-		*prate = channel->last_rate;
+		params->rate_limit.max_rate = channel->last_rate;
+		params->rate_limit.queue_level = mlx5e_sq_queue_level(channel->sq);
 		retval = 0;
 		break;
 	case MLX5E_RL_ST_MODIFY:
+		params->rate_limit.max_rate = channel->last_rate;
+		params->rate_limit.queue_level = mlx5e_sq_queue_level(channel->sq);
 		retval = EBUSY;
 		break;
 	default:
@@ -1120,7 +1125,7 @@ mlx5e_rl_snd_tag_alloc(struct ifnet *ifp,
 	}
 
 	/* store pointer to mbuf tag */
-	*ppmt = &channel->m_snd_tag;
+	*ppmt = &channel->tag.m_snd_tag;
 done:
 	return (error);
 }
@@ -1130,7 +1135,7 @@ int
 mlx5e_rl_snd_tag_modify(struct m_snd_tag *pmt, union if_snd_tag_modify_params *params)
 {
 	struct mlx5e_rl_channel *channel =
-	    container_of(pmt, struct mlx5e_rl_channel, m_snd_tag);
+	    container_of(pmt, struct mlx5e_rl_channel, tag.m_snd_tag);
 
 	return (mlx5e_rl_modify(channel->worker, channel, params->rate_limit.max_rate));
 }
@@ -1139,16 +1144,16 @@ int
 mlx5e_rl_snd_tag_query(struct m_snd_tag *pmt, union if_snd_tag_query_params *params)
 {
 	struct mlx5e_rl_channel *channel =
-	    container_of(pmt, struct mlx5e_rl_channel, m_snd_tag);
+	    container_of(pmt, struct mlx5e_rl_channel, tag.m_snd_tag);
 
-	return (mlx5e_rl_query(channel->worker, channel, &params->rate_limit.max_rate));
+	return (mlx5e_rl_query(channel->worker, channel, params));
 }
 
 void
 mlx5e_rl_snd_tag_free(struct m_snd_tag *pmt)
 {
 	struct mlx5e_rl_channel *channel =
-	    container_of(pmt, struct mlx5e_rl_channel, m_snd_tag);
+	    container_of(pmt, struct mlx5e_rl_channel, tag.m_snd_tag);
 
 	mlx5e_rl_free(channel->worker, channel);
 }

Modified: stable/12/sys/dev/mlx5/mlx5_en/mlx5_en_tx.c
==============================================================================
--- stable/12/sys/dev/mlx5/mlx5_en/mlx5_en_tx.c	Wed Dec 12 13:14:41 2018	(r341984)
+++ stable/12/sys/dev/mlx5/mlx5_en/mlx5_en_tx.c	Wed Dec 12 13:16:39 2018	(r341985)
@@ -78,6 +78,47 @@ SYSINIT(mlx5e_hash_init, SI_SUB_RANDOM, SI_ORDER_ANY, 
 #endif
 
 static struct mlx5e_sq *
+mlx5e_select_queue_by_send_tag(struct ifnet *ifp, struct mbuf *mb)
+{
+	struct mlx5e_snd_tag *ptag;
+	struct mlx5e_sq *sq;
+
+	/* check for route change */
+	if (mb->m_pkthdr.snd_tag->ifp != ifp)
+		return (NULL);
+
+	/* get pointer to sendqueue */
+	ptag = container_of(mb->m_pkthdr.snd_tag,
+	    struct mlx5e_snd_tag, m_snd_tag);
+
+	switch (ptag->type) {
+#ifdef RATELIMIT
+	case IF_SND_TAG_TYPE_RATE_LIMIT:
+		sq = container_of(ptag,
+		    struct mlx5e_rl_channel, tag)->sq;
+		break;
+#endif
+	case IF_SND_TAG_TYPE_UNLIMITED:
+		sq = &container_of(ptag,
+		    struct mlx5e_channel, tag)->sq[0];
+		KASSERT(({
+		    struct mlx5e_priv *priv = ifp->if_softc;
+		    priv->channel_refs > 0; }),
+		    ("mlx5e_select_queue: Channel refs are zero for unlimited tag"));
+		break;
+	default:
+		sq = NULL;
+		break;
+	}
+
+	/* check if valid */
+	if (sq != NULL && READ_ONCE(sq->running) != 0)
+		return (sq);
+
+	return (NULL);
+}
+
+static struct mlx5e_sq *
 mlx5e_select_queue(struct ifnet *ifp, struct mbuf *mb)
 {
 	struct mlx5e_priv *priv = ifp->if_softc;
@@ -96,25 +137,6 @@ mlx5e_select_queue(struct ifnet *ifp, struct mbuf *mb)
 
 	ch = priv->params.num_channels;
 
-#ifdef RATELIMIT
-	if (mb->m_pkthdr.snd_tag != NULL) {
-		struct mlx5e_sq *sq;
-
-		/* check for route change */
-		if (mb->m_pkthdr.snd_tag->ifp != ifp)
-			return (NULL);
-
-		/* get pointer to sendqueue */
-		sq = container_of(mb->m_pkthdr.snd_tag,
-		    struct mlx5e_rl_channel, m_snd_tag)->sq;
-
-		/* check if valid */
-		if (sq != NULL && sq->running != 0)
-			return (sq);
-
-		/* FALLTHROUGH */
-	}
-#endif
 	/* check if flowid is set */
 	if (M_HASHTYPE_GET(mb) != M_HASHTYPE_NONE) {
 #ifdef RSS
@@ -587,27 +609,33 @@ mlx5e_xmit(struct ifnet *ifp, struct mbuf *mb)
 	struct mlx5e_sq *sq;
 	int ret;
 
-	sq = mlx5e_select_queue(ifp, mb);
-	if (unlikely(sq == NULL)) {
-#ifdef RATELIMIT
-		/* Check for route change */
-		if (mb->m_pkthdr.snd_tag != NULL &&
-		    mb->m_pkthdr.snd_tag->ifp != ifp) {
+	if (mb->m_pkthdr.snd_tag != NULL) {
+		sq = mlx5e_select_queue_by_send_tag(ifp, mb);
+		if (unlikely(sq == NULL)) {
+			/* Check for route change */
+			if (mb->m_pkthdr.snd_tag->ifp != ifp) {
+				/* Free mbuf */
+				m_freem(mb);
+
+				/*
+				 * Tell upper layers about route
+				 * change and to re-transmit this
+				 * packet:
+				 */
+				return (EAGAIN);
+			}
+			goto select_queue;
+		}
+	} else {
+select_queue:
+		sq = mlx5e_select_queue(ifp, mb);
+		if (unlikely(sq == NULL)) {
 			/* Free mbuf */
 			m_freem(mb);
 
-			/*
-			 * Tell upper layers about route change and to
-			 * re-transmit this packet:
-			 */
-			return (EAGAIN);
+			/* Invalid send queue */
+			return (ENXIO);
 		}
-#endif
-		/* Free mbuf */
-		m_freem(mb);
-
-		/* Invalid send queue */
-		return (ENXIO);
 	}
 
 	mtx_lock(&sq->lock);
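
The type field in mlx5e_snd_tag lets the driver recover the right enclosing structure from a generic tag pointer. The following userland sketch in C shows that pattern in isolation, assuming simplified stand-in types. Every name in it (base_tag, typed_tag, rl_channel, ul_channel, tag_query_rate and the TAG_* constants) is hypothetical and only mirrors the shape of the driver structures; container_of() is redefined locally because the kernel macro is not available in userland.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userland equivalent of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical tag types standing in for IF_SND_TAG_TYPE_*. */
enum tag_type { TAG_RATE_LIMIT, TAG_UNLIMITED };

/* Stands in for struct m_snd_tag. */
struct base_tag {
	int ifindex;
};

/* Stands in for struct mlx5e_snd_tag: generic tag plus a type. */
struct typed_tag {
	struct base_tag base;
	uint32_t type;
};

/* Two different owners embed the same typed tag. */
struct rl_channel {
	struct typed_tag tag;
	uint64_t last_rate;
};

struct ul_channel {
	int ix;
	struct typed_tag tag;
};

/* Dispatch on the tag type before casting back to the owner. */
static uint64_t
tag_query_rate(struct base_tag *pmt)
{
	struct typed_tag *tag = container_of(pmt, struct typed_tag, base);

	switch (tag->type) {
	case TAG_RATE_LIMIT:
		return (container_of(tag, struct rl_channel, tag)->last_rate);
	case TAG_UNLIMITED:
		/* same "no limit" value the diff uses: -1ULL */
		return (UINT64_MAX);
	default:
		assert(0 && "unknown tag type");
		return (0);
	}
}
```

The type check has to come first because the two owners place the tag at different offsets. Casting an unlimited tag back to the rate-limit owner would read unrelated memory, which is why the change routes all four if_snd_tag_* callbacks through a switch on the type.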