From owner-svn-src-stable@freebsd.org Fri Mar 30 18:36:46 2018 Return-Path: Delivered-To: svn-src-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 62EA4F555F7; Fri, 30 Mar 2018 18:36:46 +0000 (UTC) (envelope-from hselasky@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 1106875EF9; Fri, 30 Mar 2018 18:36:46 +0000 (UTC) (envelope-from hselasky@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 0A3354B2A; Fri, 30 Mar 2018 18:36:46 +0000 (UTC) (envelope-from hselasky@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w2UIakAN077548; Fri, 30 Mar 2018 18:36:46 GMT (envelope-from hselasky@FreeBSD.org) Received: (from hselasky@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w2UIait3077536; Fri, 30 Mar 2018 18:36:44 GMT (envelope-from hselasky@FreeBSD.org) Message-Id: <201803301836.w2UIait3077536@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: hselasky set sender to hselasky@FreeBSD.org using -f From: Hans Petter Selasky Date: Fri, 30 Mar 2018 18:36:44 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r331784 - in stable/11/sys: dev/cxgbe/iw_cxgbe dev/mlx4/mlx4_ib dev/mlx5/mlx5_ib dev/mthca ofed/drivers/infiniband/core ofed/include/rdma ofed/include/uapi/rdma X-SVN-Group: stable-11 X-SVN-Commit-Author: hselasky X-SVN-Commit-Paths: in stable/11/sys: dev/cxgbe/iw_cxgbe dev/mlx4/mlx4_ib dev/mlx5/mlx5_ib dev/mthca ofed/drivers/infiniband/core ofed/include/rdma ofed/include/uapi/rdma X-SVN-Commit-Revision: 331784 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: SVN commit messages for all the -stable branches of the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Mar 2018 18:36:46 -0000 Author: hselasky Date: Fri Mar 30 18:36:44 2018 New Revision: 331784 URL: https://svnweb.freebsd.org/changeset/base/331784 Log: MFC r330508: Optimize ibcore RoCE address handle creation from user-space. Creating a UD address handle from user-space or from the kernel-space, when the link layer is ethernet, requires resolving the remote L3 address into a L2 address. Doing this from the kernel is easy because the required ARP(IPv4) and ND6(IPv6) address resolving APIs are readily available. In userspace such an interface does not exist and kernel help is required. It should be noted that in an IP-based GID environment, the GID itself does not contain all the information needed to resolve the destination IP address. For example information like VLAN ID and SCOPE ID, is not part of the GID and must be fetched from the GID attributes. Therefore a source GID should always be referred to as a GID index. Instead of going through various racy steps to obtain information about the GID attributes from user-space, this is now all done by the kernel. This patch optimises the L3 to L2 address resolving using the existing create address handle uverbs interface, retrieving back the L2 address as an additional user-space information structure. This commit combines the following Linux upstream commits: IB/core: Let create_ah return extended response to user IB/core: Change ib_resolve_eth_dmac to use it in create AH IB/mlx5: Make create/destroy_ah available to userspace IB/mlx5: Use kernel driver to help userspace create ah IB/mlx5: Report that device has udata response in create_ah Sponsored by: Mellanox Technologies Modified: stable/11/sys/dev/cxgbe/iw_cxgbe/provider.c stable/11/sys/dev/mlx4/mlx4_ib/mlx4_ib.h stable/11/sys/dev/mlx4/mlx4_ib/mlx4_ib_ah.c stable/11/sys/dev/mlx5/mlx5_ib/mlx5_ib.h stable/11/sys/dev/mlx5/mlx5_ib/mlx5_ib_ah.c stable/11/sys/dev/mlx5/mlx5_ib/mlx5_ib_main.c stable/11/sys/dev/mthca/mthca_provider.c stable/11/sys/ofed/drivers/infiniband/core/core_priv.h stable/11/sys/ofed/drivers/infiniband/core/ib_uverbs_cmd.c stable/11/sys/ofed/drivers/infiniband/core/ib_verbs.c stable/11/sys/ofed/include/rdma/ib_verbs.h stable/11/sys/ofed/include/uapi/rdma/mlx5-abi.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/dev/cxgbe/iw_cxgbe/provider.c ============================================================================== --- stable/11/sys/dev/cxgbe/iw_cxgbe/provider.c Fri Mar 30 18:33:30 2018 (r331783) +++ stable/11/sys/dev/cxgbe/iw_cxgbe/provider.c Fri Mar 30 18:36:44 2018 (r331784) @@ -57,7 +57,8 @@ static int c4iw_modify_port(struct ib_device *ibdev, } static struct ib_ah *c4iw_ah_create(struct ib_pd *pd, - struct ib_ah_attr *ah_attr) + struct ib_ah_attr *ah_attr, + struct ib_udata *udata) { return ERR_PTR(-ENOSYS); } Modified: stable/11/sys/dev/mlx4/mlx4_ib/mlx4_ib.h ============================================================================== --- stable/11/sys/dev/mlx4/mlx4_ib/mlx4_ib.h Fri Mar 30 18:33:30 2018 (r331783) +++ stable/11/sys/dev/mlx4/mlx4_ib/mlx4_ib.h Fri Mar 30 18:36:44 2018 (r331784) @@ -752,7 +752,8 @@ int mlx4_ib_arm_cq(struct ib_cq *cq, enum ib_cq_notify void __mlx4_ib_cq_clean(struct mlx4_ib_cq *cq, u32 qpn, struct mlx4_ib_srq *srq); void mlx4_ib_cq_clean(struct mlx4_ib_cq *cq, u32 qpn, struct mlx4_ib_srq *srq); -struct ib_ah *mlx4_ib_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr); +struct ib_ah *mlx4_ib_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr, + struct ib_udata *udata); int mlx4_ib_query_ah(struct ib_ah *ibah, struct ib_ah_attr *ah_attr); int mlx4_ib_destroy_ah(struct ib_ah *ah); Modified: stable/11/sys/dev/mlx4/mlx4_ib/mlx4_ib_ah.c ============================================================================== --- stable/11/sys/dev/mlx4/mlx4_ib/mlx4_ib_ah.c Fri Mar 30 18:33:30 2018 (r331783) +++ stable/11/sys/dev/mlx4/mlx4_ib/mlx4_ib_ah.c Fri Mar 30 18:36:44 2018 (r331784) @@ -129,7 +129,9 @@ static struct ib_ah *create_iboe_ah(struct ib_pd *pd, return &ah->ibah; } -struct ib_ah *mlx4_ib_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr) +struct ib_ah *mlx4_ib_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr, + struct ib_udata *udata) + { struct mlx4_ib_ah *ah; struct ib_ah *ret; Modified: stable/11/sys/dev/mlx5/mlx5_ib/mlx5_ib.h ============================================================================== --- stable/11/sys/dev/mlx5/mlx5_ib/mlx5_ib.h Fri Mar 30 18:33:30 2018 (r331783) +++ stable/11/sys/dev/mlx5/mlx5_ib/mlx5_ib.h Fri Mar 30 18:36:44 2018 (r331784) @@ -734,7 +734,8 @@ void mlx5_ib_free_srq_wqe(struct mlx5_ib_srq *srq, int int mlx5_MAD_IFC(struct mlx5_ib_dev *dev, int ignore_mkey, int ignore_bkey, u8 port, const struct ib_wc *in_wc, const struct ib_grh *in_grh, const void *in_mad, void *response_mad); -struct ib_ah *mlx5_ib_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr); +struct ib_ah *mlx5_ib_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr, + struct ib_udata *udata); int mlx5_ib_query_ah(struct ib_ah *ibah, struct ib_ah_attr *ah_attr); int mlx5_ib_destroy_ah(struct ib_ah *ah); struct ib_srq *mlx5_ib_create_srq(struct ib_pd *pd, Modified: stable/11/sys/dev/mlx5/mlx5_ib/mlx5_ib_ah.c ============================================================================== --- stable/11/sys/dev/mlx5/mlx5_ib/mlx5_ib_ah.c Fri Mar 30 18:33:30 2018 (r331783) +++ stable/11/sys/dev/mlx5/mlx5_ib/mlx5_ib_ah.c Fri Mar 30 18:36:44 2018 (r331784) @@ -59,7 +59,9 @@ static struct ib_ah *create_ib_ah(struct mlx5_ib_dev * return &ah->ibah; } -struct ib_ah *mlx5_ib_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr) +struct ib_ah *mlx5_ib_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr, + struct ib_udata *udata) + { struct mlx5_ib_ah *ah; struct mlx5_ib_dev *dev = to_mdev(pd->device); @@ -69,6 +71,27 @@ struct ib_ah *mlx5_ib_create_ah(struct ib_pd *pd, stru if (ll == IB_LINK_LAYER_ETHERNET && !(ah_attr->ah_flags & IB_AH_GRH)) return ERR_PTR(-EINVAL); + + if (ll == IB_LINK_LAYER_ETHERNET && udata) { + int err; + struct mlx5_ib_create_ah_resp resp = {}; + u32 min_resp_len = offsetof(typeof(resp), dmac) + + sizeof(resp.dmac); + + if (udata->outlen < min_resp_len) + return ERR_PTR(-EINVAL); + + resp.response_length = min_resp_len; + + err = ib_resolve_eth_dmac(pd->device, ah_attr); + if (err) + return ERR_PTR(err); + + memcpy(resp.dmac, ah_attr->dmac, ETH_ALEN); + err = ib_copy_to_udata(udata, &resp, resp.response_length); + if (err) + return ERR_PTR(err); + } ah = kzalloc(sizeof(*ah), GFP_ATOMIC); if (!ah) Modified: stable/11/sys/dev/mlx5/mlx5_ib/mlx5_ib_main.c ============================================================================== --- stable/11/sys/dev/mlx5/mlx5_ib/mlx5_ib_main.c Fri Mar 30 18:33:30 2018 (r331783) +++ stable/11/sys/dev/mlx5/mlx5_ib/mlx5_ib_main.c Fri Mar 30 18:36:44 2018 (r331784) @@ -1081,7 +1081,8 @@ static struct ib_ucontext *mlx5_ib_alloc_ucontext(stru resp.response_length += sizeof(resp.cqe_version); if (field_avail(typeof(resp), cmds_supp_uhw, udata->outlen)) { - resp.cmds_supp_uhw |= MLX5_USER_CMDS_SUPP_UHW_QUERY_DEVICE; + resp.cmds_supp_uhw |= MLX5_USER_CMDS_SUPP_UHW_QUERY_DEVICE | + MLX5_USER_CMDS_SUPP_UHW_CREATE_AH; resp.response_length += sizeof(resp.cmds_supp_uhw); } @@ -2975,6 +2976,8 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev) (1ull << IB_USER_VERBS_CMD_QUERY_PORT) | (1ull << IB_USER_VERBS_CMD_ALLOC_PD) | (1ull << IB_USER_VERBS_CMD_DEALLOC_PD) | + (1ull << IB_USER_VERBS_CMD_CREATE_AH) | + (1ull << IB_USER_VERBS_CMD_DESTROY_AH) | (1ull << IB_USER_VERBS_CMD_REG_MR) | (1ull << IB_USER_VERBS_CMD_REREG_MR) | (1ull << IB_USER_VERBS_CMD_DEREG_MR) | Modified: stable/11/sys/dev/mthca/mthca_provider.c ============================================================================== --- stable/11/sys/dev/mthca/mthca_provider.c Fri Mar 30 18:33:30 2018 (r331783) +++ stable/11/sys/dev/mthca/mthca_provider.c Fri Mar 30 18:36:44 2018 (r331784) @@ -409,7 +409,8 @@ static int mthca_dealloc_pd(struct ib_pd *pd) } static struct ib_ah *mthca_ah_create(struct ib_pd *pd, - struct ib_ah_attr *ah_attr) + struct ib_ah_attr *ah_attr, + struct ib_udata *udata) { int err; struct mthca_ah *ah; Modified: stable/11/sys/ofed/drivers/infiniband/core/core_priv.h ============================================================================== --- stable/11/sys/ofed/drivers/infiniband/core/core_priv.h Fri Mar 30 18:33:30 2018 (r331783) +++ stable/11/sys/ofed/drivers/infiniband/core/core_priv.h Fri Mar 30 18:36:44 2018 (r331784) @@ -76,9 +76,6 @@ void ib_device_unregister_sysfs(struct ib_device *devi void ib_cache_setup(void); void ib_cache_cleanup(void); -int ib_resolve_eth_dmac(struct ib_qp *qp, - struct ib_qp_attr *qp_attr, int *qp_attr_mask); - typedef void (*roce_netdev_callback)(struct ib_device *device, u8 port, struct net_device *idev, void *cookie); Modified: stable/11/sys/ofed/drivers/infiniband/core/ib_uverbs_cmd.c ============================================================================== --- stable/11/sys/ofed/drivers/infiniband/core/ib_uverbs_cmd.c Fri Mar 30 18:33:30 2018 (r331783) +++ stable/11/sys/ofed/drivers/infiniband/core/ib_uverbs_cmd.c Fri Mar 30 18:36:44 2018 (r331784) @@ -2409,9 +2409,11 @@ ssize_t ib_uverbs_modify_qp(struct ib_uverbs_file *fil attr->alt_ah_attr.port_num = cmd.alt_dest.port_num; if (qp->real_qp == qp) { - ret = ib_resolve_eth_dmac(qp, attr, &cmd.attr_mask); - if (ret) - goto release_qp; + if (cmd.attr_mask & IB_QP_AV) { + ret = ib_resolve_eth_dmac(qp->device, &attr->ah_attr); + if (ret) + goto release_qp; + } ret = qp->device->modify_qp(qp, attr, modify_qp_mask(qp->qp_type, cmd.attr_mask), &udata); } else { @@ -2882,6 +2884,7 @@ ssize_t ib_uverbs_create_ah(struct ib_uverbs_file *fil struct ib_ah *ah; struct ib_ah_attr attr; int ret; + struct ib_udata udata; if (out_len < sizeof resp) return -ENOSPC; @@ -2889,6 +2892,10 @@ ssize_t ib_uverbs_create_ah(struct ib_uverbs_file *fil if (copy_from_user(&cmd, buf, sizeof cmd)) return -EFAULT; + INIT_UDATA(&udata, buf + sizeof(cmd), + (unsigned long)cmd.response + sizeof(resp), + in_len - sizeof(cmd), out_len - sizeof(resp)); + uobj = kmalloc(sizeof *uobj, GFP_KERNEL); if (!uobj) return -ENOMEM; @@ -2915,12 +2922,16 @@ ssize_t ib_uverbs_create_ah(struct ib_uverbs_file *fil memset(&attr.dmac, 0, sizeof(attr.dmac)); memcpy(attr.grh.dgid.raw, cmd.attr.grh.dgid, 16); - ah = ib_create_ah(pd, &attr); + ah = pd->device->create_ah(pd, &attr, &udata); + if (IS_ERR(ah)) { ret = PTR_ERR(ah); goto err_put; } + ah->device = pd->device; + ah->pd = pd; + atomic_inc(&pd->usecnt); ah->uobject = uobj; uobj->object = ah; Modified: stable/11/sys/ofed/drivers/infiniband/core/ib_verbs.c ============================================================================== --- stable/11/sys/ofed/drivers/infiniband/core/ib_verbs.c Fri Mar 30 18:33:30 2018 (r331783) +++ stable/11/sys/ofed/drivers/infiniband/core/ib_verbs.c Fri Mar 30 18:36:44 2018 (r331784) @@ -321,7 +321,7 @@ struct ib_ah *ib_create_ah(struct ib_pd *pd, struct ib { struct ib_ah *ah; - ah = pd->device->create_ah(pd, ah_attr); + ah = pd->device->create_ah(pd, ah_attr, NULL); if (!IS_ERR(ah)) { ah->device = pd->device; @@ -1161,47 +1161,45 @@ int ib_modify_qp_is_ok(enum ib_qp_state cur_state, enu } EXPORT_SYMBOL(ib_modify_qp_is_ok); -int ib_resolve_eth_dmac(struct ib_qp *qp, - struct ib_qp_attr *qp_attr, int *qp_attr_mask) +int ib_resolve_eth_dmac(struct ib_device *device, + struct ib_ah_attr *ah_attr) { int ret = 0; - if (*qp_attr_mask & IB_QP_AV) { - if (qp_attr->ah_attr.port_num < rdma_start_port(qp->device) || - qp_attr->ah_attr.port_num > rdma_end_port(qp->device)) - return -EINVAL; + if (ah_attr->port_num < rdma_start_port(device) || + ah_attr->port_num > rdma_end_port(device)) + return -EINVAL; - if (!rdma_cap_eth_ah(qp->device, qp_attr->ah_attr.port_num)) - return 0; + if (!rdma_cap_eth_ah(device, ah_attr->port_num)) + return 0; - if (rdma_link_local_addr((struct in6_addr *)qp_attr->ah_attr.grh.dgid.raw)) { - rdma_get_ll_mac((struct in6_addr *)qp_attr->ah_attr.grh.dgid.raw, - qp_attr->ah_attr.dmac); - } else { - union ib_gid sgid; - struct ib_gid_attr sgid_attr; - int hop_limit; + if (rdma_link_local_addr((struct in6_addr *)ah_attr->grh.dgid.raw)) { + rdma_get_ll_mac((struct in6_addr *)ah_attr->grh.dgid.raw, + ah_attr->dmac); + } else { + union ib_gid sgid; + struct ib_gid_attr sgid_attr; + int hop_limit; - ret = ib_query_gid(qp->device, - qp_attr->ah_attr.port_num, - qp_attr->ah_attr.grh.sgid_index, - &sgid, &sgid_attr); + ret = ib_query_gid(device, + ah_attr->port_num, + ah_attr->grh.sgid_index, + &sgid, &sgid_attr); - if (ret || !sgid_attr.ndev) { - if (!ret) - ret = -ENXIO; - goto out; - } + if (ret || !sgid_attr.ndev) { + if (!ret) + ret = -ENXIO; + goto out; + } - ret = rdma_addr_find_l2_eth_by_grh(&sgid, - &qp_attr->ah_attr.grh.dgid, - qp_attr->ah_attr.dmac, - sgid_attr.ndev, &hop_limit); + ret = rdma_addr_find_l2_eth_by_grh(&sgid, + &ah_attr->grh.dgid, + ah_attr->dmac, + sgid_attr.ndev, &hop_limit); - dev_put(sgid_attr.ndev); + dev_put(sgid_attr.ndev); - qp_attr->ah_attr.grh.hop_limit = hop_limit; - } + ah_attr->grh.hop_limit = hop_limit; } out: return ret; @@ -1213,11 +1211,13 @@ int ib_modify_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr, int qp_attr_mask) { - int ret; + if (qp_attr_mask & IB_QP_AV) { + int ret; - ret = ib_resolve_eth_dmac(qp, qp_attr, &qp_attr_mask); - if (ret) - return ret; + ret = ib_resolve_eth_dmac(qp->device, &qp_attr->ah_attr); + if (ret) + return ret; + } return qp->device->modify_qp(qp->real_qp, qp_attr, qp_attr_mask, NULL); } Modified: stable/11/sys/ofed/include/rdma/ib_verbs.h ============================================================================== --- stable/11/sys/ofed/include/rdma/ib_verbs.h Fri Mar 30 18:33:30 2018 (r331783) +++ stable/11/sys/ofed/include/rdma/ib_verbs.h Fri Mar 30 18:36:44 2018 (r331784) @@ -1940,7 +1940,8 @@ struct ib_device { struct ib_udata *udata); int (*dealloc_pd)(struct ib_pd *pd); struct ib_ah * (*create_ah)(struct ib_pd *pd, - struct ib_ah_attr *ah_attr); + struct ib_ah_attr *ah_attr, + struct ib_udata *udata); int (*modify_ah)(struct ib_ah *ah, struct ib_ah_attr *ah_attr); int (*query_ah)(struct ib_ah *ah, @@ -3368,4 +3369,7 @@ int ib_sg_to_pages(struct ib_mr *mr, struct scatterlis void ib_drain_rq(struct ib_qp *qp); void ib_drain_sq(struct ib_qp *qp); void ib_drain_qp(struct ib_qp *qp); + +int ib_resolve_eth_dmac(struct ib_device *device, + struct ib_ah_attr *ah_attr); #endif /* IB_VERBS_H */ Modified: stable/11/sys/ofed/include/uapi/rdma/mlx5-abi.h ============================================================================== --- stable/11/sys/ofed/include/uapi/rdma/mlx5-abi.h Fri Mar 30 18:33:30 2018 (r331783) +++ stable/11/sys/ofed/include/uapi/rdma/mlx5-abi.h Fri Mar 30 18:36:44 2018 (r331784) @@ -90,6 +90,7 @@ enum mlx5_ib_alloc_ucontext_resp_mask { enum mlx5_user_cmds_supp_uhw { MLX5_USER_CMDS_SUPP_UHW_QUERY_DEVICE = 1 << 0, + MLX5_USER_CMDS_SUPP_UHW_CREATE_AH = 1 << 1, }; struct mlx5_ib_alloc_ucontext_resp { @@ -238,6 +239,12 @@ struct mlx5_ib_create_wq { __u32 flags; __u32 comp_mask; __u32 reserved; +}; + +struct mlx5_ib_create_ah_resp { + __u32 response_length; + __u8 dmac[ETH_ALEN]; + __u8 reserved[6]; }; struct mlx5_ib_create_wq_resp {