From owner-svn-src-stable@freebsd.org Fri Mar 30 18:33:31 2018 Return-Path: Delivered-To: svn-src-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4A2C4F55106; Fri, 30 Mar 2018 18:33:31 +0000 (UTC) (envelope-from hselasky@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id EC64775CC4; Fri, 30 Mar 2018 18:33:30 +0000 (UTC) (envelope-from hselasky@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id E74054B27; Fri, 30 Mar 2018 18:33:30 +0000 (UTC) (envelope-from hselasky@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w2UIXUig077356; Fri, 30 Mar 2018 18:33:30 GMT (envelope-from hselasky@FreeBSD.org) Received: (from hselasky@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w2UIXUGD077351; Fri, 30 Mar 2018 18:33:30 GMT (envelope-from hselasky@FreeBSD.org) Message-Id: <201803301833.w2UIXUGD077351@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: hselasky set sender to hselasky@FreeBSD.org using -f From: Hans Petter Selasky Date: Fri, 30 Mar 2018 18:33:30 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r331783 - in stable/11/sys/ofed: drivers/infiniband/core include/rdma X-SVN-Group: stable-11 X-SVN-Commit-Author: hselasky X-SVN-Commit-Paths: in stable/11/sys/ofed: drivers/infiniband/core include/rdma X-SVN-Commit-Revision: 331783 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: SVN commit messages for all the -stable branches of the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Mar 2018 18:33:31 -0000 Author: hselasky Date: Fri Mar 30 18:33:30 2018 New Revision: 331783 URL: https://svnweb.freebsd.org/changeset/base/331783 Log: MFC r330507: Get correct network device when accepting incoming RDMA connections in ibcore. This patch ensures the GID index is always used as a basis of resolving incoming RDMA connections, as compared to the GID value itself. Background: On a per infiniband port basis, the GID identifier is not a unique identifier! This assumption falls apart when VLAN ID, IPv6 scope ID and RoCE type, as supported by RoCE v2, is taken into account. This additional information is stored in the so-called GID attributes and is needed to correctly identify the destination network interface for an incoming connection. Different VLANs are allowed to define the same IPv4 addresses and especially for the default IPv6 link-local addresses or when using so-called containers or jails, this is true. The VNET information for the destination network interface is needed in order to perform the L2 address lookup in the right Virtual Network Stack context. Consequently old functions previously used by RoCE v1, like rdma_addr_find_smac_by_sgid() are impossible to support, because there can be multiple identical GIDs associated with the same infiniband port, and the answer to such a request becomes undefined. This function has been removed. Sponsored by: Mellanox Technologies Modified: stable/11/sys/ofed/drivers/infiniband/core/ib_addr.c stable/11/sys/ofed/drivers/infiniband/core/ib_cma.c stable/11/sys/ofed/drivers/infiniband/core/ib_verbs.c stable/11/sys/ofed/include/rdma/ib_addr.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/ofed/drivers/infiniband/core/ib_addr.c ============================================================================== --- stable/11/sys/ofed/drivers/infiniband/core/ib_addr.c Fri Mar 30 18:32:38 2018 (r331782) +++ stable/11/sys/ofed/drivers/infiniband/core/ib_addr.c Fri Mar 30 18:33:30 2018 (r331783) @@ -151,8 +151,7 @@ int rdma_copy_addr(struct rdma_dev_addr *dev_addr, str EXPORT_SYMBOL(rdma_copy_addr); int rdma_translate_ip(const struct sockaddr *addr, - struct rdma_dev_addr *dev_addr, - u16 *vlan_id) + struct rdma_dev_addr *dev_addr) { struct net_device *dev = NULL; int ret = -EADDRNOTAVAIL; @@ -185,8 +184,6 @@ int rdma_translate_ip(const struct sockaddr *addr, if (dev != NULL) { ret = rdma_copy_addr(dev_addr, dev, NULL); - if (vlan_id) - *vlan_id = rdma_vlan_dev_vlan_id(dev); dev_put(dev); } return ret; @@ -305,6 +302,8 @@ static int addr4_resolve(struct sockaddr_in *src_in, if (rte->rt_ifp->if_flags & IFF_LOOPBACK) { ifp = rte->rt_ifp; dev_hold(ifp); + } else if (addr->bound_dev_if != 0) { + ifp = dev_get_by_index(addr->net, addr->bound_dev_if); } else { ifp = ip_dev_find(addr->net, src_in->sin_addr.s_addr); } @@ -460,6 +459,8 @@ static int addr6_resolve(struct sockaddr_in6 *src_in, if (rte->rt_ifp->if_flags & IFF_LOOPBACK) { ifp = rte->rt_ifp; dev_hold(ifp); + } else if (addr->bound_dev_if != 0) { + ifp = dev_get_by_index(addr->net, addr->bound_dev_if); } else { ifp = ip6_dev_find(addr->net, src_in->sin6_addr); } @@ -551,7 +552,7 @@ static int addr_resolve_neigh(struct ifnet *dev, if (dev->if_flags & IFF_LOOPBACK) { int ret; - ret = rdma_translate_ip(dst_in, addr, NULL); + ret = rdma_translate_ip(dst_in, addr); if (!ret) memcpy(addr->dst_dev_addr, addr->src_dev_addr, MAX_ADDR_LEN); @@ -757,13 +758,12 @@ static void resolve_cb(int status, struct sockaddr *sr int rdma_addr_find_l2_eth_by_grh(const union ib_gid *sgid, const union ib_gid *dgid, - u8 *dmac, u16 *vlan_id, int *if_index, + u8 *dmac, struct net_device *dev, int *hoplimit) { int ret = 0; struct rdma_dev_addr dev_addr; struct resolve_cb_context ctx; - struct net_device *dev; union { struct sockaddr _sockaddr; @@ -771,15 +771,14 @@ int rdma_addr_find_l2_eth_by_grh(const union ib_gid *s struct sockaddr_in6 _sockaddr_in6; } sgid_addr, dgid_addr; - rdma_gid2ip(&sgid_addr._sockaddr, sgid); rdma_gid2ip(&dgid_addr._sockaddr, dgid); memset(&dev_addr, 0, sizeof(dev_addr)); - if (if_index) - dev_addr.bound_dev_if = *if_index; - dev_addr.net = TD_TO_VNET(curthread); + dev_addr.bound_dev_if = dev->if_index; + dev_addr.net = dev_net(dev); + ctx.addr = &dev_addr; init_completion(&ctx.comp); ret = rdma_resolve_ip(&self, &sgid_addr._sockaddr, &dgid_addr._sockaddr, @@ -794,42 +793,11 @@ int rdma_addr_find_l2_eth_by_grh(const union ib_gid *s return ret; memcpy(dmac, dev_addr.dst_dev_addr, ETH_ALEN); - dev = dev_get_by_index(dev_addr.net, dev_addr.bound_dev_if); - if (!dev) - return -ENODEV; - if (if_index) - *if_index = dev_addr.bound_dev_if; - if (vlan_id) - *vlan_id = rdma_vlan_dev_vlan_id(dev); if (hoplimit) *hoplimit = dev_addr.hoplimit; - dev_put(dev); return ret; } EXPORT_SYMBOL(rdma_addr_find_l2_eth_by_grh); - -int rdma_addr_find_smac_by_sgid(union ib_gid *sgid, u8 *smac, u16 *vlan_id) -{ - int ret = 0; - struct rdma_dev_addr dev_addr; - union { - struct sockaddr _sockaddr; - struct sockaddr_in _sockaddr_in; - struct sockaddr_in6 _sockaddr_in6; - } gid_addr; - - rdma_gid2ip(&gid_addr._sockaddr, sgid); - - memset(&dev_addr, 0, sizeof(dev_addr)); - dev_addr.net = TD_TO_VNET(curthread); - ret = rdma_translate_ip(&gid_addr._sockaddr, &dev_addr, vlan_id); - if (ret) - return ret; - - memcpy(smac, dev_addr.src_dev_addr, ETH_ALEN); - return ret; -} -EXPORT_SYMBOL(rdma_addr_find_smac_by_sgid); int addr_init(void) { Modified: stable/11/sys/ofed/drivers/infiniband/core/ib_cma.c ============================================================================== --- stable/11/sys/ofed/drivers/infiniband/core/ib_cma.c Fri Mar 30 18:32:38 2018 (r331782) +++ stable/11/sys/ofed/drivers/infiniband/core/ib_cma.c Fri Mar 30 18:33:30 2018 (r331783) @@ -533,7 +533,7 @@ static int cma_translate_addr(struct sockaddr *addr, s int ret; if (addr->sa_family != AF_IB) { - ret = rdma_translate_ip(addr, dev_addr, NULL); + ret = rdma_translate_ip(addr, dev_addr); } else { cma_translate_ib((struct sockaddr_ib *) addr, dev_addr); ret = 0; @@ -2094,7 +2094,7 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id, mutex_lock_nested(&conn_id->handler_mutex, SINGLE_DEPTH_NESTING); conn_id->state = RDMA_CM_CONNECT; - ret = rdma_translate_ip(laddr, &conn_id->id.route.addr.dev_addr, NULL); + ret = rdma_translate_ip(laddr, &conn_id->id.route.addr.dev_addr); if (ret) { mutex_unlock(&conn_id->handler_mutex); rdma_destroy_id(new_cm_id); Modified: stable/11/sys/ofed/drivers/infiniband/core/ib_verbs.c ============================================================================== --- stable/11/sys/ofed/drivers/infiniband/core/ib_verbs.c Fri Mar 30 18:32:38 2018 (r331782) +++ stable/11/sys/ofed/drivers/infiniband/core/ib_verbs.c Fri Mar 30 18:33:30 2018 (r331783) @@ -483,56 +483,29 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 po return ret; if (rdma_protocol_roce(device, port_num)) { - int if_index; - u16 vlan_id = wc->wc_flags & IB_WC_WITH_VLAN ? + struct ib_gid_attr dgid_attr; + const u16 vlan_id = wc->wc_flags & IB_WC_WITH_VLAN ? wc->vlan_id : 0xffff; - struct net_device *idev; - struct net_device *resolved_dev; if (!(wc->wc_flags & IB_WC_GRH)) return -EPROTOTYPE; - if (!device->get_netdev) - return -EOPNOTSUPP; - - idev = device->get_netdev(device, port_num); - if (!idev) - return -ENODEV; - - /* - * Get network interface index early on. This is - * useful for IPv6 link local addresses: - */ - if_index = idev->if_index; - - ret = rdma_addr_find_l2_eth_by_grh(&dgid, &sgid, - ah_attr->dmac, - wc->wc_flags & IB_WC_WITH_VLAN ? - NULL : &vlan_id, - &if_index, &hoplimit); - if (ret) { - dev_put(idev); + ret = get_sgid_index_from_eth(device, port_num, vlan_id, + &dgid, gid_type, &gid_index); + if (ret) return ret; - } - resolved_dev = dev_get_by_index(&init_net, if_index); - if (resolved_dev->if_flags & IFF_LOOPBACK) { - dev_put(resolved_dev); - resolved_dev = idev; - dev_hold(resolved_dev); - } - rcu_read_lock(); - if (resolved_dev != idev && !rdma_is_upper_dev_rcu(idev, - resolved_dev)) - ret = -EHOSTUNREACH; - rcu_read_unlock(); - dev_put(idev); - dev_put(resolved_dev); + ret = ib_get_cached_gid(device, port_num, gid_index, &dgid, &dgid_attr); if (ret) return ret; - ret = get_sgid_index_from_eth(device, port_num, vlan_id, - &dgid, gid_type, &gid_index); + if (dgid_attr.ndev == NULL) + return -ENODEV; + + ret = rdma_addr_find_l2_eth_by_grh(&dgid, &sgid, ah_attr->dmac, + dgid_attr.ndev, &hoplimit); + + dev_put(dgid_attr.ndev); if (ret) return ret; } @@ -1207,7 +1180,6 @@ int ib_resolve_eth_dmac(struct ib_qp *qp, } else { union ib_gid sgid; struct ib_gid_attr sgid_attr; - int ifindex; int hop_limit; ret = ib_query_gid(qp->device, @@ -1221,12 +1193,10 @@ int ib_resolve_eth_dmac(struct ib_qp *qp, goto out; } - ifindex = sgid_attr.ndev->if_index; - ret = rdma_addr_find_l2_eth_by_grh(&sgid, &qp_attr->ah_attr.grh.dgid, qp_attr->ah_attr.dmac, - NULL, &ifindex, &hop_limit); + sgid_attr.ndev, &hop_limit); dev_put(sgid_attr.ndev); Modified: stable/11/sys/ofed/include/rdma/ib_addr.h ============================================================================== --- stable/11/sys/ofed/include/rdma/ib_addr.h Fri Mar 30 18:32:38 2018 (r331782) +++ stable/11/sys/ofed/include/rdma/ib_addr.h Fri Mar 30 18:33:30 2018 (r331783) @@ -93,10 +93,10 @@ struct rdma_dev_addr { * rdma_translate_ip - Translate a local IP address to an RDMA hardware * address. * - * The dev_addr->net field must be initialized. + * The dev_addr->net and dev_addr->bound_dev_if fields must be initialized. */ int rdma_translate_ip(const struct sockaddr *addr, - struct rdma_dev_addr *dev_addr, u16 *vlan_id); + struct rdma_dev_addr *dev_addr); /** * rdma_resolve_ip - Resolve source and destination IP addresses to @@ -132,10 +132,9 @@ int rdma_copy_addr(struct rdma_dev_addr *dev_addr, str int rdma_addr_size(struct sockaddr *addr); -int rdma_addr_find_smac_by_sgid(union ib_gid *sgid, u8 *smac, u16 *vlan_id); int rdma_addr_find_l2_eth_by_grh(const union ib_gid *sgid, const union ib_gid *dgid, - u8 *smac, u16 *vlan_id, int *if_index, + u8 *smac, struct net_device *dev, int *hoplimit); static inline u16 ib_addr_get_pkey(struct rdma_dev_addr *dev_addr)