From owner-svn-src-head@freebsd.org Thu Apr 12 07:20:51 2018
Message-Id: <201804120720.w3C7Koe4032111@repo.freebsd.org>
From: Vincenzo Maffione
Date: Thu, 12 Apr 2018 07:20:50 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r332423 - in head/sys: conf dev/cxgbe dev/ixgbe dev/ixl dev/netmap dev/re modules/netmap net sys
X-SVN-Group: head
X-SVN-Commit-Author: vmaffione
X-SVN-Commit-Paths: in head/sys: conf dev/cxgbe dev/ixgbe dev/ixl dev/netmap dev/re modules/netmap net sys
X-SVN-Commit-Revision: 332423
X-SVN-Commit-Repository: base
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: svn-src-head@freebsd.org
X-Mailman-Version: 2.1.25
Precedence: list
List-Id: SVN commit messages for the src tree for head/-current
X-List-Received-Date: Thu, 12 Apr 2018 07:20:52 -0000

Author: vmaffione
Date: Thu Apr 12 07:20:50 2018
New Revision: 332423
URL: https://svnweb.freebsd.org/changeset/base/332423

Log:
  netmap: align codebase to the current upstream (commit id 3fb001303718146)

  Changelist:
      - Turn tx_rings and rx_rings arrays into arrays of pointers to kring
        structs. This patch includes fixes for ixv, ixl, ix, re, cxgbe, iflib,
        vtnet and ptnet drivers to cope with the change.
      - Generalize the nm_config() callback to accept a struct containing many
        parameters.
      - Introduce NKR_FAKERING to support buffers sharing (used for netmap
        pipes)
      - Improved API for external VALE modules.
      - Various bug fixes and improvements to the netmap memory allocator,
        including support for externally (userspace) allocated memory.
      - Refactoring of netmap pipes: now linked rings share the same netmap
        buffers, with a separate set of kring pointers (rhead, rcur, rtail).
        Buffer swapping does not need to happen anymore.
      - Large refactoring of the control API towards an extensible solution;
        the goal is to allow the addition of more commands and extension of
        existing ones (with new options) without the need of hacks or the
        risk of running out of configuration space.
        A new NIOCCTRL ioctl has been added to handle all the requests of the
        new control API, which cover all the functionalities so far supported.
        The netmap API bumps from 11 to 12 with this patch. Full backward
        compatibility is provided for the old control command (NIOCREGIF), by
        means of a new netmap_legacy module.
        Many parts of the old netmap.h header have now been moved to
        netmap_legacy.h (included by netmap.h).

  Approved by:	hrs (mentor)

Added:
  head/sys/dev/netmap/netmap_legacy.c   (contents, props changed)
  head/sys/net/netmap_legacy.h   (contents, props changed)
Modified:
  head/sys/conf/files
  head/sys/dev/cxgbe/t4_netmap.c
  head/sys/dev/ixgbe/if_ixv.c
  head/sys/dev/ixl/ixl_pf_main.c
  head/sys/dev/ixl/ixl_txrx.c
  head/sys/dev/netmap/if_ptnet.c
  head/sys/dev/netmap/if_re_netmap.h
  head/sys/dev/netmap/if_vtnet_netmap.h
  head/sys/dev/netmap/netmap.c
  head/sys/dev/netmap/netmap_freebsd.c
  head/sys/dev/netmap/netmap_generic.c
  head/sys/dev/netmap/netmap_kern.h
  head/sys/dev/netmap/netmap_mem2.c
  head/sys/dev/netmap/netmap_mem2.h
  head/sys/dev/netmap/netmap_monitor.c
  head/sys/dev/netmap/netmap_pipe.c
  head/sys/dev/netmap/netmap_pt.c
  head/sys/dev/netmap/netmap_vale.c
  head/sys/dev/re/if_re.c
  head/sys/modules/netmap/Makefile
  head/sys/net/iflib.c
  head/sys/net/netmap.h
  head/sys/net/netmap_user.h
  head/sys/net/netmap_virt.h
  head/sys/sys/param.h

Modified: head/sys/conf/files
==============================================================================
--- head/sys/conf/files	Thu Apr 12 04:11:37 2018	(r332422)
+++ head/sys/conf/files	Thu Apr 12 07:20:50 2018	(r332423)
@@ -2535,6 +2535,7 @@ dev/netmap/netmap_offloadings.c	optional netmap
 dev/netmap/netmap_pipe.c	optional netmap
 dev/netmap/netmap_pt.c		optional netmap
 dev/netmap/netmap_vale.c	optional netmap
+dev/netmap/netmap_legacy.c	optional netmap
 # compile-with "${NORMAL_C} -Wconversion -Wextra"
 dev/nfsmb/nfsmb.c		optional nfsmb pci
 dev/nge/if_nge.c		optional nge

Modified: head/sys/dev/cxgbe/t4_netmap.c
==============================================================================
--- head/sys/dev/cxgbe/t4_netmap.c	Thu Apr 12 04:11:37 2018	(r332422)
+++ head/sys/dev/cxgbe/t4_netmap.c	Thu Apr 12 07:20:50 2018	(r332423)
@@ -344,7 +344,7 @@ cxgbe_netmap_on(struct adapter *sc, struct vi_info *vi
 	for_each_nm_rxq(vi, i, nm_rxq) {
 		struct irq *irq = &sc->irq[vi->first_intr + i];
 
-		kring = &na->rx_rings[nm_rxq->nid];
+		kring = na->rx_rings[nm_rxq->nid];
 		if (!nm_kring_pending_on(kring) ||
 		    nm_rxq->iq_cntxt_id != INVALID_NM_RXQ_CNTXT_ID)
 			continue;
@@ -375,7 +375,7 @@ cxgbe_netmap_on(struct adapter *sc, struct vi_info *vi
 	}
 
 	for_each_nm_txq(vi, i, nm_txq) {
-		kring = &na->tx_rings[nm_txq->nid];
+		kring = na->tx_rings[nm_txq->nid];
 		if (!nm_kring_pending_on(kring) ||
 		    nm_txq->cntxt_id != INVALID_NM_TXQ_CNTXT_ID)
 			continue;
@@ -427,7 +427,7 @@ cxgbe_netmap_off(struct adapter *sc, struct vi_info *v
 	for_each_nm_txq(vi, i, nm_txq) {
 		struct sge_qstat *spg = (void *)&nm_txq->desc[nm_txq->sidx];
 
-		kring = &na->tx_rings[nm_txq->nid];
+		kring = na->tx_rings[nm_txq->nid];
 		if (!nm_kring_pending_off(kring) ||
 		    nm_txq->cntxt_id == INVALID_NM_TXQ_CNTXT_ID)
 			continue;
@@ -445,7 +445,7 @@ cxgbe_netmap_off(struct adapter *sc, struct vi_info *v
 	for_each_nm_rxq(vi, i, nm_rxq) {
 		struct irq *irq = &sc->irq[vi->first_intr + i];
 
-		kring = &na->rx_rings[nm_rxq->nid];
+		kring = na->rx_rings[nm_rxq->nid];
 		if (!nm_kring_pending_off(kring) ||
 		    nm_rxq->iq_cntxt_id == INVALID_NM_RXQ_CNTXT_ID)
 			continue;
@@ -933,7 +933,7 @@ t4_nm_intr(void *arg)
 	struct adapter *sc = vi->pi->adapter;
 	struct ifnet *ifp = vi->ifp;
 	struct netmap_adapter *na = NA(ifp);
-	struct netmap_kring *kring = &na->rx_rings[nm_rxq->nid];
+	struct netmap_kring *kring = na->rx_rings[nm_rxq->nid];
 	struct netmap_ring *ring = kring->ring;
 	struct iq_desc *d = &nm_rxq->iq_desc[nm_rxq->iq_cidx];
 	const void *cpl;

Modified: head/sys/dev/ixgbe/if_ixv.c
==============================================================================
--- head/sys/dev/ixgbe/if_ixv.c	Thu Apr 12 04:11:37 2018	(r332422)
+++ head/sys/dev/ixgbe/if_ixv.c	Thu Apr 12 07:20:50 2018	(r332423)
@@ -1450,7 +1450,7 @@ ixv_initialize_receive_units(if_ctx_t ctx)
 		 */
 		if (ifp->if_capenable & IFCAP_NETMAP) {
 			struct netmap_adapter *na = NA(ifp);
-			struct netmap_kring *kring = &na->rx_rings[j];
+			struct netmap_kring *kring = na->rx_rings[j];
 			int t = na->num_rx_desc - 1 - nm_kr_rxspace(kring);
 
 			IXGBE_WRITE_REG(hw, IXGBE_VFRDT(rxr->me), t);

Modified: head/sys/dev/ixl/ixl_pf_main.c
==============================================================================
--- head/sys/dev/ixl/ixl_pf_main.c	Thu Apr 12 04:11:37 2018	(r332422)
+++ head/sys/dev/ixl/ixl_pf_main.c	Thu Apr 12 07:20:50 2018	(r332423)
@@ -2240,7 +2240,7 @@ ixl_initialize_vsi(struct ixl_vsi *vsi)
 		/* preserve queue */
 		if (vsi->ifp->if_capenable & IFCAP_NETMAP) {
 			struct netmap_adapter *na = NA(vsi->ifp);
-			struct netmap_kring *kring = &na->rx_rings[i];
+			struct netmap_kring *kring = na->rx_rings[i];
 			int t = na->num_rx_desc - 1 - nm_kr_rxspace(kring);
 			wr32(vsi->hw, I40E_QRX_TAIL(que->me), t);
 		} else

Modified: head/sys/dev/ixl/ixl_txrx.c
==============================================================================
--- head/sys/dev/ixl/ixl_txrx.c	Thu Apr 12 04:11:37 2018	(r332422)
+++ head/sys/dev/ixl/ixl_txrx.c	Thu Apr 12 07:20:50 2018	(r332423)
@@ -547,7 +547,7 @@ ixl_init_tx_ring(struct ixl_queue *que)
 		 * netmap slot index, si
 		 */
 		if (slot) {
-			int si = netmap_idx_n2k(&na->tx_rings[que->me], i);
+			int si = netmap_idx_n2k(na->tx_rings[que->me], i);
 			netmap_load_map(na, buf->tag, buf->map, NMB(na, slot + si));
 		}
 #endif /* DEV_NETMAP */
@@ -1214,7 +1214,7 @@ ixl_init_rx_ring(struct ixl_queue *que)
 		 * an mbuf, so end the block with a continue;
 		 */
 		if (slot) {
-			int sj = netmap_idx_n2k(&na->rx_rings[que->me], j);
+			int sj = netmap_idx_n2k(na->rx_rings[que->me], j);
 			uint64_t paddr;
 			void *addr;
 

Modified: head/sys/dev/netmap/if_ptnet.c
==============================================================================
--- head/sys/dev/netmap/if_ptnet.c	Thu Apr 12 04:11:37 2018	(r332422)
+++ head/sys/dev/netmap/if_ptnet.c	Thu Apr 12 07:20:50 2018	(r332423)
@@ -210,8 +210,8 @@ static int	ptnet_irqs_init(struct ptnet_softc *sc);
 static void	ptnet_irqs_fini(struct ptnet_softc *sc);
 
 static uint32_t ptnet_nm_ptctl(if_t ifp, uint32_t cmd);
-static int	ptnet_nm_config(struct netmap_adapter *na, unsigned *txr,
-				unsigned *txd, unsigned *rxr, unsigned *rxd);
+static int	ptnet_nm_config(struct netmap_adapter *na,
+				struct nm_config_info *info);
 static void	ptnet_update_vnet_hdr(struct ptnet_softc *sc);
 static int	ptnet_nm_register(struct netmap_adapter *na, int onoff);
 static int	ptnet_nm_txsync(struct netmap_kring *kring, int flags);
@@ -1104,18 +1104,20 @@ ptnet_nm_ptctl(if_t ifp, uint32_t cmd)
 }
 
 static int
-ptnet_nm_config(struct netmap_adapter *na, unsigned *txr, unsigned *txd,
-		unsigned *rxr, unsigned *rxd)
+ptnet_nm_config(struct netmap_adapter *na, struct nm_config_info *info)
 {
 	struct ptnet_softc *sc = if_getsoftc(na->ifp);
 
-	*txr = bus_read_4(sc->iomem, PTNET_IO_NUM_TX_RINGS);
-	*rxr = bus_read_4(sc->iomem, PTNET_IO_NUM_RX_RINGS);
-	*txd = bus_read_4(sc->iomem, PTNET_IO_NUM_TX_SLOTS);
-	*rxd = bus_read_4(sc->iomem, PTNET_IO_NUM_RX_SLOTS);
+	info->num_tx_rings = bus_read_4(sc->iomem, PTNET_IO_NUM_TX_RINGS);
+	info->num_rx_rings = bus_read_4(sc->iomem, PTNET_IO_NUM_RX_RINGS);
+	info->num_tx_descs = bus_read_4(sc->iomem, PTNET_IO_NUM_TX_SLOTS);
+	info->num_rx_descs = bus_read_4(sc->iomem, PTNET_IO_NUM_RX_SLOTS);
+	info->rx_buf_maxsize = NETMAP_BUF_SIZE(na);
 
-	device_printf(sc->dev, "txr %u, rxr %u, txd %u, rxd %u\n",
-		      *txr, *rxr, *txd, *rxd);
+	device_printf(sc->dev, "txr %u, rxr %u, txd %u, rxd %u, rxbufsz %u\n",
+			info->num_tx_rings, info->num_rx_rings,
+			info->num_tx_descs, info->num_rx_descs,
+			info->rx_buf_maxsize);
 
 	return 0;
 }
@@ -1133,9 +1135,9 @@ ptnet_sync_from_csb(struct ptnet_softc *sc, struct net
 		struct netmap_kring *kring;
 
 		if (i < na->num_tx_rings) {
-			kring = na->tx_rings + i;
+			kring = na->tx_rings[i];
 		} else {
-			kring = na->rx_rings + i - na->num_tx_rings;
+			kring = na->rx_rings[i - na->num_tx_rings];
 		}
 		kring->rhead = kring->ring->head = ptgh->head;
 		kring->rcur = kring->ring->cur = ptgh->cur;
@@ -1228,7 +1230,7 @@ ptnet_nm_register(struct netmap_adapter *na, int onoff
 		if (native) {
 			for_rx_tx(t) {
 				for (i = 0; i <= nma_get_nrings(na, t); i++) {
-					struct netmap_kring *kring = &NMR(na, t)[i];
+					struct netmap_kring *kring = NMR(na, t)[i];
 
 					if (nm_kring_pending_on(kring)) {
 						kring->nr_mode = NKR_NETMAP_ON;
@@ -1243,7 +1245,7 @@ ptnet_nm_register(struct netmap_adapter *na, int onoff
 			nm_clear_native_flags(na);
 			for_rx_tx(t) {
 				for (i = 0; i <= nma_get_nrings(na, t); i++) {
-					struct netmap_kring *kring = &NMR(na, t)[i];
+					struct netmap_kring *kring = NMR(na, t)[i];
 
 					if (nm_kring_pending_off(kring)) {
 						kring->nr_mode = NKR_NETMAP_OFF;
@@ -1758,7 +1760,7 @@ ptnet_drain_transmit_queue(struct ptnet_queue *pq, uns
 	ptgh = pq->ptgh;
 	pthg = pq->pthg;
 
-	kring = na->tx_rings + pq->kring_id;
+	kring = na->tx_rings[pq->kring_id];
 	ring = kring->ring;
 	lim = kring->nkr_num_slots - 1;
 	head = ring->head;
@@ -2021,7 +2023,7 @@ ptnet_rx_eof(struct ptnet_queue *pq, unsigned int budg
 	struct ptnet_csb_gh *ptgh = pq->ptgh;
 	struct ptnet_csb_hg *pthg = pq->pthg;
 	struct netmap_adapter *na = &sc->ptna->dr.up;
-	struct netmap_kring *kring = na->rx_rings + pq->kring_id;
+	struct netmap_kring *kring = na->rx_rings[pq->kring_id];
 	struct netmap_ring *ring = kring->ring;
 	unsigned int const lim = kring->nkr_num_slots - 1;
 	unsigned int batch_count = 0;

Modified: head/sys/dev/netmap/if_re_netmap.h
==============================================================================
--- head/sys/dev/netmap/if_re_netmap.h	Thu Apr 12 04:11:37 2018	(r332422)
+++ head/sys/dev/netmap/if_re_netmap.h	Thu Apr 12 07:20:50 2018	(r332423)
@@ -304,7 +304,7 @@ re_netmap_tx_init(struct rl_softc *sc)
 	/* l points in the netmap ring, i points in the NIC ring */
 	for (i = 0; i < n; i++) {
 		uint64_t paddr;
-		int l = netmap_idx_n2k(&na->tx_rings[0], i);
+		int l = netmap_idx_n2k(na->tx_rings[0], i);
 		void *addr = PNMB(na, slot + l, &paddr);
 
 		desc[i].rl_bufaddr_lo = htole32(RL_ADDR_LO(paddr));
@@ -330,11 +330,11 @@ re_netmap_rx_init(struct rl_softc *sc)
 	 * Do not release the slots owned by userspace,
 	 * and also keep one empty.
 	 */
-	max_avail = n - 1 - nm_kr_rxspace(&na->rx_rings[0]);
+	max_avail = n - 1 - nm_kr_rxspace(na->rx_rings[0]);
 	for (nic_i = 0; nic_i < n; nic_i++) {
 		void *addr;
 		uint64_t paddr;
-		uint32_t nm_i = netmap_idx_n2k(&na->rx_rings[0], nic_i);
+		uint32_t nm_i = netmap_idx_n2k(na->rx_rings[0], nic_i);
 
 		addr = PNMB(na, slot + nm_i, &paddr);
 

Modified: head/sys/dev/netmap/if_vtnet_netmap.h
==============================================================================
--- head/sys/dev/netmap/if_vtnet_netmap.h	Thu Apr 12 04:11:37 2018	(r332422)
+++ head/sys/dev/netmap/if_vtnet_netmap.h	Thu Apr 12 07:20:50 2018	(r332423)
@@ -383,7 +383,7 @@ vtnet_netmap_init_rx_buffers(struct SOFTC_T *sc)
 	if (!nm_native_on(na))
 		return 0;
 	for (r = 0; r < na->num_rx_rings; r++) {
-		struct netmap_kring *kring = &na->rx_rings[r];
+		struct netmap_kring *kring = na->rx_rings[r];
 		struct vtnet_rxq *rxq = &sc->vtnet_rxqs[r];
 		struct virtqueue *vq = rxq->vtnrx_vq;
 		struct netmap_slot* slot;
@@ -407,29 +407,6 @@ vtnet_netmap_init_rx_buffers(struct SOFTC_T *sc)
 	return 1;
 }
 
-/* Update the virtio-net device configurations. Number of queues can
- * change dinamically, by 'ethtool --set-channels $IFNAME combined $N'.
- * This is actually the only way virtio-net can currently enable
- * the multiqueue mode.
- * XXX note that we seem to lose packets if the netmap ring has more
- * slots than the queue
- */
-static int
-vtnet_netmap_config(struct netmap_adapter *na, u_int *txr, u_int *txd,
-						u_int *rxr, u_int *rxd)
-{
-	struct ifnet *ifp = na->ifp;
-	struct SOFTC_T *sc = ifp->if_softc;
-
-	*txr = *rxr = sc->vtnet_max_vq_pairs;
-	*rxd = 512; // sc->vtnet_rx_nmbufs;
-	*txd = *rxd; // XXX
-	D("vtnet config txq=%d, txd=%d rxq=%d, rxd=%d",
-					*txr, *txd, *rxr, *rxd);
-
-	return 0;
-}
-
 static void
 vtnet_netmap_attach(struct SOFTC_T *sc)
 {
@@ -443,7 +420,6 @@ vtnet_netmap_attach(struct SOFTC_T *sc)
 	na.nm_register = vtnet_netmap_reg;
 	na.nm_txsync = vtnet_netmap_txsync;
 	na.nm_rxsync = vtnet_netmap_rxsync;
-	na.nm_config = vtnet_netmap_config;
 	na.nm_intr = vtnet_netmap_intr;
 	na.num_tx_rings = na.num_rx_rings = sc->vtnet_max_vq_pairs;
 	D("max rings %d", sc->vtnet_max_vq_pairs);

Modified: head/sys/dev/netmap/netmap.c
==============================================================================
--- head/sys/dev/netmap/netmap.c	Thu Apr 12 04:11:37 2018	(r332422)
+++ head/sys/dev/netmap/netmap.c	Thu Apr 12 07:20:50 2018	(r332423)
@@ -262,7 +262,7 @@ ports attached to the switch)
  *
  *      Any network interface known to the system (including a persistent VALE
  *      port) can be attached to a VALE switch by issuing the
- *      NETMAP_BDG_ATTACH subcommand. After the attachment, persistent VALE ports
+ *      NETMAP_REQ_VALE_ATTACH command. After the attachment, persistent VALE ports
  *      look exactly like ephemeral VALE ports (as created in step 2 above).  The
  *      attachment of other interfaces, instead, requires the creation of a
  *      netmap_bwrap_adapter.  Moreover, the attached interface must be put in
@@ -591,9 +591,9 @@ void
 netmap_set_ring(struct netmap_adapter *na, u_int ring_id, enum txrx t, int stopped)
 {
 	if (stopped)
-		netmap_disable_ring(NMR(na, t) + ring_id, stopped);
+		netmap_disable_ring(NMR(na, t)[ring_id], stopped);
 	else
-		NMR(na, t)[ring_id].nkr_stopped = 0;
+		NMR(na, t)[ring_id]->nkr_stopped = 0;
 }
 
 
@@ -745,39 +745,42 @@ nm_dump_buf(char *p, int len, int lim, char *dst)
 int
 netmap_update_config(struct netmap_adapter *na)
 {
-	u_int txr, txd, rxr, rxd;
+	struct nm_config_info info;
 
-	txr = txd = rxr = rxd = 0;
+	bzero(&info, sizeof(info));
 	if (na->nm_config == NULL ||
-	    na->nm_config(na, &txr, &txd, &rxr, &rxd))
-	{
+	    na->nm_config(na, &info)) {
 		/* take whatever we had at init time */
-		txr = na->num_tx_rings;
-		txd = na->num_tx_desc;
-		rxr = na->num_rx_rings;
-		rxd = na->num_rx_desc;
+		info.num_tx_rings = na->num_tx_rings;
+		info.num_tx_descs = na->num_tx_desc;
+		info.num_rx_rings = na->num_rx_rings;
+		info.num_rx_descs = na->num_rx_desc;
+		info.rx_buf_maxsize = na->rx_buf_maxsize;
 	}
 
-	if (na->num_tx_rings == txr && na->num_tx_desc == txd &&
-	    na->num_rx_rings == rxr && na->num_rx_desc == rxd)
+	if (na->num_tx_rings == info.num_tx_rings &&
+	    na->num_tx_desc == info.num_tx_descs &&
+	    na->num_rx_rings == info.num_rx_rings &&
+	    na->num_rx_desc == info.num_rx_descs &&
+	    na->rx_buf_maxsize == info.rx_buf_maxsize)
 		return 0; /* nothing changed */
-	if (netmap_verbose || na->active_fds > 0) {
-		D("stored config %s: txring %d x %d, rxring %d x %d",
-			na->name,
-			na->num_tx_rings, na->num_tx_desc,
-			na->num_rx_rings, na->num_rx_desc);
-		D("new config %s: txring %d x %d, rxring %d x %d",
-			na->name, txr, txd, rxr, rxd);
-	}
 	if (na->active_fds == 0) {
-		D("configuration changed (but fine)");
-		na->num_tx_rings = txr;
-		na->num_tx_desc = txd;
-		na->num_rx_rings = rxr;
-		na->num_rx_desc = rxd;
+		D("configuration changed for %s: txring %d x %d, "
+			"rxring %d x %d, rxbufsz %d",
+			na->name, na->num_tx_rings, na->num_tx_desc,
+			na->num_rx_rings, na->num_rx_desc, na->rx_buf_maxsize);
+		na->num_tx_rings = info.num_tx_rings;
+		na->num_tx_desc = info.num_tx_descs;
+		na->num_rx_rings = info.num_rx_rings;
+		na->num_rx_desc = info.num_rx_descs;
+		na->rx_buf_maxsize = info.rx_buf_maxsize;
 		return 0;
 	}
-	D("configuration changed while active, this is bad...");
+	D("WARNING: configuration changed for %s while active: "
+		"txring %d x %d, rxring %d x %d, rxbufsz %d",
+		na->name, info.num_tx_rings, info.num_tx_descs,
+		info.num_rx_rings, info.num_rx_descs,
+		info.rx_buf_maxsize);
 	return 1;
 }
 
@@ -827,7 +830,9 @@ netmap_krings_create(struct netmap_adapter *na, u_int
 	n[NR_TX] = na->num_tx_rings + 1;
 	n[NR_RX] = na->num_rx_rings + 1;
 
-	len = (n[NR_TX] + n[NR_RX]) * sizeof(struct netmap_kring) + tailroom;
+	len = (n[NR_TX] + n[NR_RX]) *
+		(sizeof(struct netmap_kring) + sizeof(struct netmap_kring *))
+		+ tailroom;
 
 	na->tx_rings = nm_os_malloc((size_t)len);
 	if (na->tx_rings == NULL) {
@@ -835,6 +840,14 @@ netmap_krings_create(struct netmap_adapter *na, u_int
 		return ENOMEM;
 	}
 	na->rx_rings = na->tx_rings + n[NR_TX];
+	na->tailroom = na->rx_rings + n[NR_RX];
+
+	/* link the krings in the krings array */
+	kring = (struct netmap_kring *)((char *)na->tailroom + tailroom);
+	for (i = 0; i < n[NR_TX] + n[NR_RX]; i++) {
+		na->tx_rings[i] = kring;
+		kring++;
+	}
 
 	/*
 	 * All fields in krings are 0 except the one initialized below.
@@ -843,9 +856,10 @@ netmap_krings_create(struct netmap_adapter *na, u_int
 	for_rx_tx(t) {
 		ndesc = nma_get_ndesc(na, t);
 		for (i = 0; i < n[t]; i++) {
-			kring = &NMR(na, t)[i];
+			kring = NMR(na, t)[i];
 			bzero(kring, sizeof(*kring));
 			kring->na = na;
+			kring->notify_na = na;
 			kring->ring_id = i;
 			kring->tx = t;
 			kring->nkr_num_slots = ndesc;
@@ -854,6 +868,8 @@ netmap_krings_create(struct netmap_adapter *na, u_int
 			if (i < nma_get_nrings(na, t)) {
 				kring->nm_sync = (t == NR_TX ? na->nm_txsync : na->nm_rxsync);
 			} else {
+				if (!(na->na_flags & NAF_HOST_RINGS))
+					kring->nr_kflags |= NKR_FAKERING;
 				kring->nm_sync = (t == NR_TX ?
 						netmap_txsync_to_host:
 						netmap_rxsync_from_host);
@@ -874,7 +890,6 @@ netmap_krings_create(struct netmap_adapter *na, u_int
 		nm_os_selinfo_init(&na->si[t]);
 	}
 
-	na->tailroom = na->rx_rings + n[NR_RX];
 
 	return 0;
 }
@@ -885,7 +900,7 @@ netmap_krings_create(struct netmap_adapter *na, u_int
 void
 netmap_krings_delete(struct netmap_adapter *na)
 {
-	struct netmap_kring *kring = na->tx_rings;
+	struct netmap_kring **kring = na->tx_rings;
 	enum txrx t;
 
 	if (na->tx_rings == NULL) {
@@ -898,8 +913,8 @@ netmap_krings_delete(struct netmap_adapter *na)
 
 	/* we rely on the krings layout described above */
 	for ( ; kring != na->tailroom; kring++) {
-		mtx_destroy(&kring->q_lock);
-		nm_os_selinfo_uninit(&kring->si);
+		mtx_destroy(&(*kring)->q_lock);
+		nm_os_selinfo_uninit(&(*kring)->si);
 	}
 	nm_os_free(na->tx_rings);
 	na->tx_rings = na->rx_rings = na->tailroom = NULL;
@@ -915,7 +930,7 @@ netmap_krings_delete(struct netmap_adapter *na)
 void
 netmap_hw_krings_delete(struct netmap_adapter *na)
 {
-	struct mbq *q = &na->rx_rings[na->num_rx_rings].rx_queue;
+	struct mbq *q = &na->rx_rings[na->num_rx_rings]->rx_queue;
 
 	ND("destroy sw mbq with len %d", mbq_len(q));
 	mbq_purge(q);
@@ -1196,7 +1211,7 @@ nm_may_forward_down(struct netmap_kring *kring, int sy
 static u_int
 netmap_sw_to_nic(struct netmap_adapter *na)
 {
-	struct netmap_kring *kring = &na->rx_rings[na->num_rx_rings];
+	struct netmap_kring *kring = na->rx_rings[na->num_rx_rings];
 	struct netmap_slot *rxslot = kring->ring->slot;
 	u_int i, rxcur = kring->nr_hwcur;
 	u_int const head = kring->rhead;
@@ -1205,7 +1220,7 @@ netmap_sw_to_nic(struct netmap_adapter *na)
 
 	/* scan rings to find space, then fill as much as possible */
 	for (i = 0; i < na->num_tx_rings; i++) {
-		struct netmap_kring *kdst = &na->tx_rings[i];
+		struct netmap_kring *kdst = na->tx_rings[i];
 		struct netmap_ring *rdst = kdst->ring;
 		u_int const dst_lim = kdst->nkr_num_slots - 1;
 
@@ -1443,7 +1458,7 @@ assign_mem:
  * MUST BE CALLED UNDER NMG_LOCK()
  *
  * Get a refcounted reference to a netmap adapter attached
- * to the interface specified by nmr.
+ * to the interface specified by req.
  * This is always called in the execution of an ioctl().
  *
  * Return ENXIO if the interface specified by the request does
@@ -1453,13 +1468,15 @@ assign_mem:
  * could not be allocated.
  * If successful, hold a reference to the netmap adapter.
  *
- * If the interface specified by nmr is a system one, also keep
+ * If the interface specified by req is a system one, also keep
  * a reference to it and return a valid *ifp.
  */
 int
-netmap_get_na(struct nmreq *nmr, struct netmap_adapter **na,
-	      struct ifnet **ifp, struct netmap_mem_d *nmd, int create)
+netmap_get_na(struct nmreq_header *hdr,
+	      struct netmap_adapter **na, struct ifnet **ifp,
+	      struct netmap_mem_d *nmd, int create)
 {
+	struct nmreq_register *req = (struct nmreq_register *)hdr->nr_body;
 	int error = 0;
 	struct netmap_adapter *ret = NULL;
 	int nmd_ref = 0;
@@ -1467,13 +1484,24 @@ netmap_get_na(struct nmreq *nmr, struct netmap_adapter
 	*na = NULL;     /* default return value */
 	*ifp = NULL;
 
+	if (hdr->nr_reqtype != NETMAP_REQ_REGISTER) {
+		return EINVAL;
+	}
+
+	if (req->nr_mode == NR_REG_PIPE_MASTER ||
+			req->nr_mode == NR_REG_PIPE_SLAVE) {
+		/* Do not accept deprecated pipe modes. */
+		D("Deprecated pipe nr_mode, use xx{yy or xx}yy syntax");
+		return EINVAL;
+	}
+
 	NMG_LOCK_ASSERT();
 
 	/* if the request contain a memid, try to find the
 	 * corresponding memory region
 	 */
-	if (nmd == NULL && nmr->nr_arg2) {
-		nmd = netmap_mem_find(nmr->nr_arg2);
+	if (nmd == NULL && req->nr_mem_id) {
+		nmd = netmap_mem_find(req->nr_mem_id);
 		if (nmd == NULL)
 			return EINVAL;
 		/* keep the rereference */
@@ -1492,22 +1520,22 @@ netmap_get_na(struct nmreq *nmr, struct netmap_adapter
 	 */
 
 	/* try to see if this is a ptnetmap port */
-	error = netmap_get_pt_host_na(nmr, na, nmd, create);
+	error = netmap_get_pt_host_na(hdr, na, nmd, create);
 	if (error || *na != NULL)
 		goto out;
 
 	/* try to see if this is a monitor port */
-	error = netmap_get_monitor_na(nmr, na, nmd, create);
+	error = netmap_get_monitor_na(hdr, na, nmd, create);
 	if (error || *na != NULL)
 		goto out;
 
 	/* try to see if this is a pipe port */
-	error = netmap_get_pipe_na(nmr, na, nmd, create);
+	error = netmap_get_pipe_na(hdr, na, nmd, create);
 	if (error || *na != NULL)
 		goto out;
 
 	/* try to see if this is a bridge port */
-	error = netmap_get_bdg_na(nmr, na, nmd, create);
+	error = netmap_get_bdg_na(hdr, na, nmd, create);
 	if (error)
 		goto out;
 
@@ -1520,7 +1548,7 @@ netmap_get_na(struct nmreq *nmr, struct netmap_adapter
 	 * This may still be a tap, a veth/epair, or even a
 	 * persistent VALE port.
 	 */
-	*ifp = ifunit_ref(nmr->nr_name);
+	*ifp = ifunit_ref(hdr->nr_name);
 	if (*ifp == NULL) {
 		error = ENXIO;
 		goto out;
@@ -1765,42 +1793,27 @@ netmap_ring_reinit(struct netmap_kring *kring)
  *
  */
 int
-netmap_interp_ringid(struct netmap_priv_d *priv, uint16_t ringid, uint32_t flags)
+netmap_interp_ringid(struct netmap_priv_d *priv, uint32_t nr_mode,
+			uint16_t nr_ringid, uint64_t nr_flags)
 {
 	struct netmap_adapter *na = priv->np_na;
-	u_int j, i = ringid & NETMAP_RING_MASK;
-	u_int reg = flags & NR_REG_MASK;
 	int excluded_direction[] = { NR_TX_RINGS_ONLY, NR_RX_RINGS_ONLY };
 	enum txrx t;
+	u_int j;
 
-	if (reg == NR_REG_DEFAULT) {
-		/* convert from old ringid to flags */
-		if (ringid & NETMAP_SW_RING) {
-			reg = NR_REG_SW;
-		} else if (ringid & NETMAP_HW_RING) {
-			reg = NR_REG_ONE_NIC;
-		} else {
-			reg = NR_REG_ALL_NIC;
-		}
-		D("deprecated API, old ringid 0x%x -> ringid %x reg %d", ringid, i, reg);
-	}
-
-	if ((flags & NR_PTNETMAP_HOST) && ((reg != NR_REG_ALL_NIC &&
-			reg != NR_REG_PIPE_MASTER && reg != NR_REG_PIPE_SLAVE) ||
-			flags & (NR_RX_RINGS_ONLY|NR_TX_RINGS_ONLY))) {
+	if ((nr_flags & NR_PTNETMAP_HOST) && ((nr_mode != NR_REG_ALL_NIC) ||
+			nr_flags & (NR_RX_RINGS_ONLY|NR_TX_RINGS_ONLY))) {
 		D("Error: only NR_REG_ALL_NIC supported with netmap passthrough");
 		return EINVAL;
 	}
 
 	for_rx_tx(t) {
-		if (flags & excluded_direction[t]) {
+		if (nr_flags & excluded_direction[t]) {
 			priv->np_qfirst[t] = priv->np_qlast[t] = 0;
 			continue;
 		}
-		switch (reg) {
+		switch (nr_mode) {
 		case NR_REG_ALL_NIC:
-		case NR_REG_PIPE_MASTER:
-		case NR_REG_PIPE_SLAVE:
 			priv->np_qfirst[t] = 0;
 			priv->np_qlast[t] = nma_get_nrings(na, t);
 			ND("ALL/PIPE: %s %d %d", nm_txrx2str(t),
@@ -1812,20 +1825,21 @@ netmap_interp_ringid(struct netmap_priv_d *priv, uint1
 				D("host rings not supported");
 				return EINVAL;
 			}
-			priv->np_qfirst[t] = (reg == NR_REG_SW ?
+			priv->np_qfirst[t] = (nr_mode == NR_REG_SW ?
 				nma_get_nrings(na, t) : 0);
 			priv->np_qlast[t] = nma_get_nrings(na, t) + 1;
-			ND("%s: %s %d %d", reg == NR_REG_SW ? "SW" : "NIC+SW",
+			ND("%s: %s %d %d", nr_mode == NR_REG_SW ? "SW" : "NIC+SW",
 				nm_txrx2str(t),
 				priv->np_qfirst[t], priv->np_qlast[t]);
 			break;
 		case NR_REG_ONE_NIC:
-			if (i >= na->num_tx_rings && i >= na->num_rx_rings) {
-				D("invalid ring id %d", i);
+			if (nr_ringid >= na->num_tx_rings &&
+					nr_ringid >= na->num_rx_rings) {
+				D("invalid ring id %d", nr_ringid);
 				return EINVAL;
 			}
 			/* if not enough rings, use the first one */
-			j = i;
+			j = nr_ringid;
 			if (j >= nma_get_nrings(na, t))
 				j = 0;
 			priv->np_qfirst[t] = j;
@@ -1834,11 +1848,11 @@ netmap_interp_ringid(struct netmap_priv_d *priv, uint1
 				priv->np_qfirst[t], priv->np_qlast[t]);
 			break;
 		default:
-			D("invalid regif type %d", reg);
+			D("invalid regif type %d", nr_mode);
 			return EINVAL;
 		}
 	}
-	priv->np_flags = (flags & ~NR_REG_MASK) | reg;
+	priv->np_flags = nr_flags | nr_mode; // TODO
 
 	/* Allow transparent forwarding mode in the host --> nic
 	 * direction only if all the TX hw rings have been opened. */
@@ -1854,7 +1868,7 @@ netmap_interp_ringid(struct netmap_priv_d *priv, uint1
 			priv->np_qlast[NR_TX],
 			priv->np_qfirst[NR_RX],
 			priv->np_qlast[NR_RX],
-			i);
+			nr_ringid);
 	}
 	return 0;
 }
@@ -1865,18 +1879,19 @@ netmap_interp_ringid(struct netmap_priv_d *priv, uint1
  * for all rings is the same as a single ring.
  */
 static int
-netmap_set_ringid(struct netmap_priv_d *priv, uint16_t ringid, uint32_t flags)
+netmap_set_ringid(struct netmap_priv_d *priv, uint32_t nr_mode,
+		uint16_t nr_ringid, uint64_t nr_flags)
 {
 	struct netmap_adapter *na = priv->np_na;
 	int error;
 	enum txrx t;
 
-	error = netmap_interp_ringid(priv, ringid, flags);
+	error = netmap_interp_ringid(priv, nr_mode, nr_ringid, nr_flags);
 	if (error) {
 		return error;
 	}
 
-	priv->np_txpoll = (ringid & NETMAP_NO_TX_POLL) ? 0 : 1;
+	priv->np_txpoll = (nr_flags & NR_NO_TX_POLL) ? 0 : 1;
 
 	/* optimization: count the users registered for more than
 	 * one ring, which are the ones sleeping on the global queue.
@@ -1933,7 +1948,7 @@ netmap_krings_get(struct netmap_priv_d *priv)
 	 */
 	for_rx_tx(t) {
 		for (i = priv->np_qfirst[t]; i < priv->np_qlast[t]; i++) {
-			kring = &NMR(na, t)[i];
+			kring = NMR(na, t)[i];
 			if ((kring->nr_kflags & NKR_EXCLUSIVE) ||
 			    (kring->users && excl))
 			{
@@ -1948,7 +1963,7 @@ netmap_krings_get(struct netmap_priv_d *priv)
 	 */
 	for_rx_tx(t) {
 		for (i = priv->np_qfirst[t]; i < priv->np_qlast[t]; i++) {
-			kring = &NMR(na, t)[i];
+			kring = NMR(na, t)[i];
 			kring->users++;
 			if (excl)
 				kring->nr_kflags |= NKR_EXCLUSIVE;
@@ -1979,10 +1994,9 @@ netmap_krings_put(struct netmap_priv_d *priv)
 			priv->np_qfirst[NR_RX],
 			priv->np_qlast[MR_RX]);
 
-
 	for_rx_tx(t) {
 		for (i = priv->np_qfirst[t]; i < priv->np_qlast[t]; i++) {
-			kring = &NMR(na, t)[i];
+			kring = NMR(na, t)[i];
 			if (excl)
 				kring->nr_kflags &= ~NKR_EXCLUSIVE;
 			kring->users--;
@@ -1992,6 +2006,12 @@ netmap_krings_put(struct netmap_priv_d *priv)
 	}
 }
 
+static int
+nm_priv_rx_enabled(struct netmap_priv_d *priv)
+{
+	return (priv->np_qfirst[NR_RX] != priv->np_qlast[NR_RX]);
+}
+
 /*
  * possibly move the interface to netmap-mode.
  * If success it returns a pointer to netmap_if, otherwise NULL.
@@ -2064,16 +2084,14 @@ netmap_krings_put(struct netmap_priv_d *priv) */ int netmap_do_regif(struct netmap_priv_d *priv, struct netmap_adapter *na, - uint16_t ringid, uint32_t flags) + uint32_t nr_mode, uint16_t nr_ringid, uint64_t nr_flags) { struct netmap_if *nifp = NULL; int error; NMG_LOCK_ASSERT(); - /* ring configuration may have changed, fetch from the card */ - netmap_update_config(na); priv->np_na = na; /* store the reference */ - error = netmap_set_ringid(priv, ringid, flags); + error = netmap_set_ringid(priv, nr_mode, nr_ringid, nr_flags); if (error) goto err; error = netmap_mem_finalize(na->nm_mem, na); @@ -2081,27 +2099,38 @@ netmap_do_regif(struct netmap_priv_d *priv, struct net goto err; if (na->active_fds == 0) { + + /* cache the allocator info in the na */ + error = netmap_mem_get_lut(na->nm_mem, &na->na_lut); + if (error) + goto err_drop_mem; + ND("lut %p bufs %u size %u", na->na_lut.lut, na->na_lut.objtotal, + na->na_lut.objsize); + + /* ring configuration may have changed, fetch from the card */ + netmap_update_config(na); + /* * If this is the first registration of the adapter, * perform sanity checks and create the in-kernel view * of the netmap rings (the netmap krings). */ - if (na->ifp) { + if (na->ifp && nm_priv_rx_enabled(priv)) { /* This netmap adapter is attached to an ifnet. */ unsigned nbs = netmap_mem_bufsize(na->nm_mem); unsigned mtu = nm_os_ifnet_mtu(na->ifp); - /* The maximum amount of bytes that a single - * receive or transmit NIC descriptor can hold. */ - unsigned hw_max_slot_len = 4096; - if (mtu <= hw_max_slot_len) { + ND("mtu %d rx_buf_maxsize %d netmap_buf_size %d", + mtu, na->rx_buf_maxsize, nbs); + + if (mtu <= na->rx_buf_maxsize) { /* The MTU fits a single NIC slot. We only * Need to check that netmap buffers are * large enough to hold an MTU. NS_MOREFRAG * cannot be used in this case. 
 				 */
 				if (nbs < mtu) {
 					nm_prerr("error: netmap buf size (%u) "
-						 "< device MTU (%u)", nbs, mtu);
+						 "< device MTU (%u)\n", nbs, mtu);
 					error = EINVAL;
 					goto err_drop_mem;
 				}
@@ -2114,22 +2143,22 @@ netmap_do_regif(struct netmap_priv_d *priv, struct net
 				if (!(na->na_flags & NAF_MOREFRAG)) {
 					nm_prerr("error: large MTU (%d) needed "
 						 "but %s does not support "
-						 "NS_MOREFRAG", mtu,
+						 "NS_MOREFRAG\n", mtu,
 						 na->ifp->if_xname);
 					error = EINVAL;
 					goto err_drop_mem;
-				} else if (nbs < hw_max_slot_len) {
+				} else if (nbs < na->rx_buf_maxsize) {
 					nm_prerr("error: using NS_MOREFRAG on "
 						 "%s requires netmap buf size "
-						 ">= %u", na->ifp->if_xname,
-						 hw_max_slot_len);
+						 ">= %u\n", na->ifp->if_xname,
+						 na->rx_buf_maxsize);
 					error = EINVAL;
 					goto err_drop_mem;
 				} else {
 					nm_prinf("info: netmap application on "
 						 "%s needs to support "
 						 "NS_MOREFRAG "
-						 "(MTU=%u,netmap_buf_size=%u)",
+						 "(MTU=%u,netmap_buf_size=%u)\n",
 						 na->ifp->if_xname, mtu, nbs);
 				}
 			}
@@ -2141,7 +2170,7 @@ netmap_do_regif(struct netmap_priv_d *priv, struct net
 		 */
 		error = na->nm_krings_create(na);
 		if (error)
-			goto err_drop_mem;
+			goto err_put_lut;

 	}

@@ -2165,21 +2194,12 @@ netmap_do_regif(struct netmap_priv_d *priv, struct net
 		goto err_del_rings;
 	}

-	if (na->active_fds == 0) {
-		/* cache the allocator info in the na */
-		error = netmap_mem_get_lut(na->nm_mem, &na->na_lut);
-		if (error)
-			goto err_del_if;
-		ND("lut %p bufs %u size %u", na->na_lut.lut, na->na_lut.objtotal,
-					    na->na_lut.objsize);
-	}
-
 	if (nm_kring_pending(priv)) {
 		/* Some kring is switching mode, tell the adapter to
 		 * react on this. */
 		error = na->nm_register(na, 1);
 		if (error)
-			goto err_put_lut;
+			goto err_del_if;
 	}

 	/* Commit the reference. */
@@ -2195,9 +2215,6 @@ netmap_do_regif(struct netmap_priv_d *priv, struct net

 	return 0;

-err_put_lut:
-	if (na->active_fds == 0)
-		memset(&na->na_lut, 0, sizeof(na->na_lut));
 err_del_if:
 	netmap_mem_if_delete(na, nifp);
 err_del_rings:
@@ -2207,6 +2224,9 @@ err_rel_excl:
 err_del_krings:
 	if (na->active_fds == 0)
 		na->nm_krings_delete(na);
+err_put_lut:
+	if (na->active_fds == 0)
+		memset(&na->na_lut, 0, sizeof(na->na_lut));
 err_drop_mem:
 	netmap_mem_drop(na);
 err:
@@ -2242,246 +2262,367 @@ ring_timestamp_set(struct netmap_ring *ring)
 	}
 }

+static int nmreq_copyin(struct nmreq_header *, int);
+static int nmreq_copyout(struct nmreq_header *, int);
+static int nmreq_checkoptions(struct nmreq_header *);

 /*
  * ioctl(2) support for the "netmap" device.
  *
  * Following a list of accepted commands:
- * - NIOCGINFO
+ * - NIOCCTRL		device control API
+ * - NIOCTXSYNC	sync TX rings
+ * - NIOCRXSYNC	sync RX rings
  * - SIOCGIFADDR	just for convenience
- * - NIOCREGIF
- * - NIOCTXSYNC
- * - NIOCRXSYNC
+ * - NIOCGINFO		deprecated (legacy API)
+ * - NIOCREGIF		deprecated (legacy API)
  *
  * Return 0 on success, errno otherwise.
  */
 int
-netmap_ioctl(struct netmap_priv_d *priv, u_long cmd, caddr_t data, struct thread *td)
+netmap_ioctl(struct netmap_priv_d *priv, u_long cmd, caddr_t data,
+		struct thread *td, int nr_body_is_user)
 {
 	struct mbq q;	/* packets from RX hw queues to host stack */
-	struct nmreq *nmr = (struct nmreq *) data;
 	struct netmap_adapter *na = NULL;
 	struct netmap_mem_d *nmd = NULL;
 	struct ifnet *ifp = NULL;
 	int error = 0;
 	u_int i, qfirst, qlast;
 	struct netmap_if *nifp;
-	struct netmap_kring *krings;
+	struct netmap_kring **krings;
 	int sync_flags;
 	enum txrx t;

-	if (cmd == NIOCGINFO || cmd == NIOCREGIF) {
-		/* truncate name */
-		nmr->nr_name[sizeof(nmr->nr_name) - 1] = '\0';
-		if (nmr->nr_version != NETMAP_API) {
-			D("API mismatch for %s got %d need %d",
-				nmr->nr_name,
-				nmr->nr_version, NETMAP_API);
-			nmr->nr_version = NETMAP_API;
+	switch (cmd) {
+	case NIOCCTRL: {
+		struct nmreq_header *hdr = (struct nmreq_header *)data;
+
+		if (hdr->nr_version != NETMAP_API) {
+			D("API mismatch for reqtype %d: got %d need %d",
+				hdr->nr_version,
+				hdr->nr_version, NETMAP_API);
+			hdr->nr_version = NETMAP_API;
 		}
-		if (nmr->nr_version < NETMAP_MIN_API ||
-		    nmr->nr_version > NETMAP_MAX_API) {
+		if (hdr->nr_version < NETMAP_MIN_API ||
+		    hdr->nr_version > NETMAP_MAX_API) {
 			return EINVAL;
 		}
-	}
-	switch (cmd) {
-	case NIOCGINFO:		/* return capabilities etc */
-		if (nmr->nr_cmd == NETMAP_BDG_LIST) {
-			error = netmap_bdg_ctl(nmr, NULL);
-			break;
+		/* Make a kernel-space copy of the user-space nr_body.
+		 * For convenince, the nr_body pointer and the pointers
+		 * in the options list will be replaced with their
+		 * kernel-space counterparts. The original pointers are
+		 * saved internally and later restored by nmreq_copyout
+		 */
+		error = nmreq_copyin(hdr, nr_body_is_user);
+		if (error) {
+			return error;
 		}
-		NMG_LOCK();
-		do {
-			/* memsize is always valid */
-			u_int memflags;
-			uint64_t memsize;
+		/* Sanitize hdr->nr_name. */
+		hdr->nr_name[sizeof(hdr->nr_name) - 1] = '\0';
-			if (nmr->nr_name[0] != '\0') {
+		switch (hdr->nr_reqtype) {
+		case NETMAP_REQ_REGISTER: {
+			struct nmreq_register *req =
+				(struct nmreq_register *)hdr->nr_body;
+			/* Protect access to priv from concurrent requests. */
+			NMG_LOCK();
+			do {
+				u_int memflags;
+#ifdef WITH_EXTMEM
+				struct nmreq_option *opt;
+#endif /* WITH_EXTMEM */
-				/* get a refcount */
-				error = netmap_get_na(nmr, &na, &ifp, NULL, 1 /* create */);
+				if (priv->np_nifp != NULL) {	/* thread already registered */
+					error = EBUSY;
+					break;
+				}
+
+#ifdef WITH_EXTMEM
+				opt = nmreq_findoption((struct nmreq_option *)hdr->nr_options,
+						NETMAP_REQ_OPT_EXTMEM);
+				if (opt != NULL) {
+					struct nmreq_opt_extmem *e =
+						(struct nmreq_opt_extmem *)opt;
+
+					error = nmreq_checkduplicate(opt);
+					if (error) {
+						opt->nro_status = error;
+						break;
+					}
+					nmd = netmap_mem_ext_create(e->nro_usrptr,
+							&e->nro_info, &error);
+					opt->nro_status = error;
+					if (nmd == NULL)
+						break;
+				}
+#endif /* WITH_EXTMEM */
+
+				if (nmd == NULL && req->nr_mem_id) {
+					/* find the allocator and get a reference */
+					nmd = netmap_mem_find(req->nr_mem_id);
+					if (nmd == NULL) {
+						error = EINVAL;
+						break;
+					}
+				}
+				/* find the interface and a reference */
+				error = netmap_get_na(hdr, &na, &ifp, nmd,
+						      1 /* create */); /* keep reference */

*** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
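
The first changelist item (tx_rings and rx_rings becoming arrays of pointers to kring structs) is what drives the repeated "&NMR(na, t)[i]" to "NMR(na, t)[i]" substitutions in the hunks above. The following is only a minimal sketch of that pattern in isolation, not the netmap code: every demo_* name is hypothetical and stands in for the much larger netmap structures. It assumes a half-open ring range [qfirst, qlast) per descriptor, a two-pass acquire (check, then commit), and an "enabled" test that is simply a non-empty range, mirroring nm_priv_rx_enabled().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* All demo_* names are hypothetical simplifications, not netmap API. */
#define DEMO_NKR_EXCLUSIVE 0x1u

struct demo_kring {
	unsigned users;		/* descriptors currently bound to this ring */
	unsigned flags;		/* DEMO_NKR_EXCLUSIVE when held exclusively */
};

struct demo_adapter {
	unsigned num_rings;
	struct demo_kring **rings;	/* array of pointers, not of structs */
};

struct demo_priv {
	unsigned qfirst;	/* first ring bound to this descriptor */
	unsigned qlast;		/* one past the last bound ring */
};

/* Release every ring and then the pointer array itself. */
void
demo_rings_delete(struct demo_adapter *na)
{
	unsigned i;

	for (i = 0; i < na->num_rings; i++)
		free(na->rings[i]);	/* free(NULL) is a no-op */
	free(na->rings);
	na->rings = NULL;
	na->num_rings = 0;
}

/* Allocate n rings; each slot of the array points to its own kring. */
int
demo_rings_create(struct demo_adapter *na, unsigned n)
{
	unsigned i;

	if (n == 0)
		return EINVAL;
	na->rings = calloc(n, sizeof(*na->rings));
	if (na->rings == NULL)
		return ENOMEM;
	na->num_rings = n;
	for (i = 0; i < n; i++) {
		na->rings[i] = calloc(1, sizeof(**na->rings));
		if (na->rings[i] == NULL) {
			demo_rings_delete(na);
			return ENOMEM;
		}
	}
	return 0;
}

/* A descriptor has rings bound only if its range is non-empty. */
int
demo_rx_enabled(const struct demo_priv *priv)
{
	return (priv->qfirst != priv->qlast);
}

/* First pass checks for conflicts, second pass commits, so a failure
 * leaves no ring partially acquired. */
int
demo_krings_get(struct demo_adapter *na, const struct demo_priv *priv, int excl)
{
	struct demo_kring *kring;
	unsigned i;

	for (i = priv->qfirst; i < priv->qlast; i++) {
		kring = na->rings[i];	/* with an array of structs: &rings[i] */
		if ((kring->flags & DEMO_NKR_EXCLUSIVE) ||
		    (kring->users && excl))
			return EBUSY;
	}
	for (i = priv->qfirst; i < priv->qlast; i++) {
		kring = na->rings[i];
		kring->users++;
		if (excl)
			kring->flags |= DEMO_NKR_EXCLUSIVE;
	}
	return 0;
}

/* Undo a successful demo_krings_get() made with the same excl value. */
void
demo_krings_put(struct demo_adapter *na, const struct demo_priv *priv, int excl)
{
	struct demo_kring *kring;
	unsigned i;

	for (i = priv->qfirst; i < priv->qlast; i++) {
		kring = na->rings[i];
		assert(kring->users > 0);
		if (excl)
			kring->flags &= ~DEMO_NKR_EXCLUSIVE;
		kring->users--;
	}
}
```

A caller would create the rings once, call demo_krings_get() with a range such as {1, 3}, and later pass the same range and excl value to demo_krings_put(). With an array of pointers a slot can in principle refer to a kring allocated elsewhere; how netmap itself uses that is outside this sketch.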