From: Luigi Rizzo
Date: Fri, 10 Jul 2015 05:51:36 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r285349 - in head/sys: dev/cxgbe dev/e1000 dev/ixgbe dev/netmap dev/re net
Message-Id: <201507100551.t6A5paZH050451@repo.freebsd.org>

Author: luigi
Date: Fri Jul 10 05:51:36 2015
New Revision: 285349
URL: https://svnweb.freebsd.org/changeset/base/285349

Log:
  Sync netmap sources with the version in our private tree.
  This commit contains large contributions from Giuseppe Lettieri and
  Stefano Garzarella, is partly supported by grants from Verisign and Cisco,
  and brings in the following:

  - fix zerocopy monitor ports and introduce copying monitor ports
    (the latter are lower performance but give access to all traffic
    in parallel with the application)

  - exclusive open mode, useful to implement solutions that recover
    from crashes of the main netmap client (suggested by Patrick Kelsey)

  - revised memory allocator in preparation for the 'passthrough mode'
    (ptnetmap) recently presented at BSDCan. ptnetmap is described in
    S. Garzarella, G. Lettieri, L. Rizzo;
    Virtual device passthrough for high speed VM networking,
    ACM/IEEE ANCS 2015, Oakland (CA) May 2015
    http://info.iet.unipi.it/~luigi/research.html

  - fix rx CRC handling on ixl

  - add module dependencies for netmap when building drivers as modules

  - minor simplifications to device-specific routines (*txsync, *rxsync)

  - general code cleanup (remove unused variables, introduce macros
    to access rings and remove duplicate code)

  Applications do not need to be recompiled, unless of course
  they want to use the new features (monitors and exclusive open).

  Those willing to try this code on stable/10 can just update the
  sys/dev/netmap/*, sys/net/netmap* with the version in HEAD
  and apply the small patches to individual device drivers.
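The "minor simplifications to device-specific routines" item can be pictured with a small userspace model. This is only a sketch under stated assumptions: the driver hunks in this commit drop the per-driver prologue/finalize calls and read kring->rhead instead, and the model assumes a generic caller now performs those two steps around the device routine. All toy_* names are hypothetical and are not part of the netmap API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the ring shared with userspace. */
struct toy_ring {
	uint32_t head, cur, tail;
};

/* Hypothetical stand-in for the kernel-side ring state. */
struct toy_kring {
	uint32_t num_slots;
	uint32_t rhead, rcur, rtail;	/* validated copies of head/cur/tail */
	uint32_t hwcur, hwtail;
	struct toy_ring *ring;
	int (*sync)(struct toy_kring *);	/* device-specific routine */
};

/* Generic prologue: validate the user-visible indexes and snapshot them. */
static int
toy_prologue(struct toy_kring *k)
{
	if (k->ring->head >= k->num_slots || k->ring->cur >= k->num_slots)
		return (-1);
	k->rhead = k->ring->head;
	k->rcur = k->ring->cur;
	return (0);
}

/* Generic epilogue: publish the new tail to userspace. */
static void
toy_finalize(struct toy_kring *k)
{
	k->rtail = k->hwtail;
	k->ring->tail = k->hwtail;
}

/* Device routine: only consumes the snapshot in rhead, no prologue/finalize. */
static int
toy_dev_rxsync(struct toy_kring *k)
{
	uint32_t const head = k->rhead;

	k->hwcur = head;	/* slots before head were released by the user */
	return (0);
}

/* Assumed generic caller: prologue, device routine, then finalize. */
static int
toy_sync(struct toy_kring *k)
{
	assert(k != NULL && k->ring != NULL && k->sync != NULL);
	if (toy_prologue(k) != 0)
		return (-1);
	if (k->sync(k) != 0)
		return (-1);
	toy_finalize(k);
	return (0);
}
```

In this model a driver routine such as toy_dev_rxsync() never touches the shared ring directly, which is the property the simplified *rxsync routines in the diff appear to rely on.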
  MFC after:	1 month
  Sponsored by:	(partly) Verisign, Cisco

Modified:
  head/sys/dev/cxgbe/t4_main.c
  head/sys/dev/cxgbe/t4_netmap.c
  head/sys/dev/e1000/if_em.c
  head/sys/dev/e1000/if_igb.c
  head/sys/dev/e1000/if_lem.c
  head/sys/dev/ixgbe/if_ix.c
  head/sys/dev/netmap/if_em_netmap.h
  head/sys/dev/netmap/if_igb_netmap.h
  head/sys/dev/netmap/if_ixl_netmap.h
  head/sys/dev/netmap/if_lem_netmap.h
  head/sys/dev/netmap/if_re_netmap.h
  head/sys/dev/netmap/if_vtnet_netmap.h
  head/sys/dev/netmap/ixgbe_netmap.h
  head/sys/dev/netmap/netmap.c
  head/sys/dev/netmap/netmap_freebsd.c
  head/sys/dev/netmap/netmap_generic.c
  head/sys/dev/netmap/netmap_kern.h
  head/sys/dev/netmap/netmap_mem2.c
  head/sys/dev/netmap/netmap_mem2.h
  head/sys/dev/netmap/netmap_monitor.c
  head/sys/dev/netmap/netmap_pipe.c
  head/sys/dev/netmap/netmap_vale.c
  head/sys/dev/re/if_re.c
  head/sys/net/netmap.h
  head/sys/net/netmap_user.h

Modified: head/sys/dev/cxgbe/t4_main.c
==============================================================================
--- head/sys/dev/cxgbe/t4_main.c	Fri Jul 10 05:07:18 2015	(r285348)
+++ head/sys/dev/cxgbe/t4_main.c	Fri Jul 10 05:51:36 2015	(r285349)
@@ -8533,10 +8533,17 @@ static devclass_t cxgbe_devclass, cxl_de
 DRIVER_MODULE(t4nex, pci, t4_driver, t4_devclass, mod_event, 0);
 MODULE_VERSION(t4nex, 1);
 MODULE_DEPEND(t4nex, firmware, 1, 1, 1);
+#ifdef DEV_NETMAP
+MODULE_DEPEND(t4nex, netmap, 1, 1, 1);
+#endif /* DEV_NETMAP */
+
 
 DRIVER_MODULE(t5nex, pci, t5_driver, t5_devclass, mod_event, 0);
 MODULE_VERSION(t5nex, 1);
 MODULE_DEPEND(t5nex, firmware, 1, 1, 1);
+#ifdef DEV_NETMAP
+MODULE_DEPEND(t5nex, netmap, 1, 1, 1);
+#endif /* DEV_NETMAP */
 
 DRIVER_MODULE(cxgbe, t4nex, cxgbe_driver, cxgbe_devclass, 0, 0);
 MODULE_VERSION(cxgbe, 1);

Modified: head/sys/dev/cxgbe/t4_netmap.c
==============================================================================
--- head/sys/dev/cxgbe/t4_netmap.c	Fri Jul 10 05:07:18 2015	(r285348)
+++ head/sys/dev/cxgbe/t4_netmap.c	Fri Jul 10 05:51:36 2015	(r285349)
@@ -917,8 +917,6 @@ cxgbe_netmap_txsync(struct netmap_kring
 			kring->nr_hwtail -= kring->nkr_num_slots;
 	}
 
-	nm_txsync_finalize(kring);
-
 	return (0);
 }
 
@@ -931,7 +929,7 @@ cxgbe_netmap_rxsync(struct netmap_kring
 	struct port_info *pi = ifp->if_softc;
 	struct adapter *sc = pi->adapter;
 	struct sge_nm_rxq *nm_rxq = &sc->sge.nm_rxq[pi->first_nm_rxq + kring->ring_id];
-	u_int const head = nm_rxsync_prologue(kring);
+	u_int const head = kring->rhead;
 	u_int n;
 	int force_update = (flags & NAF_FORCE_READ) || kring->nr_kflags & NKR_PENDINTR;
 
@@ -993,8 +991,6 @@ cxgbe_netmap_rxsync(struct netmap_kring
 		}
 	}
 
-	nm_rxsync_finalize(kring);
-
 	return (0);
 }
 

Modified: head/sys/dev/e1000/if_em.c
==============================================================================
--- head/sys/dev/e1000/if_em.c	Fri Jul 10 05:07:18 2015	(r285348)
+++ head/sys/dev/e1000/if_em.c	Fri Jul 10 05:51:36 2015	(r285349)
@@ -344,6 +344,9 @@ devclass_t em_devclass;
 DRIVER_MODULE(em, pci, em_driver, em_devclass, 0, 0);
 MODULE_DEPEND(em, pci, 1, 1, 1);
 MODULE_DEPEND(em, ether, 1, 1, 1);
+#ifdef DEV_NETMAP
+MODULE_DEPEND(em, netmap, 1, 1, 1);
+#endif /* DEV_NETMAP */
 
 /*********************************************************************
  *  Tunable default values.

Modified: head/sys/dev/e1000/if_igb.c
==============================================================================
--- head/sys/dev/e1000/if_igb.c	Fri Jul 10 05:07:18 2015	(r285348)
+++ head/sys/dev/e1000/if_igb.c	Fri Jul 10 05:51:36 2015	(r285349)
@@ -322,6 +322,9 @@ static devclass_t igb_devclass;
 DRIVER_MODULE(igb, pci, igb_driver, igb_devclass, 0, 0);
 MODULE_DEPEND(igb, pci, 1, 1, 1);
 MODULE_DEPEND(igb, ether, 1, 1, 1);
+#ifdef DEV_NETMAP
+MODULE_DEPEND(igb, netmap, 1, 1, 1);
+#endif /* DEV_NETMAP */
 
 /*********************************************************************
  *  Tunable default values.

Modified: head/sys/dev/e1000/if_lem.c
==============================================================================
--- head/sys/dev/e1000/if_lem.c	Fri Jul 10 05:07:18 2015	(r285348)
+++ head/sys/dev/e1000/if_lem.c	Fri Jul 10 05:51:36 2015	(r285349)
@@ -286,6 +286,9 @@ extern devclass_t em_devclass;
 DRIVER_MODULE(lem, pci, lem_driver, em_devclass, 0, 0);
 MODULE_DEPEND(lem, pci, 1, 1, 1);
 MODULE_DEPEND(lem, ether, 1, 1, 1);
+#ifdef DEV_NETMAP
+MODULE_DEPEND(lem, netmap, 1, 1, 1);
+#endif /* DEV_NETMAP */
 
 /*********************************************************************
  *  Tunable default values.

Modified: head/sys/dev/ixgbe/if_ix.c
==============================================================================
--- head/sys/dev/ixgbe/if_ix.c	Fri Jul 10 05:07:18 2015	(r285348)
+++ head/sys/dev/ixgbe/if_ix.c	Fri Jul 10 05:51:36 2015	(r285349)
@@ -246,6 +246,9 @@ DRIVER_MODULE(ix, pci, ix_driver, ix_dev
 
 MODULE_DEPEND(ix, pci, 1, 1, 1);
 MODULE_DEPEND(ix, ether, 1, 1, 1);
+#ifdef DEV_NETMAP
+MODULE_DEPEND(ix, netmap, 1, 1, 1);
+#endif /* DEV_NETMAP */
 
 /*
 ** TUNEABLE PARAMETERS:

Modified: head/sys/dev/netmap/if_em_netmap.h
==============================================================================
--- head/sys/dev/netmap/if_em_netmap.h	Fri Jul 10 05:07:18 2015	(r285348)
+++ head/sys/dev/netmap/if_em_netmap.h	Fri Jul 10 05:51:36 2015	(r285349)
@@ -198,8 +198,6 @@ em_netmap_txsync(struct netmap_kring *kr
 		}
 	}
 
-	nm_txsync_finalize(kring);
-
 	return 0;
 }
 
@@ -217,7 +215,7 @@ em_netmap_rxsync(struct netmap_kring *kr
 	u_int nic_i;	/* index into the NIC ring */
 	u_int n;
 	u_int const lim = kring->nkr_num_slots - 1;
-	u_int const head = nm_rxsync_prologue(kring);
+	u_int const head = kring->rhead;
 	int force_update = (flags & NAF_FORCE_READ) || kring->nr_kflags & NKR_PENDINTR;
 
 	/* device-specific */
@@ -303,9 +301,6 @@ em_netmap_rxsync(struct netmap_kring *kr
 		E1000_WRITE_REG(&adapter->hw, E1000_RDT(rxr->me), nic_i);
 	}
 
-	/* tell userspace that there might be new packets */
-	nm_rxsync_finalize(kring);
-
 	return 0;
 
 ring_reset:

Modified: head/sys/dev/netmap/if_igb_netmap.h
==============================================================================
--- head/sys/dev/netmap/if_igb_netmap.h	Fri Jul 10 05:07:18 2015	(r285348)
+++ head/sys/dev/netmap/if_igb_netmap.h	Fri Jul 10 05:51:36 2015	(r285349)
@@ -180,8 +180,6 @@ igb_netmap_txsync(struct netmap_kring *k
 		kring->nr_hwtail = nm_prev(netmap_idx_n2k(kring, nic_i), lim);
 	}
 
-	nm_txsync_finalize(kring);
-
 	return 0;
 }
 
@@ -199,7 +197,7 @@ igb_netmap_rxsync(struct netmap_kring *k
 	u_int nic_i;	/* index into the NIC ring */
 	u_int n;
 	u_int const lim = kring->nkr_num_slots - 1;
-	u_int const head = nm_rxsync_prologue(kring);
+	u_int const head = kring->rhead;
 	int force_update = (flags & NAF_FORCE_READ) || kring->nr_kflags & NKR_PENDINTR;
 
 	/* device-specific */
@@ -283,9 +281,6 @@ igb_netmap_rxsync(struct netmap_kring *k
 		E1000_WRITE_REG(&adapter->hw, E1000_RDT(rxr->me), nic_i);
 	}
 
-	/* tell userspace that there might be new packets */
-	nm_rxsync_finalize(kring);
-
 	return 0;
 
 ring_reset:

Modified: head/sys/dev/netmap/if_ixl_netmap.h
==============================================================================
--- head/sys/dev/netmap/if_ixl_netmap.h	Fri Jul 10 05:07:18 2015	(r285348)
+++ head/sys/dev/netmap/if_ixl_netmap.h	Fri Jul 10 05:51:36 2015	(r285349)
@@ -68,9 +68,14 @@ extern int ixl_rx_miss, ixl_rx_miss_bufs
  * count packets that might be missed due to lost interrupts.
  */
 SYSCTL_DECL(_dev_netmap);
-int ixl_rx_miss, ixl_rx_miss_bufs, ixl_crcstrip;
+/*
+ * The xl driver by default strips CRCs and we do not override it.
+ */
+int ixl_rx_miss, ixl_rx_miss_bufs, ixl_crcstrip = 1;
+#if 0
 SYSCTL_INT(_dev_netmap, OID_AUTO, ixl_crcstrip,
-    CTLFLAG_RW, &ixl_crcstrip, 0, "strip CRC on rx frames");
+    CTLFLAG_RW, &ixl_crcstrip, 1, "strip CRC on rx frames");
+#endif
 SYSCTL_INT(_dev_netmap, OID_AUTO, ixl_rx_miss,
     CTLFLAG_RW, &ixl_rx_miss, 0, "potentially missed rx intr");
 SYSCTL_INT(_dev_netmap, OID_AUTO, ixl_rx_miss_bufs,
@@ -268,8 +273,6 @@ ixl_netmap_txsync(struct netmap_kring *k
 		kring->nr_hwtail = nm_prev(netmap_idx_n2k(kring, nic_i), lim);
 	}
 
-	nm_txsync_finalize(kring);
-
 	return 0;
 }
 
@@ -297,7 +300,7 @@ ixl_netmap_rxsync(struct netmap_kring *k
 	u_int nic_i;	/* index into the NIC ring */
 	u_int n;
 	u_int const lim = kring->nkr_num_slots - 1;
-	u_int const head = nm_rxsync_prologue(kring);
+	u_int const head = kring->rhead;
 	int force_update = (flags & NAF_FORCE_READ) || kring->nr_kflags & NKR_PENDINTR;
 
 	/* device-specific */
@@ -408,9 +411,6 @@ ixl_netmap_rxsync(struct netmap_kring *k
 		wr32(vsi->hw, rxr->tail, nic_i);
 	}
 
-	/* tell userspace that there might be new packets */
-	nm_rxsync_finalize(kring);
-
 	return 0;
 
 ring_reset:

Modified: head/sys/dev/netmap/if_lem_netmap.h
==============================================================================
--- head/sys/dev/netmap/if_lem_netmap.h	Fri Jul 10 05:07:18 2015	(r285348)
+++ head/sys/dev/netmap/if_lem_netmap.h	Fri Jul 10 05:51:36 2015	(r285349)
@@ -302,8 +302,6 @@ lem_netmap_txsync(struct netmap_kring *k
 		kring->nr_hwtail = nm_prev(netmap_idx_n2k(kring, nic_i), lim);
 	}
 
-	nm_txsync_finalize(kring);
-
 	return 0;
 }
 
@@ -321,7 +319,7 @@ lem_netmap_rxsync(struct netmap_kring *k
 	u_int nic_i;	/* index into the NIC ring */
 	u_int n;
 	u_int const lim = kring->nkr_num_slots - 1;
-	u_int const head = nm_rxsync_prologue(kring);
+	u_int const head = kring->rhead;
 	int force_update = (flags & NAF_FORCE_READ) || kring->nr_kflags & NKR_PENDINTR;
 
 	/* device-specific */
@@ -466,9 +464,6 @@ lem_netmap_rxsync(struct netmap_kring *k
 		E1000_WRITE_REG(&adapter->hw, E1000_RDT(0), nic_i);
 	}
 
-	/* tell userspace that there might be new packets */
-	nm_rxsync_finalize(kring);
-
 	return 0;
 
 ring_reset:

Modified: head/sys/dev/netmap/if_re_netmap.h
==============================================================================
--- head/sys/dev/netmap/if_re_netmap.h	Fri Jul 10 05:07:18 2015	(r285348)
+++ head/sys/dev/netmap/if_re_netmap.h	Fri Jul 10 05:51:36 2015	(r285349)
@@ -159,8 +159,6 @@ re_netmap_txsync(struct netmap_kring *kr
 		}
 	}
 
-	nm_txsync_finalize(kring);
-
 	return 0;
 }
 
@@ -178,7 +176,7 @@ re_netmap_rxsync(struct netmap_kring *kr
 	u_int nic_i;	/* index into the NIC ring */
 	u_int n;
 	u_int const lim = kring->nkr_num_slots - 1;
-	u_int const head = nm_rxsync_prologue(kring);
+	u_int const head = kring->rhead;
 	int force_update = (flags & NAF_FORCE_READ) || kring->nr_kflags & NKR_PENDINTR;
 
 	/* device-specific */
@@ -273,9 +271,6 @@ re_netmap_rxsync(struct netmap_kring *kr
 		    BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
 	}
 
-	/* tell userspace that there might be new packets */
-	nm_rxsync_finalize(kring);
-
 	return 0;
 
 ring_reset:

Modified: head/sys/dev/netmap/if_vtnet_netmap.h
==============================================================================
--- head/sys/dev/netmap/if_vtnet_netmap.h	Fri Jul 10 05:07:18 2015	(r285348)
+++ head/sys/dev/netmap/if_vtnet_netmap.h	Fri Jul 10 05:51:36 2015	(r285349)
@@ -214,9 +214,6 @@ vtnet_netmap_txsync(struct netmap_kring
 		virtqueue_postpone_intr(vq, VQ_POSTPONE_SHORT);
 	}
 
-//out:
-	nm_txsync_finalize(kring);
-
 	return 0;
 }
 
@@ -278,7 +275,7 @@ vtnet_netmap_rxsync(struct netmap_kring
 //	u_int nic_i;	/* index into the NIC ring */
 	u_int n;
 	u_int const lim = kring->nkr_num_slots - 1;
-	u_int const head = nm_rxsync_prologue(kring);
+	u_int const head = kring->rhead;
 	int force_update = (flags & NAF_FORCE_READ) || kring->nr_kflags & NKR_PENDINTR;
 
 	/* device-specific */
@@ -340,9 +337,6 @@ vtnet_netmap_rxsync(struct netmap_kring
 		vtnet_rxq_enable_intr(rxq);
 	}
 
-	/* tell userspace that there might be new packets. */
-	nm_rxsync_finalize(kring);
-
 	ND("[C] h %d c %d t %d hwcur %d hwtail %d",
 		ring->head, ring->cur, ring->tail,
 		kring->nr_hwcur, kring->nr_hwtail);

Modified: head/sys/dev/netmap/ixgbe_netmap.h
==============================================================================
--- head/sys/dev/netmap/ixgbe_netmap.h	Fri Jul 10 05:07:18 2015	(r285348)
+++ head/sys/dev/netmap/ixgbe_netmap.h	Fri Jul 10 05:51:36 2015	(r285349)
@@ -322,8 +322,6 @@ ixgbe_netmap_txsync(struct netmap_kring
 		}
 	}
 
-	nm_txsync_finalize(kring);
-
 	return 0;
 }
 
@@ -351,7 +349,7 @@ ixgbe_netmap_rxsync(struct netmap_kring
 	u_int nic_i;	/* index into the NIC ring */
 	u_int n;
 	u_int const lim = kring->nkr_num_slots - 1;
-	u_int const head = nm_rxsync_prologue(kring);
+	u_int const head = kring->rhead;
 	int force_update = (flags & NAF_FORCE_READ) || kring->nr_kflags & NKR_PENDINTR;
 
 	/* device-specific */
@@ -458,9 +456,6 @@ ixgbe_netmap_rxsync(struct netmap_kring
 		IXGBE_WRITE_REG(&adapter->hw, IXGBE_RDT(rxr->me), nic_i);
 	}
 
-	/* tell userspace that there might be new packets */
-	nm_rxsync_finalize(kring);
-
 	return 0;
 
 ring_reset:

Modified: head/sys/dev/netmap/netmap.c
==============================================================================
--- head/sys/dev/netmap/netmap.c	Fri Jul 10 05:07:18 2015	(r285348)
+++ head/sys/dev/netmap/netmap.c	Fri Jul 10 05:51:36 2015	(r285349)
@@ -293,7 +293,7 @@ ports attached to the switch)
  *                kring->nm_sync() == DEVICE_netmap_rxsync()
  *           2) device interrupt handler
  *                na->nm_notify()  == netmap_notify()
- *    - tx from host stack
+ *    - rx from host stack
  *       concurrently:
  *           1) host stack
  *                netmap_transmit()
@@ -313,31 +313,113 @@ ports attached to the switch)
  *
  *              -= SYSTEM DEVICE WITH GENERIC SUPPORT =-
  *
+ *    na == NA(ifp) == generic_netmap_adapter created in generic_netmap_attach()
  *
- *
- *                           -= VALE PORT =-
- *
- *
- *
- *                           -= NETMAP PIPE =-
- *
- *
- *
- *  -= SYSTEM DEVICE WITH NATIVE SUPPORT, CONNECTED TO VALE, NO HOST RINGS =-
- *
- *
- *
- *  -= SYSTEM DEVICE WITH NATIVE SUPPORT, CONNECTED TO VALE, WITH HOST RINGS =-
- *
- *
- *
- *  -= SYSTEM DEVICE WITH GENERIC SUPPORT, CONNECTED TO VALE, NO HOST RINGS =-
- *
+ *    - tx from netmap userspace:
+ *       concurrently:
+ *           1) ioctl(NIOCTXSYNC)/netmap_poll() in process context
+ *               kring->nm_sync() == generic_netmap_txsync()
+ *                   linux:   dev_queue_xmit() with NM_MAGIC_PRIORITY_TX
+ *                       generic_ndo_start_xmit()
+ *                           orig. dev. start_xmit
+ *                   FreeBSD: na->if_transmit() == orig. dev if_transmit
+ *           2) generic_mbuf_destructor()
+ *                   na->nm_notify() == netmap_notify()
+ *    - rx from netmap userspace:
+ *           1) ioctl(NIOCRXSYNC)/netmap_poll() in process context
+ *               kring->nm_sync() == generic_netmap_rxsync()
+ *                   mbq_safe_dequeue()
+ *           2) device driver
+ *               generic_rx_handler()
+ *                   mbq_safe_enqueue()
+ *                   na->nm_notify() == netmap_notify()
+ *    - rx from host stack:
+ *        concurrently:
+ *           1) host stack
+ *               linux:   generic_ndo_start_xmit()
+ *                           netmap_transmit()
+ *               FreeBSD: ifp->if_input() == netmap_transmit
+ *               both:
+ *                       na->nm_notify() == netmap_notify()
+ *           2) ioctl(NIOCRXSYNC)/netmap_poll() in process context
+ *               kring->nm_sync() == netmap_rxsync_from_host_compat
+ *                   netmap_rxsync_from_host(na, NULL, NULL)
+ *    - tx to host stack:
+ *           ioctl(NIOCTXSYNC)/netmap_poll() in process context
+ *               kring->nm_sync() == netmap_txsync_to_host_compat
+ *                   netmap_txsync_to_host(na)
+ *                       NM_SEND_UP()
+ *                           FreeBSD: na->if_input() == ??? XXX
+ *                           linux: netif_rx() with NM_MAGIC_PRIORITY_RX
  *
  *
- *  -= SYSTEM DEVICE WITH GENERIC SUPPORT, CONNECTED TO VALE, WITH HOST RINGS =-
+ *                           -= VALE =-
  *
+ *   INCOMING:
  *
+ *      - VALE ports:
+ *          ioctl(NIOCTXSYNC)/netmap_poll() in process context
+ *              kring->nm_sync() == netmap_vp_txsync()
+ *
+ *      - system device with native support:
+ *         from cable:
+ *             interrupt
+ *                na->nm_notify() == netmap_bwrap_intr_notify(ring_nr != host ring)
+ *                     kring->nm_sync() == DEVICE_netmap_rxsync()
+ *                     netmap_vp_txsync()
+ *                     kring->nm_sync() == DEVICE_netmap_rxsync()
+ *         from host stack:
+ *             netmap_transmit()
+ *                na->nm_notify() == netmap_bwrap_intr_notify(ring_nr == host ring)
+ *                     kring->nm_sync() == netmap_rxsync_from_host_compat()
+ *                     netmap_vp_txsync()
+ *
+ *      - system device with generic support:
+ *         from device driver:
+ *            generic_rx_handler()
+ *                na->nm_notify() == netmap_bwrap_intr_notify(ring_nr != host ring)
+ *                     kring->nm_sync() == generic_netmap_rxsync()
+ *                     netmap_vp_txsync()
+ *                     kring->nm_sync() == generic_netmap_rxsync()
+ *         from host stack:
+ *            netmap_transmit()
+ *                na->nm_notify() == netmap_bwrap_intr_notify(ring_nr == host ring)
+ *                     kring->nm_sync() == netmap_rxsync_from_host_compat()
+ *                     netmap_vp_txsync()
+ *
+ *   (all cases) --> nm_bdg_flush()
+ *                      dest_na->nm_notify() == (see below)
+ *
+ *   OUTGOING:
+ *
+ *      - VALE ports:
+ *         concurrently:
+ *             1) ioctlNIOCRXSYNC)/netmap_poll() in process context
+ *                    kring->nm_sync() == netmap_vp_rxsync()
+ *             2) from nm_bdg_flush()
+ *                    na->nm_notify() == netmap_notify()
+ *
+ *      - system device with native support:
+ *          to cable:
+ *             na->nm_notify() == netmap_bwrap_notify()
+ *                 netmap_vp_rxsync()
+ *                 kring->nm_sync() == DEVICE_netmap_txsync()
+ *                 netmap_vp_rxsync()
+ *          to host stack:
+ *                 netmap_vp_rxsync()
+ *                 kring->nm_sync() == netmap_txsync_to_host_compat
+ *                 netmap_vp_rxsync_locked()
+ *
+ *      - system device with generic adapter:
+ *          to device driver:
+ *             na->nm_notify() == netmap_bwrap_notify()
+ *                 netmap_vp_rxsync()
+ *                 kring->nm_sync() == generic_netmap_txsync()
+ *                 netmap_vp_rxsync()
+ *          to host stack:
+ *                 netmap_vp_rxsync()
+ *                 kring->nm_sync() == netmap_txsync_to_host_compat
+ *                 netmap_vp_rxsync()
  *
  */
 
@@ -412,15 +494,6 @@ ports attached to the switch)
 
 MALLOC_DEFINE(M_NETMAP, "netmap", "Network memory map");
 
-/*
- * The following variables are used by the drivers and replicate
- * fields in the global memory pool. They only refer to buffers
- * used by physical interfaces.
- */
-u_int netmap_total_buffers;
-u_int netmap_buf_size;
-char *netmap_buffer_base;	/* also address of an invalid buffer */
-
 /* user-controlled variables */
 int netmap_verbose;
 
@@ -446,7 +519,6 @@ SYSCTL_INT(_dev_netmap, OID_AUTO, adapti
 
 int netmap_flags = 0;	/* debug flags */
 int netmap_fwd = 0;	/* force transparent mode */
-int netmap_mmap_unreg = 0; /* allow mmap of unregistered fds */
 
 /*
  * netmap_admode selects the netmap mode to use.
@@ -464,7 +536,6 @@ int netmap_generic_rings = 1;   /* numbe
 
 SYSCTL_INT(_dev_netmap, OID_AUTO, flags, CTLFLAG_RW, &netmap_flags, 0 , "");
 SYSCTL_INT(_dev_netmap, OID_AUTO, fwd, CTLFLAG_RW, &netmap_fwd, 0 , "");
-SYSCTL_INT(_dev_netmap, OID_AUTO, mmap_unreg, CTLFLAG_RW, &netmap_mmap_unreg, 0, "");
 SYSCTL_INT(_dev_netmap, OID_AUTO, admode, CTLFLAG_RW, &netmap_admode, 0 , "");
 SYSCTL_INT(_dev_netmap, OID_AUTO, generic_mit, CTLFLAG_RW, &netmap_generic_mit, 0 , "");
 SYSCTL_INT(_dev_netmap, OID_AUTO, generic_ringsize, CTLFLAG_RW, &netmap_generic_ringsize, 0 , "");
@@ -472,15 +543,6 @@ SYSCTL_INT(_dev_netmap, OID_AUTO, generi
 
 NMG_LOCK_T	netmap_global_lock;
 
-
-static void
-nm_kr_get(struct netmap_kring *kr)
-{
-	while (NM_ATOMIC_TEST_AND_SET(&kr->nr_busy))
-		tsleep(kr, 0, "NM_KR_GET", 4);
-}
-
-
 /*
  * mark the ring as stopped, and run through the locks
  * to make sure other users get to see it.
@@ -495,34 +557,14 @@ netmap_disable_ring(struct netmap_kring
 	nm_kr_put(kr);
 }
 
-/* stop or enable a single tx ring */
-void
-netmap_set_txring(struct netmap_adapter *na, u_int ring_id, int stopped)
-{
-	if (stopped)
-		netmap_disable_ring(na->tx_rings + ring_id);
-	else
-		na->tx_rings[ring_id].nkr_stopped = 0;
-	/* nofify that the stopped state has changed. This is currently
-	 *only used by bwrap to propagate the state to its own krings.
-	 * (see netmap_bwrap_intr_notify).
-	 */
-	na->nm_notify(na, ring_id, NR_TX, NAF_DISABLE_NOTIFY);
-}
-
-/* stop or enable a single rx ring */
+/* stop or enable a single ring */
 void
-netmap_set_rxring(struct netmap_adapter *na, u_int ring_id, int stopped)
+netmap_set_ring(struct netmap_adapter *na, u_int ring_id, enum txrx t, int stopped)
 {
 	if (stopped)
-		netmap_disable_ring(na->rx_rings + ring_id);
+		netmap_disable_ring(NMR(na, t) + ring_id);
 	else
-		na->rx_rings[ring_id].nkr_stopped = 0;
-	/* nofify that the stopped state has changed. This is currently
-	 *only used by bwrap to propagate the state to its own krings.
-	 * (see netmap_bwrap_intr_notify).
-	 */
-	na->nm_notify(na, ring_id, NR_RX, NAF_DISABLE_NOTIFY);
+		NMR(na, t)[ring_id].nkr_stopped = 0;
 }
 
 
@@ -531,20 +573,15 @@ void
 netmap_set_all_rings(struct netmap_adapter *na, int stopped)
 {
 	int i;
-	u_int ntx, nrx;
+	enum txrx t;
 
 	if (!nm_netmap_on(na))
 		return;
 
-	ntx = netmap_real_tx_rings(na);
-	nrx = netmap_real_rx_rings(na);
-
-	for (i = 0; i < ntx; i++) {
-		netmap_set_txring(na, i, stopped);
-	}
-
-	for (i = 0; i < nrx; i++) {
-		netmap_set_rxring(na, i, stopped);
+	for_rx_tx(t) {
+		for (i = 0; i < netmap_real_rings(na, t); i++) {
+			netmap_set_ring(na, i, t, stopped);
+		}
 	}
 }
 
@@ -657,7 +694,8 @@ netmap_update_config(struct netmap_adapt
 
 	txr = txd = rxr = rxd = 0;
 	if (na->nm_config == NULL ||
-	    na->nm_config(na, &txr, &txd, &rxr, &rxd)) {
+	    na->nm_config(na, &txr, &txd, &rxr, &rxd))
+	{
 		/* take whatever we had at init time */
 		txr = na->num_tx_rings;
 		txd = na->num_tx_desc;
@@ -738,73 +776,59 @@ netmap_krings_create(struct netmap_adapt
 {
 	u_int i, len, ndesc;
 	struct netmap_kring *kring;
-	u_int ntx, nrx;
+	u_int n[NR_TXRX];
+	enum txrx t;
 
 	/* account for the (possibly fake) host rings */
-	ntx = na->num_tx_rings + 1;
-	nrx = na->num_rx_rings + 1;
+	n[NR_TX] = na->num_tx_rings + 1;
+	n[NR_RX] = na->num_rx_rings + 1;
 
-	len = (ntx + nrx) * sizeof(struct netmap_kring) + tailroom;
+	len = (n[NR_TX] + n[NR_RX]) * sizeof(struct netmap_kring) + tailroom;
 
 	na->tx_rings = malloc((size_t)len, M_DEVBUF, M_NOWAIT | M_ZERO);
 	if (na->tx_rings == NULL) {
 		D("Cannot allocate krings");
 		return ENOMEM;
 	}
-	na->rx_rings = na->tx_rings + ntx;
+	na->rx_rings = na->tx_rings + n[NR_TX];
 
 	/*
 	 * All fields in krings are 0 except the one initialized below.
 	 * but better be explicit on important kring fields.
 	 */
-	ndesc = na->num_tx_desc;
-	for (i = 0; i < ntx; i++) { /* Transmit rings */
-		kring = &na->tx_rings[i];
-		bzero(kring, sizeof(*kring));
-		kring->na = na;
-		kring->ring_id = i;
-		kring->nkr_num_slots = ndesc;
-		if (i < na->num_tx_rings) {
-			kring->nm_sync = na->nm_txsync;
-		} else if (i == na->num_tx_rings) {
-			kring->nm_sync = netmap_txsync_to_host_compat;
+	for_rx_tx(t) {
+		ndesc = nma_get_ndesc(na, t);
+		for (i = 0; i < n[t]; i++) {
+			kring = &NMR(na, t)[i];
+			bzero(kring, sizeof(*kring));
+			kring->na = na;
+			kring->ring_id = i;
+			kring->tx = t;
+			kring->nkr_num_slots = ndesc;
+			if (i < nma_get_nrings(na, t)) {
+				kring->nm_sync = (t == NR_TX ? na->nm_txsync : na->nm_rxsync);
+			} else if (i == na->num_tx_rings) {
+				kring->nm_sync = (t == NR_TX ?
+						netmap_txsync_to_host_compat :
+						netmap_rxsync_from_host_compat);
+			}
+			kring->nm_notify = na->nm_notify;
+			kring->rhead = kring->rcur = kring->nr_hwcur = 0;
+			/*
+			 * IMPORTANT: Always keep one slot empty.
+			 */
+			kring->rtail = kring->nr_hwtail = (t == NR_TX ? ndesc - 1 : 0);
+			snprintf(kring->name, sizeof(kring->name) - 1, "%s %s%d", na->name,
+					nm_txrx2str(t), i);
+			ND("ktx %s h %d c %d t %d",
+				kring->name, kring->rhead, kring->rcur, kring->rtail);
+			mtx_init(&kring->q_lock, (t == NR_TX ? "nm_txq_lock" : "nm_rxq_lock"), NULL, MTX_DEF);
+			init_waitqueue_head(&kring->si);
 		}
-		/*
-		 * IMPORTANT: Always keep one slot empty.
-		 */
-		kring->rhead = kring->rcur = kring->nr_hwcur = 0;
-		kring->rtail = kring->nr_hwtail = ndesc - 1;
-		snprintf(kring->name, sizeof(kring->name) - 1, "%s TX%d", na->name, i);
-		ND("ktx %s h %d c %d t %d",
-			kring->name, kring->rhead, kring->rcur, kring->rtail);
-		mtx_init(&kring->q_lock, "nm_txq_lock", NULL, MTX_DEF);
-		init_waitqueue_head(&kring->si);
-	}
-
-	ndesc = na->num_rx_desc;
-	for (i = 0; i < nrx; i++) { /* Receive rings */
-		kring = &na->rx_rings[i];
-		bzero(kring, sizeof(*kring));
-		kring->na = na;
-		kring->ring_id = i;
-		kring->nkr_num_slots = ndesc;
-		if (i < na->num_rx_rings) {
-			kring->nm_sync = na->nm_rxsync;
-		} else if (i == na->num_rx_rings) {
-			kring->nm_sync = netmap_rxsync_from_host_compat;
-		}
-		kring->rhead = kring->rcur = kring->nr_hwcur = 0;
-		kring->rtail = kring->nr_hwtail = 0;
-		snprintf(kring->name, sizeof(kring->name) - 1, "%s RX%d", na->name, i);
-		ND("krx %s h %d c %d t %d",
-			kring->name, kring->rhead, kring->rcur, kring->rtail);
-		mtx_init(&kring->q_lock, "nm_rxq_lock", NULL, MTX_DEF);
-		init_waitqueue_head(&kring->si);
+		init_waitqueue_head(&na->si[t]);
 	}
-	init_waitqueue_head(&na->tx_si);
-	init_waitqueue_head(&na->rx_si);
 
-	na->tailroom = na->rx_rings + nrx;
+	na->tailroom = na->rx_rings + n[NR_RX];
 
 	return 0;
 }
@@ -829,6 +853,10 @@ void
 netmap_krings_delete(struct netmap_adapter *na)
 {
 	struct netmap_kring *kring = na->tx_rings;
+	enum txrx t;
+
+	for_rx_tx(t)
+		netmap_knlist_destroy(&na->si[t]);
 
 	/* we rely on the krings layout described above */
 	for ( ; kring != na->tailroom; kring++) {
@@ -858,142 +886,35 @@ netmap_hw_krings_delete(struct netmap_ad
 }
 
 
-/* create a new netmap_if for a newly registered fd.
- * If this is the first registration of the adapter,
- * also create the netmap rings and their in-kernel view,
- * the netmap krings.
- */
-/* call with NMG_LOCK held */
-static struct netmap_if*
-netmap_if_new(struct netmap_adapter *na)
-{
-	struct netmap_if *nifp;
-
-	if (netmap_update_config(na)) {
-		/* configuration mismatch, report and fail */
-		return NULL;
-	}
-
-	if (na->active_fds)	/* already registered */
-		goto final;
-
-	/* create and init the krings arrays.
-	 * Depending on the adapter, this may also create
-	 * the netmap rings themselves
-	 */
-	if (na->nm_krings_create(na))
-		return NULL;
-
-	/* create all missing netmap rings */
-	if (netmap_mem_rings_create(na))
-		goto cleanup;
-
-final:
-
-	/* in all cases, create a new netmap if */
-	nifp = netmap_mem_if_new(na);
-	if (nifp == NULL)
-		goto cleanup;
-
-	return (nifp);
-
-cleanup:
-
-	if (na->active_fds == 0) {
-		netmap_mem_rings_delete(na);
-		na->nm_krings_delete(na);
-	}
-
-	return NULL;
-}
-
-
-/* grab a reference to the memory allocator, if we don't have one already.  The
- * reference is taken from the netmap_adapter registered with the priv.
- */
-/* call with NMG_LOCK held */
-static int
-netmap_get_memory_locked(struct netmap_priv_d* p)
-{
-	struct netmap_mem_d *nmd;
-	int error = 0;
-
-	if (p->np_na == NULL) {
-		if (!netmap_mmap_unreg)
-			return ENODEV;
-		/* for compatibility with older versions of the API
-		 * we use the global allocator when no interface has been
-		 * registered
-		 */
-		nmd = &nm_mem;
-	} else {
-		nmd = p->np_na->nm_mem;
-	}
-	if (p->np_mref == NULL) {
-		error = netmap_mem_finalize(nmd, p->np_na);
-		if (!error)
-			p->np_mref = nmd;
-	} else if (p->np_mref != nmd) {
-		/* a virtual port has been registered, but previous
-		 * syscalls already used the global allocator.
-		 * We cannot continue
-		 */
-		error = ENODEV;
-	}
-	return error;
-}
-
-
-/* call with NMG_LOCK *not* held */
-int
-netmap_get_memory(struct netmap_priv_d* p)
-{
-	int error;
-	NMG_LOCK();
-	error = netmap_get_memory_locked(p);
-	NMG_UNLOCK();
-	return error;
-}
-
-
-/* call with NMG_LOCK held */
-static int
-netmap_have_memory_locked(struct netmap_priv_d* p)
-{
-	return p->np_mref != NULL;
-}
-
-
-/* call with NMG_LOCK held */
-static void
-netmap_drop_memory_locked(struct netmap_priv_d* p)
-{
-	if (p->np_mref) {
-		netmap_mem_deref(p->np_mref, p->np_na);
-		p->np_mref = NULL;
-	}
-}
-
 /*
- * Call nm_register(ifp,0) to stop netmap mode on the interface and
+ * Undo everything that was done in netmap_do_regif(). In particular,
+ * call nm_register(ifp,0) to stop netmap mode on the interface and
  * revert to normal operation.
- * The second argument is the nifp to work on. In some cases it is
- * not attached yet to the netmap_priv_d so we need to pass it as
- * a separate argument.
  */
 /* call with NMG_LOCK held */
+static void netmap_unset_ringid(struct netmap_priv_d *);
+static void netmap_rel_exclusive(struct netmap_priv_d *);
 static void
-netmap_do_unregif(struct netmap_priv_d *priv, struct netmap_if *nifp)
+netmap_do_unregif(struct netmap_priv_d *priv)
 {
 	struct netmap_adapter *na = priv->np_na;
 
 	NMG_LOCK_ASSERT();
 	na->active_fds--;
+	/* release exclusive use if it was requested on regif */
+	netmap_rel_exclusive(priv);
 	if (na->active_fds <= 0) {	/* last instance */
 
 		if (netmap_verbose)
 			D("deleting last instance for %s", na->name);
+
+#ifdef WITH_MONITOR
+		/* walk through all the rings and tell any monitor
+		 * that the port is going to exit netmap mode
+		 */
+		netmap_monitor_stop(na);
+#endif
 		/*
 		 * (TO CHECK) This function is only called
 		 * when the last reference to this file descriptor goes
@@ -1014,37 +935,33 @@ netmap_do_unregif(struct netmap_priv_d *
 		 * XXX The wake up now must happen during *_down(), when
 		 * we order all activities to stop. -gl
 		 */
-		netmap_knlist_destroy(&na->tx_si);
-		netmap_knlist_destroy(&na->rx_si);
-
 		/* delete rings and buffers */
 		netmap_mem_rings_delete(na);
 		na->nm_krings_delete(na);
 	}
+	/* possibily decrement counter of tx_si/rx_si users */
+	netmap_unset_ringid(priv);
 	/* delete the nifp */
-	netmap_mem_if_delete(na, nifp);
-}
-
-/* call with NMG_LOCK held */
-static __inline int
-nm_tx_si_user(struct netmap_priv_d *priv)
-{
-	return (priv->np_na != NULL &&
-		(priv->np_txqlast - priv->np_txqfirst > 1));
+	netmap_mem_if_delete(na, priv->np_nifp);
+	/* drop the allocator */
+	netmap_mem_deref(na->nm_mem, na);
+	/* mark the priv as unregistered */
+	priv->np_na = NULL;
+	priv->np_nifp = NULL;
 }
 
 /* call with NMG_LOCK held */
 static __inline int
-nm_rx_si_user(struct netmap_priv_d *priv)
+nm_si_user(struct netmap_priv_d *priv, enum txrx t)
 {
 	return (priv->np_na != NULL &&
-		(priv->np_rxqlast - priv->np_rxqfirst > 1));
+		(priv->np_qlast[t] - priv->np_qfirst[t] > 1));
 }
 
-
 /*
  * Destructor of the netmap_priv_d, called when the fd has
- * no active open() and mmap(). Also called in error paths.
+ * no active open() and mmap().
+ * Undo all the things done by NIOCREGIF.
  *
  * returns 1 if this is the last instance and we can free priv
  */
@@ -1066,17 +983,8 @@ netmap_dtor_locked(struct netmap_priv_d
 	if (!na) {
 		return 1; //XXX is it correct?
 	}
-	netmap_do_unregif(priv, priv->np_nifp);
-	priv->np_nifp = NULL;
-	netmap_drop_memory_locked(priv);
-	if (priv->np_na) {
-		if (nm_tx_si_user(priv))
-			na->tx_si_users--;
-		if (nm_rx_si_user(priv))
-			na->rx_si_users--;
-		netmap_adapter_put(na);
-		priv->np_na = NULL;
-	}
+	netmap_do_unregif(priv);
+	netmap_adapter_put(na);
 	return 1;
 }
 
@@ -1148,7 +1056,7 @@ static void
 netmap_grab_packets(struct netmap_kring *kring, struct mbq *q, int force)
 {
 	u_int const lim = kring->nkr_num_slots - 1;
-	u_int const head = kring->ring->head;
+	u_int const head = kring->rhead;
 	u_int n;
 	struct netmap_adapter *na = kring->na;
 
@@ -1235,7 +1143,6 @@ void
 netmap_txsync_to_host(struct netmap_adapter *na)
 {
 	struct netmap_kring *kring = &na->tx_rings[na->num_tx_rings];
-	struct netmap_ring *ring = kring->ring;
 	u_int const lim = kring->nkr_num_slots - 1;
 	u_int const head = kring->rhead;
 	struct mbq q;
@@ -1246,14 +1153,12 @@ netmap_txsync_to_host(struct netmap_adap
 	 * the queue is drained in all cases.
 	 */
 	mbq_init(&q);
-	ring->cur = head;
 	netmap_grab_packets(kring, &q, 1 /* force */);
 	ND("have %d pkts in queue", mbq_len(&q));
 	kring->nr_hwcur = head;
 	kring->nr_hwtail = head + lim;
 	if (kring->nr_hwtail > lim)
 		kring->nr_hwtail -= lim + 1;
-	nm_txsync_finalize(kring);
 
 	netmap_send_up(na->ifp, &q);
 }
@@ -1281,11 +1186,13 @@ netmap_rxsync_from_host(struct netmap_ad
 	u_int const lim = kring->nkr_num_slots - 1;
 	u_int const head = kring->rhead;
 	int ret = 0;
-	struct mbq *q = &kring->rx_queue;
+	struct mbq *q = &kring->rx_queue, fq;
 
 	(void)pwait;	/* disable unused warnings */
 	(void)td;
 
+	mbq_init(&fq);	/* fq holds packets to be freed */
+
 	mbq_lock(q);
 
 	/* First part: import newly received packets */
@@ -1308,7 +1215,7 @@ netmap_rxsync_from_host(struct netmap_ad
 			slot->len = len;
 			slot->flags = kring->nkr_slot_flags;
 			nm_i = nm_next(nm_i, lim);
-			m_freem(m);
+			mbq_enqueue(&fq, m);
 		}
 		kring->nr_hwtail = nm_i;
 	}
@@ -1323,13 +1230,15 @@ netmap_rxsync_from_host(struct netmap_ad
 		kring->nr_hwcur = head;
 	}
 
*** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
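
The netmap.c hunks above replace paired tx/rx code paths (netmap_set_txring/netmap_set_rxring, ntx/nrx) with a single path indexed by enum txrx. The definitions of enum txrx, for_rx_tx() and NMR() fall in the truncated part of the diff, so the following is a minimal userspace sketch of the pattern, with definitions that are assumptions modelled on how the names are used above. The toy_* and TOY_NMR names are hypothetical.

```c
#include <assert.h>

/* Assumed shape of the direction enum; NR_TXRX counts the directions. */
enum txrx { NR_RX = 0, NR_TX = 1, NR_TXRX };

/* Assumed shape of the iteration macro used in the diff. */
#define for_rx_tx(t)	for ((t) = NR_RX; (t) < NR_TXRX; (t)++)

/* Hypothetical miniature kring: only the stopped flag is modelled. */
struct toy_kring {
	int stopped;
};

/* Hypothetical adapter: one ring array and one count per direction. */
struct toy_adapter {
	struct toy_kring *rings[NR_TXRX];
	unsigned int nrings[NR_TXRX];
};

/* Stand-in for NMR(na, t): select the ring array for direction t. */
#define TOY_NMR(na, t)	((na)->rings[(t)])

/* One function handles both directions, as netmap_set_ring() does above. */
static void
toy_set_ring(struct toy_adapter *na, unsigned int ring_id, enum txrx t,
    int stopped)
{
	assert(ring_id < na->nrings[t]);
	TOY_NMR(na, t)[ring_id].stopped = stopped;
}

/* Same loop structure as the new netmap_set_all_rings(). */
static void
toy_set_all_rings(struct toy_adapter *na, int stopped)
{
	enum txrx t;
	unsigned int i;

	for_rx_tx(t) {
		for (i = 0; i < na->nrings[t]; i++)
			toy_set_ring(na, i, t, stopped);
	}
}
```

Indexing by direction means a new per-ring operation only has to be written once, which matches the "remove duplicate code" item in the log message.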