From owner-svn-src-stable-12@freebsd.org Tue Sep 3 14:06:01 2019 Return-Path: Delivered-To: svn-src-stable-12@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 10E7ADC1D4; Tue, 3 Sep 2019 14:06:01 +0000 (UTC) (envelope-from yuripv@freebsd.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2610:1c1:1:6074::16:84]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "freefall.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 46N7yw249dz4P5N; Tue, 3 Sep 2019 14:06:00 +0000 (UTC) (envelope-from yuripv@freebsd.org) Received: by freefall.freebsd.org (Postfix, from userid 1452) id 19E6819F73; Tue, 3 Sep 2019 14:05:54 +0000 (UTC) X-Original-To: yuripv@localmail.freebsd.org Delivered-To: yuripv@localmail.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) (Client CN "mx1.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by freefall.freebsd.org (Postfix) with ESMTPS id 01F3DA4C8; Mon, 1 Apr 2019 10:51:30 +0000 (UTC) (envelope-from owner-src-committers@freebsd.org) Received: from freefall.freebsd.org (freefall.freebsd.org [96.47.72.132]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "freefall.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 39A3869D7E; Mon, 1 Apr 2019 10:51:30 +0000 (UTC) (envelope-from owner-src-committers@freebsd.org) Received: by freefall.freebsd.org (Postfix, from userid 538) id 1D556A498; Mon, 1 Apr 2019 10:51:30 +0000 (UTC) Delivered-To: src-committers@localmail.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [96.47.72.80]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) (Client CN "mx1.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by freefall.freebsd.org (Postfix) with ESMTPS id 2F76BA496 for ; Mon, 1 Apr 2019 10:51:27 +0000 (UTC) (envelope-from vmaffione@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id E329469D6A; Mon, 1 Apr 2019 10:51:26 +0000 (UTC) (envelope-from vmaffione@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id A6491D50C; Mon, 1 Apr 2019 10:51:26 +0000 (UTC) (envelope-from vmaffione@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id x31ApQkZ034013; Mon, 1 Apr 2019 10:51:26 GMT (envelope-from vmaffione@FreeBSD.org) Received: (from vmaffione@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id x31ApPGg034006; Mon, 1 Apr 2019 10:51:25 GMT (envelope-from vmaffione@FreeBSD.org) Message-Id: <201904011051.x31ApPGg034006@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: vmaffione set sender to vmaffione@FreeBSD.org using -f From: Vincenzo Maffione To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-12@freebsd.org Subject: svn commit: r345762 - in stable/12: sys/dev/netmap sys/net tests/sys/netmap X-SVN-Group: stable-12 X-SVN-Commit-Author: vmaffione X-SVN-Commit-Paths: in stable/12: sys/dev/netmap sys/net tests/sys/netmap X-SVN-Commit-Revision: 345762 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk X-Loop: FreeBSD.org Sender: owner-src-committers@freebsd.org X-Rspamd-Queue-Id: 39A3869D7E X-Spamd-Bar: -- Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [-2.98 / 15.00]; local_wl_from(0.00)[freebsd.org]; NEURAL_HAM_MEDIUM(-1.00)[-0.997,0]; NEURAL_HAM_SHORT(-0.98)[-0.985,0]; ASN(0.00)[asn:11403, ipnet:96.47.64.0/20, country:US]; NEURAL_HAM_LONG(-1.00)[-1.000,0] Status: O X-BeenThere: svn-src-stable-12@freebsd.org X-Mailman-Version: 2.1.29 List-Id: SVN commit messages for only the 12-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Tue, 03 Sep 2019 14:06:01 -0000 X-Original-Date: Mon, 1 Apr 2019 10:51:25 +0000 (UTC) X-List-Received-Date: Tue, 03 Sep 2019 14:06:01 -0000 Author: vmaffione Date: Mon Apr 1 10:51:24 2019 New Revision: 345762 URL: https://svnweb.freebsd.org/changeset/base/345762 Log: MFC r345269, r345323 netmap: add support for multiple host rings Some applications forward from/to host rings most or all the traffic received or sent on a physical interface. In this cases it is desirable to have more than a pair of RX/TX host rings, and use multiple threads to speed up forwarding. This change adds support for multiple host rings. On registering a netmap port, the user can specify the number of desired receive and transmit host rings in the nr_host_tx_rings and nr_host_rx_rings fields of the nmreq_register structure. Modified: stable/12/sys/dev/netmap/netmap.c stable/12/sys/dev/netmap/netmap_legacy.c stable/12/sys/dev/netmap/netmap_mem2.c stable/12/sys/net/netmap.h stable/12/sys/net/netmap_legacy.h stable/12/sys/net/netmap_user.h stable/12/tests/sys/netmap/ctrl-api-test.c Directory Properties: stable/12/ (props changed) Modified: stable/12/sys/dev/netmap/netmap.c ============================================================================== --- stable/12/sys/dev/netmap/netmap.c Mon Apr 1 07:54:27 2019 (r345761) +++ stable/12/sys/dev/netmap/netmap.c Mon Apr 1 10:51:24 2019 (r345762) @@ -1035,6 +1035,10 @@ netmap_do_unregif(struct netmap_priv_d *priv) } na->nm_krings_delete(na); + + /* restore the default number of host tx and rx rings */ + na->num_host_tx_rings = 1; + na->num_host_rx_rings = 1; } /* possibily decrement counter of tx_si/rx_si users */ @@ -1575,6 +1579,19 @@ netmap_get_na(struct nmreq_header *hdr, *na = ret; netmap_adapter_get(ret); + /* + * if the adapter supports the host rings and it is not alread open, + * try to set the number of host rings as requested by the user + */ + if (((*na)->na_flags & NAF_HOST_RINGS) && (*na)->active_fds == 0) { + if (req->nr_host_tx_rings) + (*na)->num_host_tx_rings = req->nr_host_tx_rings; + if (req->nr_host_rx_rings) + (*na)->num_host_rx_rings = req->nr_host_rx_rings; + } + nm_prdis("%s: host tx %d rx %u", (*na)->name, (*na)->num_host_tx_rings, + (*na)->num_host_rx_rings); + out: if (error) { if (ret) @@ -1856,6 +1873,25 @@ netmap_interp_ringid(struct netmap_priv_d *priv, uint3 nm_prdis("ONE_NIC: %s %d %d", nm_txrx2str(t), priv->np_qfirst[t], priv->np_qlast[t]); break; + case NR_REG_ONE_SW: + if (!(na->na_flags & NAF_HOST_RINGS)) { + nm_prerr("host rings not supported"); + return EINVAL; + } + if (nr_ringid >= na->num_host_tx_rings && + nr_ringid >= na->num_host_rx_rings) { + nm_prerr("invalid ring id %d", nr_ringid); + return EINVAL; + } + /* if not enough rings, use the first one */ + j = nr_ringid; + if (j >= nma_get_host_nrings(na, t)) + j = 0; + priv->np_qfirst[t] = nma_get_nrings(na, t) + j; + priv->np_qlast[t] = nma_get_nrings(na, t) + j + 1; + nm_prdis("ONE_SW: %s %d %d", nm_txrx2str(t), + priv->np_qfirst[t], priv->np_qlast[t]); + break; default: nm_prerr("invalid regif type %d", nr_mode); return EINVAL; @@ -2546,6 +2582,8 @@ netmap_ioctl(struct netmap_priv_d *priv, u_long cmd, c req->nr_tx_rings = na->num_tx_rings; req->nr_rx_slots = na->num_rx_desc; req->nr_tx_slots = na->num_tx_desc; + req->nr_host_tx_rings = na->num_host_tx_rings; + req->nr_host_rx_rings = na->num_host_rx_rings; error = netmap_mem_get_info(na->nm_mem, &req->nr_memsize, &memflags, &req->nr_mem_id); if (error) { @@ -2610,6 +2648,8 @@ netmap_ioctl(struct netmap_priv_d *priv, u_long cmd, c regreq.nr_rx_slots = req->nr_rx_slots; regreq.nr_tx_rings = req->nr_tx_rings; regreq.nr_rx_rings = req->nr_rx_rings; + regreq.nr_host_tx_rings = req->nr_host_tx_rings; + regreq.nr_host_rx_rings = req->nr_host_rx_rings; regreq.nr_mem_id = req->nr_mem_id; /* get a refcount */ @@ -2647,6 +2687,8 @@ netmap_ioctl(struct netmap_priv_d *priv, u_long cmd, c req->nr_tx_rings = na->num_tx_rings; req->nr_rx_slots = na->num_rx_desc; req->nr_tx_slots = na->num_tx_desc; + req->nr_host_tx_rings = na->num_host_tx_rings; + req->nr_host_rx_rings = na->num_host_rx_rings; } while (0); netmap_unget_na(na, ifp); NMG_UNLOCK(); Modified: stable/12/sys/dev/netmap/netmap_legacy.c ============================================================================== --- stable/12/sys/dev/netmap/netmap_legacy.c Mon Apr 1 07:54:27 2019 (r345761) +++ stable/12/sys/dev/netmap/netmap_legacy.c Mon Apr 1 10:51:24 2019 (r345762) @@ -68,6 +68,8 @@ nmreq_register_from_legacy(struct nmreq *nmr, struct n req->nr_rx_slots = nmr->nr_rx_slots; req->nr_tx_rings = nmr->nr_tx_rings; req->nr_rx_rings = nmr->nr_rx_rings; + req->nr_host_tx_rings = 0; + req->nr_host_rx_rings = 0; req->nr_mem_id = nmr->nr_arg2; req->nr_ringid = nmr->nr_ringid & NETMAP_RING_MASK; if ((nmr->nr_flags & NR_REG_MASK) == NR_REG_DEFAULT) { @@ -249,6 +251,8 @@ nmreq_from_legacy(struct nmreq *nmr, u_long ioctl_cmd) req->nr_rx_slots = nmr->nr_rx_slots; req->nr_tx_rings = nmr->nr_tx_rings; req->nr_rx_rings = nmr->nr_rx_rings; + req->nr_host_tx_rings = 0; + req->nr_host_rx_rings = 0; req->nr_mem_id = nmr->nr_arg2; } break; @@ -367,8 +371,8 @@ netmap_ioctl_legacy(struct netmap_priv_d *priv, u_long struct nmreq *nmr = (struct nmreq *) data; struct nmreq_header *hdr; - if (nmr->nr_version < 11) { - nm_prerr("Minimum supported API is 11 (requested %u)", + if (nmr->nr_version < 14) { + nm_prerr("Minimum supported API is 14 (requested %u)", nmr->nr_version); return EINVAL; } Modified: stable/12/sys/dev/netmap/netmap_mem2.c ============================================================================== --- stable/12/sys/dev/netmap/netmap_mem2.c Mon Apr 1 07:54:27 2019 (r345761) +++ stable/12/sys/dev/netmap/netmap_mem2.c Mon Apr 1 10:51:24 2019 (r345762) @@ -2012,6 +2012,10 @@ netmap_mem2_if_new(struct netmap_adapter *na, struct n /* initialize base fields -- override const */ *(u_int *)(uintptr_t)&nifp->ni_tx_rings = na->num_tx_rings; *(u_int *)(uintptr_t)&nifp->ni_rx_rings = na->num_rx_rings; + *(u_int *)(uintptr_t)&nifp->ni_host_tx_rings = + (na->num_host_tx_rings ? na->num_host_tx_rings : 1); + *(u_int *)(uintptr_t)&nifp->ni_host_rx_rings = + (na->num_host_rx_rings ? na->num_host_rx_rings : 1); strlcpy(nifp->ni_name, na->name, sizeof(nifp->ni_name)); /* Modified: stable/12/sys/net/netmap.h ============================================================================== --- stable/12/sys/net/netmap.h Mon Apr 1 07:54:27 2019 (r345761) +++ stable/12/sys/net/netmap.h Mon Apr 1 10:51:24 2019 (r345762) @@ -41,9 +41,9 @@ #ifndef _NET_NETMAP_H_ #define _NET_NETMAP_H_ -#define NETMAP_API 13 /* current API version */ +#define NETMAP_API 14 /* current API version */ -#define NETMAP_MIN_API 13 /* min and max versions accepted */ +#define NETMAP_MIN_API 14 /* min and max versions accepted */ #define NETMAP_MAX_API 15 /* * Some fields should be cache-aligned to reduce contention. @@ -64,34 +64,34 @@ KERNEL (opaque, obviously) ==================================================================== - | - USERSPACE | struct netmap_ring - +---->+---------------+ - / | head,cur,tail | - struct netmap_if (nifp, 1 per fd) / | buf_ofs | - +---------------+ / | other fields | - | ni_tx_rings | / +===============+ - | ni_rx_rings | / | buf_idx, len | slot[0] - | | / | flags, ptr | - | | / +---------------+ - +===============+ / | buf_idx, len | slot[1] - | txring_ofs[0] | (rel.to nifp)--' | flags, ptr | - | txring_ofs[1] | +---------------+ - (tx+1 entries) (num_slots entries) - | txring_ofs[t] | | buf_idx, len | slot[n-1] - +---------------+ | flags, ptr | - | rxring_ofs[0] | +---------------+ - | rxring_ofs[1] | - (rx+1 entries) - | rxring_ofs[r] | - +---------------+ + | + USERSPACE | struct netmap_ring + +---->+---------------+ + / | head,cur,tail | + struct netmap_if (nifp, 1 per fd) / | buf_ofs | + +----------------+ / | other fields | + | ni_tx_rings | / +===============+ + | ni_rx_rings | / | buf_idx, len | slot[0] + | | / | flags, ptr | + | | / +---------------+ + +================+ / | buf_idx, len | slot[1] + | txring_ofs[0] | (rel.to nifp)--' | flags, ptr | + | txring_ofs[1] | +---------------+ + (tx+htx entries) (num_slots entries) + | txring_ofs[t] | | buf_idx, len | slot[n-1] + +----------------+ | flags, ptr | + | rxring_ofs[0] | +---------------+ + | rxring_ofs[1] | + (rx+hrx entries) + | rxring_ofs[r] | + +----------------+ * For each "interface" (NIC, host stack, PIPE, VALE switch port) bound to * a file descriptor, the mmap()ed region contains a (logically readonly) * struct netmap_if pointing to struct netmap_ring's. * - * There is one netmap_ring per physical NIC ring, plus one tx/rx ring - * pair attached to the host stack (this pair is unused for non-NIC ports). + * There is one netmap_ring per physical NIC ring, plus at least one tx/rx ring + * pair attached to the host stack (these pairs are unused for non-NIC ports). * * All physical/host stack ports share the same memory region, * so that zero-copy can be implemented between them. @@ -117,11 +117,6 @@ * as the index. On close, ni_bufs_head must point to the list of * buffers to be released. * - * + NIOCREGIF can request space for extra rings (and buffers) - * allocated in the same memory space. The number of extra rings - * is in nr_arg1, and is advisory. This is a no-op on NICs where - * the size of the memory space is fixed. - * * + NIOCREGIF can attach to PIPE rings sharing the same memory * space with a parent device. The ifname indicates the parent device, * which must already exist. Flags in nr_flags indicate if we want to @@ -133,21 +128,22 @@ * * Extra flags in nr_flags support the above functions. * Application libraries may use the following naming scheme: - * netmap:foo all NIC ring pairs - * netmap:foo^ only host ring pair - * netmap:foo+ all NIC ring + host ring pairs - * netmap:foo-k the k-th NIC ring pair - * netmap:foo{k PIPE ring pair k, master side - * netmap:foo}k PIPE ring pair k, slave side + * netmap:foo all NIC rings pairs + * netmap:foo^ only host rings pairs + * netmap:foo^k the k-th host rings pair + * netmap:foo+ all NIC rings + host rings pairs + * netmap:foo-k the k-th NIC rings pair + * netmap:foo{k PIPE rings pair k, master side + * netmap:foo}k PIPE rings pair k, slave side * * Some notes about host rings: * - * + The RX host ring is used to store those packets that the host network + * + The RX host rings are used to store those packets that the host network * stack is trying to transmit through a NIC queue, but only if that queue * is currently in netmap mode. Netmap will not intercept host stack mbufs * designated to NIC queues that are not in netmap mode. As a consequence, * registering a netmap port with netmap:foo^ is not enough to intercept - * mbufs in the RX host ring; the netmap port should be registered with + * mbufs in the RX host rings; the netmap port should be registered with * netmap:foo*, or another registration should be done to open at least a * NIC TX queue in netmap mode. * @@ -157,7 +153,7 @@ * ifconfig on FreeBSD or ethtool -K on Linux) for an interface that is being * used in netmap mode. If the offloadings are not disabled, GSO and/or * unchecksummed packets may be dropped immediately or end up in the host RX - * ring, and will be dropped as soon as the packet reaches another netmap + * rings, and will be dropped as soon as the packet reaches another netmap * adapter. */ @@ -366,7 +362,7 @@ struct netmap_if { /* * The number of packet rings available in netmap mode. * Physical NICs can have different numbers of tx and rx rings. - * Physical NICs also have a 'host' ring pair. + * Physical NICs also have at least a 'host' rings pair. * Additionally, clients can request additional ring pairs to * be used for internal communication. */ @@ -374,14 +370,18 @@ struct netmap_if { const uint32_t ni_rx_rings; /* number of HW rx rings */ uint32_t ni_bufs_head; /* head index for extra bufs */ - uint32_t ni_spare1[5]; + const uint32_t ni_host_tx_rings; /* number of SW tx rings */ + const uint32_t ni_host_rx_rings; /* number of SW rx rings */ + uint32_t ni_spare1[3]; /* * The following array contains the offset of each netmap ring * from this structure, in the following order: - * NIC tx rings (ni_tx_rings); host tx ring (1); extra tx rings; - * NIC rx rings (ni_rx_rings); host tx ring (1); extra rx rings. + * - NIC tx rings (ni_tx_rings); + * - host tx rings (ni_host_tx_rings); + * - NIC rx rings (ni_rx_rings); + * - host rx ring (ni_host_rx_rings); * - * The area is filled up by the kernel on NIOCREGIF, + * The area is filled up by the kernel on NETMAP_REQ_REGISTER, * and then only read by userspace code. */ const ssize_t ring_ofs[0]; @@ -422,7 +422,8 @@ struct netmap_if { * The request body (struct nmreq_register) has several arguments to * specify how the port is to be registered. * - * nr_tx_slots, nr_tx_slots, nr_tx_rings, nr_rx_rings (in/out) + * nr_tx_slots, nr_tx_slots, nr_tx_rings, nr_rx_rings, + * nr_host_tx_rings, nr_host_rx_rings (in/out) * On input, non-zero values may be used to reconfigure the port * according to the requested values, but this is not guaranteed. * On output the actual values in use are reported. @@ -574,6 +575,8 @@ struct nmreq_register { uint32_t nr_rx_slots; /* slots in rx rings */ uint16_t nr_tx_rings; /* number of tx rings */ uint16_t nr_rx_rings; /* number of rx rings */ + uint16_t nr_host_tx_rings; /* number of host tx rings */ + uint16_t nr_host_rx_rings; /* number of host rx rings */ uint16_t nr_mem_id; /* id of the memory allocator */ uint16_t nr_ringid; /* ring(s) we care about */ @@ -592,9 +595,9 @@ struct nmreq_register { #define NR_TX_RINGS_ONLY 0x4000 /* Applications set this flag if they are able to deal with virtio-net headers, * that is send/receive frames that start with a virtio-net header. - * If not set, NIOCREGIF will fail with netmap ports that require applications - * to use those headers. If the flag is set, the application can use the - * NETMAP_VNET_HDR_GET command to figure out the header length. */ + * If not set, NETMAP_REQ_REGISTER will fail with netmap ports that require + * applications to use those headers. If the flag is set, the application can + * use the NETMAP_VNET_HDR_GET command to figure out the header length. */ #define NR_ACCEPT_VNET_HDR 0x8000 /* The following two have the same meaning of NETMAP_NO_TX_POLL and * NETMAP_DO_RX_POLL. */ @@ -611,6 +614,7 @@ enum { NR_REG_DEFAULT = 0, /* backward compat, should NR_REG_PIPE_MASTER = 5, /* deprecated, use "x{y" port name syntax */ NR_REG_PIPE_SLAVE = 6, /* deprecated, use "x}y" port name syntax */ NR_REG_NULL = 7, + NR_REG_ONE_SW = 8, }; /* A single ioctl number is shared by all the new API command. @@ -622,7 +626,7 @@ enum { NR_REG_DEFAULT = 0, /* backward compat, should /* The ioctl commands to sync TX/RX netmap rings. * NIOCTXSYNC, NIOCRXSYNC synchronize tx or rx queues, - * whose identity is set in NIOCREGIF through nr_ringid. + * whose identity is set in NETMAP_REQ_REGISTER through nr_ringid. * These are non blocking and take no argument. */ #define NIOCTXSYNC _IO('i', 148) /* sync tx queues */ #define NIOCRXSYNC _IO('i', 149) /* sync rx queues */ @@ -640,8 +644,10 @@ struct nmreq_port_info_get { uint32_t nr_rx_slots; /* slots in rx rings */ uint16_t nr_tx_rings; /* number of tx rings */ uint16_t nr_rx_rings; /* number of rx rings */ + uint16_t nr_host_tx_rings; /* number of host tx rings */ + uint16_t nr_host_rx_rings; /* number of host rx rings */ uint16_t nr_mem_id; /* memory allocator id (in/out) */ - uint16_t pad1; + uint16_t pad[3]; }; #define NM_BDG_NAME "vale" /* prefix for bridge port name */ Modified: stable/12/sys/net/netmap_legacy.h ============================================================================== --- stable/12/sys/net/netmap_legacy.h Mon Apr 1 07:54:27 2019 (r345761) +++ stable/12/sys/net/netmap_legacy.h Mon Apr 1 10:51:24 2019 (r345762) @@ -99,14 +99,7 @@ * nr_flags is the recommended mode to indicate which rings should * be bound to a file descriptor. Values are NR_REG_* * - * nr_arg1 (in) The number of extra rings to be reserved. - * Especially when allocating a VALE port the system only - * allocates the amount of memory needed for the port. - * If more shared memory rings are desired (e.g. for pipes), - * the first invocation for the same basename/allocator - * should specify a suitable number. Memory cannot be - * extended after the first allocation without closing - * all ports on the same region. + * nr_arg1 (in) Reserved. * * nr_arg2 (in/out) The identity of the memory region used. * On input, 0 means the system decides autonomously, @@ -188,7 +181,7 @@ struct nmreq { #define NETMAP_BDG_POLLING_ON 10 /* delete polling kthread */ #define NETMAP_BDG_POLLING_OFF 11 /* delete polling kthread */ #define NETMAP_VNET_HDR_GET 12 /* get the port virtio-net-hdr length */ - uint16_t nr_arg1; /* reserve extra rings in NIOCREGIF */ + uint16_t nr_arg1; /* extra arguments */ #define NETMAP_BDG_HOST 1 /* nr_arg1 value for NETMAP_BDG_ATTACH */ uint16_t nr_arg2; /* id of the memory allocator */ Modified: stable/12/sys/net/netmap_user.h ============================================================================== --- stable/12/sys/net/netmap_user.h Mon Apr 1 07:54:27 2019 (r345761) +++ stable/12/sys/net/netmap_user.h Mon Apr 1 10:51:24 2019 (r345762) @@ -93,6 +93,8 @@ #include /* apple needs sockaddr */ #include /* IFNAMSIZ */ #include +#include /* memset */ +#include /* gettimeofday */ #ifndef likely #define likely(x) __builtin_expect(!!(x), 1) @@ -111,7 +113,8 @@ nifp, (nifp)->ring_ofs[index] ) #define NETMAP_RXRING(nifp, index) _NETMAP_OFFSET(struct netmap_ring *, \ - nifp, (nifp)->ring_ofs[index + (nifp)->ni_tx_rings + 1] ) + nifp, (nifp)->ring_ofs[index + (nifp)->ni_tx_rings + \ + (nifp)->ni_host_tx_rings] ) #define NETMAP_BUF(ring, index) \ ((char *)(ring) + (ring)->buf_ofs + ((index)*(ring)->nr_buf_size)) @@ -149,27 +152,6 @@ nm_ring_space(struct netmap_ring *ring) return ret; } - -#ifdef NETMAP_WITH_LIBS -/* - * Support for simple I/O libraries. - * Include other system headers required for compiling this. - */ - -#ifndef HAVE_NETMAP_WITH_LIBS -#define HAVE_NETMAP_WITH_LIBS - -#include -#include -#include -#include /* memset */ -#include -#include /* EINVAL */ -#include /* O_RDWR */ -#include /* close() */ -#include -#include - #ifndef ND /* debug macros */ /* debug support */ #define ND(_fmt, ...) do {} while(0) @@ -198,6 +180,53 @@ nm_ring_space(struct netmap_ring *ring) } while (0) #endif +/* + * this is a slightly optimized copy routine which rounds + * to multiple of 64 bytes and is often faster than dealing + * with other odd sizes. We assume there is enough room + * in the source and destination buffers. + */ +static inline void +nm_pkt_copy(const void *_src, void *_dst, int l) +{ + const uint64_t *src = (const uint64_t *)_src; + uint64_t *dst = (uint64_t *)_dst; + + if (unlikely(l >= 1024 || l % 64)) { + memcpy(dst, src, l); + return; + } + for (; likely(l > 0); l-=64) { + *dst++ = *src++; + *dst++ = *src++; + *dst++ = *src++; + *dst++ = *src++; + *dst++ = *src++; + *dst++ = *src++; + *dst++ = *src++; + *dst++ = *src++; + } +} + +#ifdef NETMAP_WITH_LIBS +/* + * Support for simple I/O libraries. + * Include other system headers required for compiling this. + */ + +#ifndef HAVE_NETMAP_WITH_LIBS +#define HAVE_NETMAP_WITH_LIBS + +#include +#include +#include +#include +#include /* EINVAL */ +#include /* O_RDWR */ +#include /* close() */ +#include +#include + struct nm_pkthdr { /* first part is the same as pcap_pkthdr */ struct timeval ts; uint32_t caplen; @@ -268,33 +297,6 @@ struct nm_desc { #define NETMAP_FD(d) (P2NMD(d)->fd) -/* - * this is a slightly optimized copy routine which rounds - * to multiple of 64 bytes and is often faster than dealing - * with other odd sizes. We assume there is enough room - * in the source and destination buffers. - */ -static inline void -nm_pkt_copy(const void *_src, void *_dst, int l) -{ - const uint64_t *src = (const uint64_t *)_src; - uint64_t *dst = (uint64_t *)_dst; - - if (unlikely(l >= 1024 || l % 64)) { - memcpy(dst, src, l); - return; - } - for (; likely(l > 0); l-=64) { - *dst++ = *src++; - *dst++ = *src++; - *dst++ = *src++; - *dst++ = *src++; - *dst++ = *src++; - *dst++ = *src++; - *dst++ = *src++; - *dst++ = *src++; - } -} /* Modified: stable/12/tests/sys/netmap/ctrl-api-test.c ============================================================================== --- stable/12/tests/sys/netmap/ctrl-api-test.c Mon Apr 1 07:54:27 2019 (r345761) +++ stable/12/tests/sys/netmap/ctrl-api-test.c Mon Apr 1 10:51:24 2019 (r345762) @@ -146,12 +146,12 @@ struct TestContext { uint32_t nr_hdr_len; /* for PORT_HDR_SET and PORT_HDR_GET */ uint32_t nr_first_cpu_id; /* vale polling */ uint32_t nr_num_polling_cpus; /* vale polling */ + uint32_t sync_kloop_mode; /* sync-kloop */ int fd; /* netmap file descriptor */ void *csb; /* CSB entries (atok and ktoa) */ struct nmreq_option *nr_opt; /* list of options */ sem_t *sem; /* for thread synchronization */ - struct nmport_d *nmport; /* nmport descriptor from libnetmap */ }; static struct TestContext ctx_; @@ -352,8 +352,11 @@ niocregif(struct TestContext *ctx, int netmap_api) /* The 11 ABI is the one right before the introduction of the new NIOCCTRL * ABI. The 11 ABI is useful to perform tests with legacy applications - * (which use the 11 ABI) and new kernel (which uses 12, or higher). */ -#define NETMAP_API_NIOCREGIF 11 + * (which use the 11 ABI) and new kernel (which uses 12, or higher). + * However, version 14 introduced a change in the layout of struct netmap_if, + * so that binary backward compatibility to 11 is not supported anymore. + */ +#define NETMAP_API_NIOCREGIF 14 static int legacy_regif_default(struct TestContext *ctx) @@ -1113,7 +1116,7 @@ bad_extmem_option(struct TestContext *ctx) pools_info_fill(&pools_info); /* Request a large ring size, to make sure that the kernel * rejects our request. */ - pools_info.nr_ring_pool_objsize = (1 << 16); + pools_info.nr_ring_pool_objsize = (1 << 20); return _extmem_option(ctx, &pools_info) < 0 ? 0 : -1; } @@ -1140,6 +1143,10 @@ duplicate_extmem_options(struct TestContext *ctx) save1 = e1; save2 = e2; + strncpy(ctx->ifname_ext, "vale0:0", sizeof(ctx->ifname_ext)); + ctx->nr_tx_slots = 16; + ctx->nr_rx_slots = 16; + ret = port_register_hwall(ctx); if (ret >= 0) { printf("duplicate option not detected\n"); @@ -1322,51 +1329,58 @@ sync_kloop(struct TestContext *ctx) static int sync_kloop_eventfds(struct TestContext *ctx) { - struct nmreq_opt_sync_kloop_eventfds *opt = NULL; - struct nmreq_option save; + struct nmreq_opt_sync_kloop_eventfds *evopt = NULL; + struct nmreq_opt_sync_kloop_mode modeopt; + struct nmreq_option evsave; int num_entries; size_t opt_size; int ret, i; + memset(&modeopt, 0, sizeof(modeopt)); + modeopt.nro_opt.nro_reqtype = NETMAP_REQ_OPT_SYNC_KLOOP_MODE; + modeopt.mode = ctx->sync_kloop_mode; + push_option(&modeopt.nro_opt, ctx); + num_entries = num_registered_rings(ctx); - opt_size = sizeof(*opt) + num_entries * sizeof(opt->eventfds[0]); - opt = calloc(1, opt_size); - opt->nro_opt.nro_next = 0; - opt->nro_opt.nro_reqtype = NETMAP_REQ_OPT_SYNC_KLOOP_EVENTFDS; - opt->nro_opt.nro_status = 0; - opt->nro_opt.nro_size = opt_size; + opt_size = sizeof(*evopt) + num_entries * sizeof(evopt->eventfds[0]); + evopt = calloc(1, opt_size); + evopt->nro_opt.nro_next = 0; + evopt->nro_opt.nro_reqtype = NETMAP_REQ_OPT_SYNC_KLOOP_EVENTFDS; + evopt->nro_opt.nro_status = 0; + evopt->nro_opt.nro_size = opt_size; for (i = 0; i < num_entries; i++) { int efd = eventfd(0, 0); - opt->eventfds[i].ioeventfd = efd; + evopt->eventfds[i].ioeventfd = efd; efd = eventfd(0, 0); - opt->eventfds[i].irqfd = efd; + evopt->eventfds[i].irqfd = efd; } - push_option(&opt->nro_opt, ctx); - save = opt->nro_opt; + push_option(&evopt->nro_opt, ctx); + evsave = evopt->nro_opt; ret = sync_kloop_start_stop(ctx); if (ret != 0) { - free(opt); + free(evopt); clear_options(ctx); return ret; } #ifdef __linux__ - save.nro_status = 0; + evsave.nro_status = 0; #else /* !__linux__ */ - save.nro_status = EOPNOTSUPP; + evsave.nro_status = EOPNOTSUPP; #endif /* !__linux__ */ - ret = checkoption(&opt->nro_opt, &save); - free(opt); + ret = checkoption(&evopt->nro_opt, &evsave); + free(evopt); clear_options(ctx); return ret; } static int -sync_kloop_eventfds_all(struct TestContext *ctx) +sync_kloop_eventfds_all_mode(struct TestContext *ctx, + uint32_t sync_kloop_mode) { int ret; @@ -1375,10 +1389,18 @@ sync_kloop_eventfds_all(struct TestContext *ctx) return ret; } + ctx->sync_kloop_mode = sync_kloop_mode; + return sync_kloop_eventfds(ctx); } static int +sync_kloop_eventfds_all(struct TestContext *ctx) +{ + return sync_kloop_eventfds_all_mode(ctx, 0); +} + +static int sync_kloop_eventfds_all_tx(struct TestContext *ctx) { struct nmreq_opt_csb opt; @@ -1399,6 +1421,27 @@ sync_kloop_eventfds_all_tx(struct TestContext *ctx) } static int +sync_kloop_eventfds_all_direct(struct TestContext *ctx) +{ + return sync_kloop_eventfds_all_mode(ctx, + NM_OPT_SYNC_KLOOP_DIRECT_TX | NM_OPT_SYNC_KLOOP_DIRECT_RX); +} + +static int +sync_kloop_eventfds_all_direct_tx(struct TestContext *ctx) +{ + return sync_kloop_eventfds_all_mode(ctx, + NM_OPT_SYNC_KLOOP_DIRECT_TX); +} + +static int +sync_kloop_eventfds_all_direct_rx(struct TestContext *ctx) +{ + return sync_kloop_eventfds_all_mode(ctx, + NM_OPT_SYNC_KLOOP_DIRECT_RX); +} + +static int sync_kloop_nocsb(struct TestContext *ctx) { int ret; @@ -1677,6 +1720,9 @@ static struct mytest tests[] = { decltest(sync_kloop), decltest(sync_kloop_eventfds_all), decltest(sync_kloop_eventfds_all_tx), + decltest(sync_kloop_eventfds_all_direct), + decltest(sync_kloop_eventfds_all_direct_tx), + decltest(sync_kloop_eventfds_all_direct_rx), decltest(sync_kloop_nocsb), decltest(sync_kloop_csb_enable), decltest(sync_kloop_conflict),