Date:      Wed, 4 Sep 2013 17:04:23 +0900
From:      Takuya ASADA <syuu@dokukino.com>
To:        Luigi Rizzo <rizzo@iet.unipi.it>
Cc:        FreeBSD Net <freebsd-net@freebsd.org>
Subject:   Re: Multiqueue support for bpf
Message-ID:  <CALG4x-UVgT9aXSBzrjDxeCD-f6Yo_TBeRsqjXapz2iyr_=tCLw@mail.gmail.com>
In-Reply-To: <CA%2BhQ2%2Bi_qu7RouPW%2Bihfb5nL_1SQWMpFxTpoHfhaCvhtS8-EHQ@mail.gmail.com>
References:  <CALG4x-V-OLoqMXQarSNy5Lv3kNVu01AiN4A49Nv7t-Ysfr1DBg@mail.gmail.com> <CA%2BhQ2%2BgwW6FOQS79xmWVLSWWHrZMFnhaUM98Kp6aDVaUePNfTA@mail.gmail.com> <CALG4x-UYBFsMttpZx1-c_wtVf5MST8%2B_t1psY2HQskTiOZDFLA@mail.gmail.com> <CA%2BhQ2%2Bi_qu7RouPW%2Bihfb5nL_1SQWMpFxTpoHfhaCvhtS8-EHQ@mail.gmail.com>


[-- Attachment #1 --]
Hi,

This is the second version of the multiqueue bpf patch. I think it addresses
the points you raised in your previous mail.
Here is the change list:

- Drop the functions added to struct ifnet
(if_get_[rt]xqueue_len/if_get_[rt]xqueue_affinity).
The number of HW queues and the queue affinity may be useful to some
applications, but they are not directly related to multiqueue bpf. I
think we should discuss them separately.

- Use BITSET for the queue mask.
A BITSET seems a better fit for the queue mask structure than a boolean
array.

- Drop the tcpdump/libpcap changes.
These should also be discussed separately.

- M_QUEUEID/IFCAP_QUEUEID
M_QUEUEID is the mbuf flag indicating that the packet header carries a hw
queue id.
IFCAP_QUEUEID is the capability flag indicating that the driver is able to
set the queue id on mbufs.
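
To make the filtering semantics concrete, here is a minimal userland sketch
of the per-descriptor queue mask check. It is not the kernel code: it uses a
plain uint64_t array instead of BITSET so that it builds anywhere, and the
names (qmask, qmask_accepts, QT_*) are hypothetical. It assumes that TX
packets are matched against the TX mask and that packets without a queue id
fall back to the "no queue" flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define QMASK_BITS	256		/* mirrors BPFQ_BITS in the patch */
#define QMASK_WORDS	(QMASK_BITS / 64)

/* QT_NONE stands in for an mbuf that does not have M_QUEUEID set. */
enum queue_type { QT_NONE, QT_RX, QT_TX };

struct qmask_bits {
	uint64_t w[QMASK_WORDS];
};

struct qmask {
	bool enabled;			/* qm_enabled */
	struct qmask_bits rx;		/* qm_rxqmask */
	struct qmask_bits tx;		/* qm_txqmask */
	bool noq;			/* qm_noqmask */
};

/* Mark queue q as selected in the bitmap. */
static void
qmask_set(struct qmask_bits *b, unsigned q)
{
	assert(q < QMASK_BITS);
	b->w[q / 64] |= UINT64_C(1) << (q % 64);
}

/* Out-of-range queue ids are treated as "not selected". */
static bool
qmask_isset(const struct qmask_bits *b, unsigned q)
{
	return (q < QMASK_BITS && ((b->w[q / 64] >> (q % 64)) & 1) != 0);
}

/* Returns true if the descriptor should see the packet. */
static bool
qmask_accepts(const struct qmask *m, enum queue_type t, unsigned q)
{
	if (!m->enabled)
		return (true);		/* filter off: behave like plain bpf */
	switch (t) {
	case QT_RX:
		return (qmask_isset(&m->rx, q));
	case QT_TX:
		return (qmask_isset(&m->tx, q));
	default:
		return (m->noq);	/* packet not tied to any queue */
	}
}
```

With the filter enabled and an all-zero mask, the descriptor sees nothing
until queues are selected, which matches what BIOCQMASKENABLE does when it
zeroes both masks.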



2013/7/3 Luigi Rizzo <rizzo@iet.unipi.it>

>
>
>
> On Tue, Jul 2, 2013 at 5:56 PM, Takuya ASADA <syuu@dokukino.com> wrote:
>
>> Hi,
>>
>>  Do you have an updated URL for the diffs ? The link below from your
>>> original message
>>> seems not working now (NXDOMAIN)
>>>
>>> http://www.dokukino.com/mq_bpf_20110813.diff
>>>
>>
>> Changes with recent head is on my repository:
>> http://svnweb.freebsd.org/base/user/syuu/mq_bpf/
>> And I attached a diff file on this mail.
>>
>>
> thanks for the diffs (the URL to the repo is useful too,
>  but a URL to generate diffs is more convenient for reviewing changes).
>
> I believe it still needs a bit of work before being merged.
>
> My comments (in order of the patch):
>
> === ifnet.9 (and related code in if.c, sockio.h) ===
>  - if_get_rxqueue_len()/if_get_txqueue_len() is not a good name,
>   as to me at least it suggests that it returns the size of the
>   individual queue, rather than the number of queues.
>
> - cpu affinity (in userspace) is a bitmask, whereas in the BSD kernel
>   we almost never use the term "affinity", and favour "cpuid" or "oncpu"
>   (i.e. an individual CPU id).
>   I think you should either rename if_get_txqueue_affinity(), or make
>   the return type a cpuset (which seems more sensible as the return
>   value is passed to userspace)
>
> === bpf.4 (and related code) ===
>
> - the ioctl() to attach/detach/report queues attached to a specific
>   bpf descriptor talk about "mask bit" but neither the internal nor
>   the external implementation deal with bits.
>   I'd rather document those ioctl as "attaching queue to file descriptor".
>
> - the BPF ioctl names are generally inconsistent (using either S or SET
>   and G or GET for the setter and getter methods).
>   But you should pick one of the patterns and stick with it,
>   not introduce a third variant (GT/ST).
>   Given we are in 2013 we might go for the long form GET and SET
>   so i suggest the following (spaces for clarity)
>
> +#define BIOC ENA QMASK _IO('B', 133)
> +#define BIOC DIS QMASK _IO('B', 134)
> +#define BIOC SET RXQMASK _IOWR('B', 135, uint32_t)
> +#define BIOC CLR RXQMASK _IOWR('B', 136, uint32_t)
> +#define BIOC GET RXQMASK _IOR('B', 137, uint32_t)
> +#define BIOC SET TXQMASK _IOWR('B', 138, uint32_t)
> +#define BIOC CLR TXQMASK _IOWR('B', 139, uint32_t)
> +#define BIOC GET TXQMASK _IOR('B', 140, uint32_t)
> +#define BIOC SET OTHERMASK _IO('B', 141)
> +#define BIOC CLR OTHERMASK _IO('B', 142)
> +#define BIOC GET OTHERMASK _IOR('B', 143, uint32_t)
>
>   Also related: the existing ioctls() use u_int as arguments, rather
>   than uint32_t. I personally prefer the uint32_t form, but you
>   should at least add a comment to indicate that the choice is
>   deliberate.
>
> === if.c ===
>
>
> - you have a KASSERT to report if ifp->if_get_*xqueue_affinity() is not
>   set, but i'd rather run the function only if it is set, so you can
>   have a multiqueue interface which does not bind queues to specific cores
>   (which i am not sure is always a great idea; too many processes
>   statically bound to the same queue mean you lose opportunity to
>   parallelize work.)
>
> === mbuf.h ===
>
> as mentioned earlier, the modification to struct mbuf should
> be avoided if possible at all. It seems that you need just one
> direction bit (which maybe is available already from the context)
> and one queue identifier, which in the rx path, at least in your
> implementation is always a replica of the 'flowid' field.
> Can you see if perhaps the flowid field can be (ab)used on the
> tx path as well ?
>
>
> === if.h ===
>
> - in if.h, can you use individual variables instead of arrays
>   for  ifr_queue_affinity_index and friends ?
>   The macros to map the fields of ifr_ifru one
>   level up are a necessary evil,
>   but there is no point in using the arrays.
>
>   - SIOCGIFTXQAFFINITY seems to use the receive function (copy&paste typo)
>    talks about
>   Also, this function is probably something that should be coordinated
>   with work on generic multiqueue support
>
>
> === bpf.c ===
>
> - in linux (and hopefully in FreeBSD at some point) the number of queues
>   can be changed at runtime.
>   So i suggest that you cache the current number of queues when
>   you allocate the arrays (qm_*xq_qmask[] ) rather than invoking
>   ifp->if_get_*xqueue_len() every time you need to do a boundary check.
>   This will save us from all sort of problems later.
>
> - in terms of code, the six BIOC*XQMASK are very similar, you are probably
>   better off having one single case in the switch
>
> - can you add some comments in the code for the chunk at @@ -2117,6 +2391,42 @@
>   I do not completely understand why you are returning if the *queue tag
>   in the mbuf is out of range (my impression is that you should
>   just continue, or if you think the packet is incorrect it should
>   be filtered out before entering the LIST_FOREACH() ).
>   Secondly, you should use the cached value of *queue_len
>
>
>
> cheers
> luigi
>
>
> --
> -----------------------------------------+-------------------------------
>  Prof. Luigi RIZZO, rizzo@iet.unipi.it  . Dip. di Ing. dell'Informazione
>  http://www.iet.unipi.it/~luigi/        . Universita` di Pisa
>  TEL      +39-050-2211611               . via Diotisalvi 2
>  Mobile   +39-338-6809875               . 56122 PISA (Italy)
>
>  -----------------------------------------+-------------------------------
>
>
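
On the "one single case in the switch" point: the RX/TX get and set handlers
share the same precondition checks and differ only in which mask they touch
and in the copy direction. A minimal userland sketch of folding them into one
helper follows. The types and names (desc, qbits, qmask_op, Q_*) are
hypothetical stand-ins for the kernel structures, and locking is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct qbits {
	uint64_t w[4];			/* 256 bits, like BPFQ_BITS */
};

struct desc {
	bool attached;			/* stands in for d->bd_bif != NULL */
	bool enabled;			/* stands in for qm_enabled */
	struct qbits rx;		/* stands in for qm_rxqmask */
	struct qbits tx;		/* stands in for qm_txqmask */
};

enum qop { Q_GET_RX, Q_SET_RX, Q_GET_TX, Q_SET_TX };

/* One code path for all four mask ioctls; returns 0 or an errno value. */
static int
qmask_op(struct desc *d, enum qop op, struct qbits *arg)
{
	struct qbits *mask;

	assert(d != NULL && arg != NULL);
	if (!d->attached || !d->enabled)
		return (EINVAL);	/* same checks as every case in the patch */

	mask = (op == Q_GET_RX || op == Q_SET_RX) ? &d->rx : &d->tx;
	if (op == Q_SET_RX || op == Q_SET_TX)
		*mask = *arg;		/* set: user -> descriptor */
	else
		*arg = *mask;		/* get: descriptor -> user */
	return (0);
}
```

In the kernel the switch would then keep a single group of case labels that
maps the ioctl command to an operation and calls the helper under the qmask
lock.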

[-- Attachment #2 --]
Index: share/man/man4/bpf.4
===================================================================
--- share/man/man4/bpf.4	(.../head)	(revision 255180)
+++ share/man/man4/bpf.4	(.../user/syuu/mq_bpf)	(revision 255200)
@@ -631,6 +631,36 @@
 .Vt bzh_kernel_gen
 against
 .Vt bzh_user_gen .
+.It Dv BIOCQMASKENABLE
+Enables multiqueue filter on the descriptor.
+
+.It Dv BIOCQMASKDISABLE
+Disables multiqueue filter on the descriptor.
+
+.It Dv BIOCGRXQMASK
+.Pq Li struct bpf_qmask_bits
+Get RX queue mask bits.
+
+.It Dv BIOCSRXQMASK
+.Pq Li struct bpf_qmask_bits
+Set RX queue mask bits.
+
+.It Dv BIOCGTXQMASK
+.Pq Li struct bpf_qmask_bits
+Get TX queue mask bits.
+
+.It Dv BIOCSTXQMASK
+.Pq Li struct bpf_qmask_bits
+Set TX queue mask bits.
+
+.It Dv BIOCGNOQMASK
+.Pq Li int
+Get the mask bit for packets which are not tied to any queue.
+
+.It Dv BIOCSNOQMASK
+.Pq Li int
+Set the mask bit for packets which are not tied to any queue.
+
 .El
 .Sh BPF HEADER
 One of the following structures is prepended to each packet returned by
@@ -1037,6 +1067,23 @@
 	BPF_STMT(BPF_RET+BPF_K, 0),
 };
 .Ed
+.Sh MULTIQUEUE SUPPORT
+Multiqueue support provides an interface for multithreaded packet
+processing using bpf.
+
+A normal bpf descriptor receives packets from a specified interface; with
+multiqueue support it can receive packets from specified hardware queues.
+
+This distributes the bpf workload over multiple threads and reduces lock
+contention on bpf.
+
+To make a program multithreaded, open a bpf descriptor in each thread, enable
+multiqueue support with the BIOCQMASKENABLE ioctl, and set the queue masks with the BIOCSRXQMASK, BIOCSTXQMASK and BIOCSNOQMASK ioctls.
+
+Queue length and queue affinity information may be useful for choosing the
+queue mask of a bpf descriptor; see
+.Xr netintro 4 .
+
 .Sh SEE ALSO
 .Xr tcpdump 1 ,
 .Xr ioctl 2 ,
Index: sys/dev/ixgbe/ixgbe.c
===================================================================
--- sys/dev/ixgbe/ixgbe.c	(.../head)	(revision 255180)
+++ sys/dev/ixgbe/ixgbe.c	(.../user/syuu/mq_bpf)	(revision 255200)
@@ -751,6 +751,11 @@
 				IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
 			break;
 		}
+
+		m_head->m_flags |= M_QUEUEID;
+		m_head->m_pkthdr.queueid = txr->me;
+		m_head->m_pkthdr.queuetype = QUEUETYPE_TX;
+
 		/* Send a copy of the frame to the BPF listener */
 		ETHER_BPF_MTAP(ifp, m_head);
 
@@ -849,6 +854,11 @@
 		drbr_advance(ifp, txr->br);
 #endif
 		enqueued++;
+ 
+		next->m_flags |= M_QUEUEID;
+ 		next->m_pkthdr.queueid = txr->me;
+		next->m_pkthdr.queuetype = QUEUETYPE_TX;
+
 		/* Send a copy of the frame to the BPF listener */
 		ETHER_BPF_MTAP(ifp, next);
 		if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0)
@@ -2637,6 +2647,7 @@
 	ifp->if_capabilities |= IFCAP_VLAN_HWTAGGING
 			     |  IFCAP_VLAN_HWTSO
 			     |  IFCAP_VLAN_MTU;
+	ifp->if_capabilities |= IFCAP_QUEUEID;
 	ifp->if_capenable = ifp->if_capabilities;
 
 	/*
@@ -4546,8 +4557,10 @@
 				ixgbe_rx_checksum(staterr, sendmp, ptype);
 #if __FreeBSD_version >= 800000
 			sendmp->m_pkthdr.flowid = que->msix;
-			sendmp->m_flags |= M_FLOWID;
+			sendmp->m_flags |= (M_FLOWID | M_QUEUEID);
 #endif
+			sendmp->m_pkthdr.queueid = que->msix;
+			sendmp->m_pkthdr.queuetype = QUEUETYPE_RX;
 		}
 next_desc:
 		bus_dmamap_sync(rxr->rxdma.dma_tag, rxr->rxdma.dma_map,
Index: sys/dev/e1000/if_igb.c
===================================================================
--- sys/dev/e1000/if_igb.c	(.../head)	(revision 255180)
+++ sys/dev/e1000/if_igb.c	(.../user/syuu/mq_bpf)	(revision 255200)
@@ -920,6 +920,10 @@
 			break;
 		}
 
+		m_head->m_flags |= M_QUEUEID;
+		m_head->m_pkthdr.queueid = txr->me;
+		m_head->m_pkthdr.queuetype = QUEUETYPE_TX;
+
 		/* Send a copy of the frame to the BPF listener */
 		ETHER_BPF_MTAP(ifp, m_head);
 
@@ -1019,6 +1023,9 @@
 		ifp->if_obytes += next->m_pkthdr.len;
 		if (next->m_flags & M_MCAST)
 			ifp->if_omcasts++;
+		next->m_flags |= M_QUEUEID;
+		next->m_pkthdr.queueid = txr->me;
+		next->m_pkthdr.queuetype = QUEUETYPE_TX;
 		ETHER_BPF_MTAP(ifp, next);
 		if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0)
 			break;
@@ -3169,6 +3176,7 @@
 	** enable this and get full hardware tag filtering.
 	*/
 	ifp->if_capabilities |= IFCAP_VLAN_HWFILTER;
+	ifp->if_capabilities |= IFCAP_QUEUEID;
 
 	/*
 	 * Specify the media types supported by this adapter and register
@@ -4891,8 +4899,11 @@
 			}
 #ifndef IGB_LEGACY_TX
 			rxr->fmp->m_pkthdr.flowid = que->msix;
-			rxr->fmp->m_flags |= M_FLOWID;
+			rxr->fmp->m_flags |= (M_FLOWID | M_QUEUEID);
 #endif
+			rxr->fmp->m_pkthdr.queueid = que->msix;
+			rxr->fmp->m_pkthdr.queuetype = QUEUETYPE_RX;
+
 			sendmp = rxr->fmp;
 			/* Make sure to set M_PKTHDR. */
 			sendmp->m_flags |= M_PKTHDR;
Index: sys/dev/mxge/if_mxge.c
===================================================================
--- sys/dev/mxge/if_mxge.c	(.../head)	(revision 255180)
+++ sys/dev/mxge/if_mxge.c	(.../user/syuu/mq_bpf)	(revision 255200)
@@ -2272,6 +2272,10 @@
 		if (m == NULL) {
 			return;
 		}
+		m->m_flags |= M_QUEUEID;
+		m->m_pkthdr.queueid = (ss - sc->ss);
+		m->m_pkthdr.queuetype = QUEUETYPE_TX;
+
 		/* let BPF see it */
 		BPF_MTAP(ifp, m);
 
@@ -2306,6 +2310,10 @@
 
 	if (!drbr_needs_enqueue(ifp, tx->br) &&
 	    ((tx->mask - (tx->req - tx->done)) > tx->max_desc)) {
+		m->m_flags |= M_QUEUEID;
+		m->m_pkthdr.queueid = (ss - sc->ss);
+		m->m_pkthdr.queuetype = QUEUETYPE_TX;
+
 		/* let BPF see it */
 		BPF_MTAP(ifp, m);
 		/* give it to the nic */
@@ -2718,7 +2726,9 @@
 	/* flowid only valid if RSS hashing is enabled */
 	if (sc->num_slices > 1) {
 		m->m_pkthdr.flowid = (ss - sc->ss);
-		m->m_flags |= M_FLOWID;
+		m->m_flags |= (M_FLOWID | M_QUEUEID);
+		m->m_pkthdr.queueid = (ss - sc->ss);
+		m->m_pkthdr.queuetype = QUEUETYPE_RX;
 	}
 	/* pass the frame up the stack */
 	(*ifp->if_input)(ifp, m);
@@ -4893,6 +4903,7 @@
 #if defined(INET) || defined(INET6)
 	ifp->if_capabilities |= IFCAP_LRO;
 #endif
+	ifp->if_capabilities |= IFCAP_QUEUEID;
 
 #ifdef MXGE_NEW_VLAN_API
 	ifp->if_capabilities |= IFCAP_VLAN_HWTAGGING | IFCAP_VLAN_HWCSUM;
Index: sys/net/bpfq.h
===================================================================
--- sys/net/bpfq.h	(.../head)	(revision 0)
+++ sys/net/bpfq.h	(.../user/syuu/mq_bpf)	(revision 255200)
@@ -0,0 +1,32 @@
+#ifndef _NET_BPFQ_H_
+#define _NET_BPFQ_H_
+
+#include <sys/param.h>
+#include <sys/bitset.h>
+#include <sys/_bitset.h>
+
+#define BPFQ_BITS			256
+BITSET_DEFINE(bpf_qmask_bits, BPFQ_BITS);
+
+#define	BPFQ_CLR(n, p)			BIT_CLR(BPFQ_BITS, n, p)
+#define	BPFQ_COPY(f, t)			BIT_COPY(BPFQ_BITS, f, t)
+#define	BPFQ_ISSET(n, p)		BIT_ISSET(BPFQ_BITS, n, p)
+#define	BPFQ_SET(n, p)			BIT_SET(BPFQ_BITS, n, p)
+#define	BPFQ_ZERO(p) 			BIT_ZERO(BPFQ_BITS, p)
+#define	BPFQ_FILL(p) 			BIT_FILL(BPFQ_BITS, p)
+#define	BPFQ_SETOF(n, p)		BIT_SETOF(BPFQ_BITS, n, p)
+#define	BPFQ_EMPTY(p)			BIT_EMPTY(BPFQ_BITS, p)
+#define	BPFQ_ISFULLSET(p)		BIT_ISFULLSET(BPFQ_BITS, p)
+#define	BPFQ_SUBSET(p, c)		BIT_SUBSET(BPFQ_BITS, p, c)
+#define	BPFQ_OVERLAP(p, c)		BIT_OVERLAP(BPFQ_BITS, p, c)
+#define	BPFQ_CMP(p, c)			BIT_CMP(BPFQ_BITS, p, c)
+#define	BPFQ_OR(d, s)			BIT_OR(BPFQ_BITS, d, s)
+#define	BPFQ_AND(d, s)			BIT_AND(BPFQ_BITS, d, s)
+#define	BPFQ_NAND(d, s)			BIT_NAND(BPFQ_BITS, d, s)
+#define	BPFQ_CLR_ATOMIC(n, p)		BIT_CLR_ATOMIC(BPFQ_BITS, n, p)
+#define	BPFQ_SET_ATOMIC(n, p)		BIT_SET_ATOMIC(BPFQ_BITS, n, p)
+#define	BPFQ_OR_ATOMIC(d, s)		BIT_OR_ATOMIC(BPFQ_BITS, d, s)
+#define	BPFQ_COPY_STORE_REL(f, t)	BIT_COPY_STORE_REL(BPFQ_BITS, f, t)
+#define	BPFQ_FFS(p)			BIT_FFS(BPFQ_BITS, p)
+
+#endif
Index: sys/net/if.h
===================================================================
--- sys/net/if.h	(.../head)	(revision 255180)
+++ sys/net/if.h	(.../user/syuu/mq_bpf)	(revision 255200)
@@ -231,6 +231,7 @@
 #define	IFCAP_NETMAP		0x100000 /* netmap mode supported/enabled */
 #define	IFCAP_RXCSUM_IPV6	0x200000  /* can offload checksum on IPv6 RX */
 #define	IFCAP_TXCSUM_IPV6	0x400000  /* can offload checksum on IPv6 TX */
+#define	IFCAP_QUEUEID		0x800000  /* driver supports queueid notify */
 
 #define IFCAP_HWCSUM_IPV6	(IFCAP_RXCSUM_IPV6 | IFCAP_TXCSUM_IPV6)
 
Index: sys/net/bpf.c
===================================================================
--- sys/net/bpf.c	(.../head)	(revision 255180)
+++ sys/net/bpf.c	(.../user/syuu/mq_bpf)	(revision 255200)
@@ -819,6 +819,9 @@
 	size = d->bd_bufsize;
 	bpf_buffer_ioctl_sblen(d, &size);
 
+ 	d->bd_qmask.qm_enabled = FALSE;
+	BPFQ_LOCK_INIT(&d->bd_qmask);
+
 	return (0);
 }
 
@@ -1697,7 +1700,191 @@
 	case BIOCROTZBUF:
 		error = bpf_ioctl_rotzbuf(td, d, (struct bpf_zbuf *)addr);
 		break;
+
+	case BIOCQMASKENABLE:
+		{
+			struct ifnet *ifp;
+
+			if (d->bd_bif == NULL) {
+				/*
+				 * No interface attached yet.
+				 */
+				error = EINVAL;
+				break;
+			}
+			BPFQ_WLOCK(&d->bd_qmask);
+			if (d->bd_qmask.qm_enabled) {
+				BPFQ_WUNLOCK(&d->bd_qmask);
+				error = EINVAL;
+				break;
+			}
+			ifp = d->bd_bif->bif_ifp;
+			if (!(ifp->if_capabilities & IFCAP_QUEUEID)) {
+				BPFQ_WUNLOCK(&d->bd_qmask);
+				error = EINVAL;
+				break;
+			}
+			BPFQ_ZERO(&d->bd_qmask.qm_rxqmask);
+			BPFQ_ZERO(&d->bd_qmask.qm_txqmask);
+			d->bd_qmask.qm_noqmask = FALSE;
+			d->bd_qmask.qm_enabled = TRUE;
+			BPFQ_WUNLOCK(&d->bd_qmask);
+			break;
+		}
+
+	case BIOCQMASKDISABLE:
+		{
+			if (d->bd_bif == NULL) {
+				/*
+				 * No interface attached yet.
+				 */
+				error = EINVAL;
+				break;
+			}
+			BPFQ_WLOCK(&d->bd_qmask);
+			if (!d->bd_qmask.qm_enabled) {
+				BPFQ_WUNLOCK(&d->bd_qmask);
+				error = EINVAL;
+				break;
+			}
+			d->bd_qmask.qm_enabled = FALSE;
+			BPFQ_WUNLOCK(&d->bd_qmask);
+			break;
+		}
+
+	case BIOCGRXQMASK:
+		{
+			struct bpf_qmask_bits *qmask = (struct bpf_qmask_bits *)addr;
+
+			if (d->bd_bif == NULL) {
+				/*
+				 * No interface attached yet.
+				 */
+				error = EINVAL;
+				break;
+			}
+			BPFQ_WLOCK(&d->bd_qmask);
+			if (!d->bd_qmask.qm_enabled) {
+				BPFQ_WUNLOCK(&d->bd_qmask);
+				error = EINVAL;
+				break;
+			}
+			BPFQ_COPY(&d->bd_qmask.qm_rxqmask, qmask);
+			BPFQ_WUNLOCK(&d->bd_qmask);
+			break;
+
+		}
+
+	case BIOCSRXQMASK:
+		{
+			struct bpf_qmask_bits *qmask = (struct bpf_qmask_bits *)addr;
+
+			if (d->bd_bif == NULL) {
+				/*
+				 * No interface attached yet.
+				 */
+				error = EINVAL;	
+				break;
+			}
+			BPFQ_WLOCK(&d->bd_qmask);
+			if (!d->bd_qmask.qm_enabled) {
+				BPFQ_WUNLOCK(&d->bd_qmask);
+				error = EINVAL;
+				break;
+			}
+			BPFQ_COPY(qmask, &d->bd_qmask.qm_rxqmask);
+			BPFQ_WUNLOCK(&d->bd_qmask);
+			break;
+		}
+
+	case BIOCGTXQMASK:
+		{
+			struct bpf_qmask_bits *qmask = (struct bpf_qmask_bits *)addr;
+
+			if (d->bd_bif == NULL) {
+				/*
+				 * No interface attached yet.
+				 */
+				error = EINVAL;
+				break;
+			}
+			BPFQ_WLOCK(&d->bd_qmask);
+			if (!d->bd_qmask.qm_enabled) {
+				BPFQ_WUNLOCK(&d->bd_qmask);
+				error = EINVAL;
+				break;
+			}
+			BPFQ_COPY(&d->bd_qmask.qm_txqmask, qmask);
+			BPFQ_WUNLOCK(&d->bd_qmask);
+			break;
+		}
+
+	case BIOCSTXQMASK:
+		{
+			struct bpf_qmask_bits *qmask = (struct bpf_qmask_bits *)addr;
+
+			if (d->bd_bif == NULL) {
+				/*
+				 * No interface attached yet.
+				 */
+				error = EINVAL;	
+				break;
+			}
+			BPFQ_WLOCK(&d->bd_qmask);
+			if (!d->bd_qmask.qm_enabled) {
+				BPFQ_WUNLOCK(&d->bd_qmask);
+				error = EINVAL;
+				break;
+			}
+			BPFQ_COPY(qmask, &d->bd_qmask.qm_txqmask);
+			BPFQ_WUNLOCK(&d->bd_qmask);
+			break;
+		}
+
+	case BIOCGNOQMASK:
+		{
+			boolean_t *noqmask = (boolean_t *)addr;
+			if (d->bd_bif == NULL) {
+				/*
+				 * No interface attached yet.
+				 */
+				error = EINVAL;
+				break;
+			}
+			BPFQ_WLOCK(&d->bd_qmask);
+			if (!d->bd_qmask.qm_enabled) {
+				BPFQ_WUNLOCK(&d->bd_qmask);
+				error = EINVAL;
+				break;
+			}
+			*noqmask = d->bd_qmask.qm_noqmask;
+			BPFQ_WUNLOCK(&d->bd_qmask);
+			break;
+		}
+
+	case BIOCSNOQMASK:
+		{
+			boolean_t *noqmask = (boolean_t *)addr;
+
+			if (d->bd_bif == NULL) {
+				/*
+				 * No interface attached yet.
+				 */
+				error = EINVAL;	
+				break;
+			}
+			BPFQ_WLOCK(&d->bd_qmask);
+			if (!d->bd_qmask.qm_enabled) {
+				BPFQ_WUNLOCK(&d->bd_qmask);
+				error = EINVAL;
+				break;
+			}
+			d->bd_qmask.qm_noqmask = *noqmask;
+			BPFQ_WUNLOCK(&d->bd_qmask);
+			break;
+		}
 	}
+
 	CURVNET_RESTORE();
 	return (error);
 }
@@ -2043,6 +2230,15 @@
 	BPFIF_RLOCK(bp);
 
 	LIST_FOREACH(d, &bp->bif_dlist, bd_next) {
+ 		BPFQ_RLOCK(&d->bd_qmask);
+ 		if (d->bd_qmask.qm_enabled) {
+ 			if (!d->bd_qmask.qm_noqmask) {
+				BPFQ_RUNLOCK(&d->bd_qmask);
+ 				continue;
+ 			}
+ 		}
+		BPFQ_RUNLOCK(&d->bd_qmask);
+
 		/*
 		 * We are not using any locks for d here because:
 		 * 1) any filter change is protected by interface
@@ -2117,6 +2313,40 @@
 	BPFIF_RLOCK(bp);
 
 	LIST_FOREACH(d, &bp->bif_dlist, bd_next) {
+ 		BPFQ_RLOCK(&d->bd_qmask);
+ 		if (d->bd_qmask.qm_enabled) {
+ 			M_ASSERTPKTHDR(m);
+ 			if (m->m_flags & M_QUEUEID) {
+				switch (m->m_pkthdr.queuetype) {
+				case QUEUETYPE_RX:
+					if (!BPFQ_ISSET(m->m_pkthdr.queueid,
+						 &d->bd_qmask.qm_rxqmask)) {
+ 						BPFQ_RUNLOCK(&d->bd_qmask);
+ 						continue;
+ 					}
+					break;
+				case QUEUETYPE_TX:
+					if (!BPFQ_ISSET(m->m_pkthdr.queueid, 
+						&d->bd_qmask.qm_txqmask)) {
+ 						BPFQ_RUNLOCK(&d->bd_qmask);
+ 						continue;
+ 					}
+					break;
+				default:
+					if (!d->bd_qmask.qm_noqmask) {
+						BPFQ_RUNLOCK(&d->bd_qmask);
+						continue;
+					}
+ 				}
+ 			}else{
+				if (!d->bd_qmask.qm_noqmask) {
+					BPFQ_RUNLOCK(&d->bd_qmask);
+					continue;
+				}
+			}
+ 		}
+ 		BPFQ_RUNLOCK(&d->bd_qmask);
+ 
 		if (BPF_CHECK_DIRECTION(d, m->m_pkthdr.rcvif, bp->bif_ifp))
 			continue;
 		++d->bd_rcount;
@@ -2180,6 +2410,40 @@
 	BPFIF_RLOCK(bp);
 
 	LIST_FOREACH(d, &bp->bif_dlist, bd_next) {
+  		BPFQ_RLOCK(&d->bd_qmask);
+ 		if (d->bd_qmask.qm_enabled) {
+ 			M_ASSERTPKTHDR(m);
+ 			if (m->m_flags & M_QUEUEID) {
+				switch (m->m_pkthdr.queuetype) {
+				case QUEUETYPE_RX:
+					if (!BPFQ_ISSET(m->m_pkthdr.queueid,
+						 &d->bd_qmask.qm_rxqmask)) {
+ 						BPFQ_RUNLOCK(&d->bd_qmask);
+ 						continue;
+ 					}
+					break;
+				case QUEUETYPE_TX:
+					if (!BPFQ_ISSET(m->m_pkthdr.queueid, 
+						&d->bd_qmask.qm_txqmask)) {
+ 						BPFQ_RUNLOCK(&d->bd_qmask);
+ 						continue;
+ 					}
+					break;
+				default:
+					if (!d->bd_qmask.qm_noqmask) {
+						BPFQ_RUNLOCK(&d->bd_qmask);
+						continue;
+					}
+ 				}
+ 			}else{
+				if (!d->bd_qmask.qm_noqmask) {
+					BPFQ_RUNLOCK(&d->bd_qmask);
+					continue;
+				}
+			}
+ 		}
+ 		BPFQ_RUNLOCK(&d->bd_qmask);
+
 		if (BPF_CHECK_DIRECTION(d, m->m_pkthdr.rcvif, bp->bif_ifp))
 			continue;
 		++d->bd_rcount;
Index: sys/net/bpfdesc.h
===================================================================
--- sys/net/bpfdesc.h	(.../head)	(revision 255180)
+++ sys/net/bpfdesc.h	(.../user/syuu/mq_bpf)	(revision 255200)
@@ -44,7 +44,23 @@
 #include <sys/queue.h>
 #include <sys/conf.h>
 #include <net/if.h>
+#include <net/bpfq.h>
 
+struct bpf_qmask {
+	int		qm_enabled;
+	struct bpf_qmask_bits qm_rxqmask;
+	struct bpf_qmask_bits qm_txqmask;
+	int		qm_noqmask;
+	struct rwlock	qm_lock;
+};
+
+#define BPFQ_LOCK_INIT(qm)	rw_init(&(qm)->qm_lock, "qmask lock")
+#define BPFQ_LOCK_DESTROY(qm)	rw_destroy(&(qm)->qm_lock)
+#define BPFQ_RLOCK(qm)		rw_rlock(&(qm)->qm_lock)
+#define BPFQ_RUNLOCK(qm)	rw_runlock(&(qm)->qm_lock)
+#define BPFQ_WLOCK(qm)		rw_wlock(&(qm)->qm_lock)
+#define BPFQ_WUNLOCK(qm)	rw_wunlock(&(qm)->qm_lock)
+
 /*
  * Descriptor associated with each open bpf file.
  */
@@ -101,6 +117,7 @@
 	u_int64_t	bd_wdcount;	/* number of packets dropped during a write */
 	u_int64_t	bd_zcopy;	/* number of zero copy operations */
 	u_char		bd_compat32;	/* 32-bit stream on LP64 system */
+	struct bpf_qmask bd_qmask;
 };
 
 /* Values for bd_state */
Index: sys/net/bpf.h
===================================================================
--- sys/net/bpf.h	(.../head)	(revision 255180)
+++ sys/net/bpf.h	(.../user/syuu/mq_bpf)	(revision 255200)
@@ -40,6 +40,8 @@
 #ifndef _NET_BPF_H_
 #define _NET_BPF_H_
 
+#include <net/bpfq.h>
+
 /* BSD style release date */
 #define	BPF_RELEASE 199606
 
@@ -147,6 +149,14 @@
 #define	BIOCSETFNR	_IOW('B', 130, struct bpf_program)
 #define	BIOCGTSTAMP	_IOR('B', 131, u_int)
 #define	BIOCSTSTAMP	_IOW('B', 132, u_int)
+#define	BIOCQMASKENABLE	_IO('B', 133)
+#define	BIOCQMASKDISABLE _IO('B', 134)
+#define	BIOCGRXQMASK	_IOR('B', 135, struct bpf_qmask_bits)
+#define	BIOCSRXQMASK	_IOW('B', 135, struct bpf_qmask_bits)
+#define	BIOCGTXQMASK	_IOR('B', 136, struct bpf_qmask_bits)
+#define	BIOCSTXQMASK	_IOW('B', 137, struct bpf_qmask_bits)
+#define	BIOCGNOQMASK	_IOR('B', 138, int)
+#define	BIOCSNOQMASK	_IOW('B', 139, int)
 
 /* Obsolete */
 #define	BIOCGSEESENT	BIOCGDIRECTION
Index: sys/sys/mbuf.h
===================================================================
--- sys/sys/mbuf.h	(.../head)	(revision 255180)
+++ sys/sys/mbuf.h	(.../user/syuu/mq_bpf)	(revision 255200)
@@ -114,6 +114,11 @@
 	void			(*m_tag_free)(struct m_tag *);
 };
 
+enum queuetype {
+	QUEUETYPE_RX,
+	QUEUETYPE_TX
+};
+
 /*
  * Record/packet header in first mbuf of chain; valid only if M_PKTHDR is set.
  * Size ILP32: 48
@@ -126,6 +131,8 @@
 
 	/* Layer crossing persistent information. */
 	uint32_t	 flowid;	/* packet's 4-tuple system */
+	uint32_t	 queueid;	/* hw queue id */
+	uint32_t	 queuetype;	/* hw queue type */
 	uint64_t	 csum_flags;	/* checksum and offload features */
 	uint16_t	 fibnum;	/* this packet should use this fib */
 	uint8_t		 cosqos;	/* class/quality of service */
@@ -223,6 +230,7 @@
 #define	M_VLANTAG	0x00000080 /* ether_vtag is valid */
 #define	M_FLOWID	0x00000100 /* deprecated: flowid is valid */
 #define	M_NOFREE	0x00000200 /* do not free mbuf, embedded in cluster */
+#define	M_QUEUEID	0x00000400 /* packet has hw queue id */
 
 #define	M_PROTO1	0x00001000 /* protocol-specific */
 #define	M_PROTO2	0x00002000 /* protocol-specific */
Index: sbin/ifconfig/ifconfig.c
===================================================================
--- sbin/ifconfig/ifconfig.c	(.../head)	(revision 255180)
+++ sbin/ifconfig/ifconfig.c	(.../user/syuu/mq_bpf)	(revision 255200)
@@ -917,7 +917,7 @@
 "\020\1RXCSUM\2TXCSUM\3NETCONS\4VLAN_MTU\5VLAN_HWTAGGING\6JUMBO_MTU\7POLLING" \
 "\10VLAN_HWCSUM\11TSO4\12TSO6\13LRO\14WOL_UCAST\15WOL_MCAST\16WOL_MAGIC" \
 "\17TOE4\20TOE6\21VLAN_HWFILTER\23VLAN_HWTSO\24LINKSTATE\25NETMAP" \
-"\26RXCSUM_IPV6\27TXCSUM_IPV6"
+"\26RXCSUM_IPV6\27TXCSUM_IPV6\30QUEUEID"
 
 /*
  * Print the status of the interface.  If an address family was
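
A note on the capability name string used by ifconfig: it is in the "%b"
style format, where the first byte is the numeric base and each name is
preceded by one byte holding its 1-based bit number, normally written as an
octal escape. IFCAP_QUEUEID is 0x800000, i.e. bit 24, which is \30 in octal.
An escape such as \28 is not a valid octal number; C parses it as \2 followed
by the character '8'. The sketch below is a simplified decoder written for
illustration (decode_bits is a hypothetical name, not the ifconfig function)
that shows both cases.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Decode a "%b"-style bit-name string into a comma-separated list.
 * bits[0] is the numeric base (ignored here); every name is preceded
 * by one byte in the range 1..32 holding its 1-based bit number.
 */
static void
decode_bits(unsigned v, const char *bits, char *out, size_t outlen)
{
	size_t used = 0;
	int i;

	assert(outlen > 0);
	bits++;				/* skip the base byte */
	while ((i = (unsigned char)*bits++) != 0) {
		int on = (v & (1u << (i - 1))) != 0;

		if (on && used > 0 && used + 1 < outlen)
			out[used++] = ',';
		/* A name runs until the next byte that is <= 32. */
		for (; (unsigned char)*bits > 32; bits++) {
			if (on && used + 1 < outlen)
				out[used++] = *bits;
		}
	}
	out[used] = '\0';
}
```

Decoding 0x800000 against "\020\26RXCSUM_IPV6\27TXCSUM_IPV6\30QUEUEID" yields
QUEUEID, whereas with "\28QUEUEID" the name "8QUEUEID" ends up attached to
bit 2.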
