Date: Wed, 4 Sep 2013 17:04:23 +0900
From: Takuya ASADA <syuu@dokukino.com>
To: Luigi Rizzo <rizzo@iet.unipi.it>
Cc: FreeBSD Net <freebsd-net@freebsd.org>
Subject: Re: Multiqueue support for bpf
Message-ID: <CALG4x-UVgT9aXSBzrjDxeCD-f6Yo_TBeRsqjXapz2iyr_=tCLw@mail.gmail.com>
In-Reply-To: <CA%2BhQ2%2Bi_qu7RouPW%2Bihfb5nL_1SQWMpFxTpoHfhaCvhtS8-EHQ@mail.gmail.com>
References: <CALG4x-V-OLoqMXQarSNy5Lv3kNVu01AiN4A49Nv7t-Ysfr1DBg@mail.gmail.com>
 <CA%2BhQ2%2BgwW6FOQS79xmWVLSWWHrZMFnhaUM98Kp6aDVaUePNfTA@mail.gmail.com>
 <CALG4x-UYBFsMttpZx1-c_wtVf5MST8%2B_t1psY2HQskTiOZDFLA@mail.gmail.com>
 <CA%2BhQ2%2Bi_qu7RouPW%2Bihfb5nL_1SQWMpFxTpoHfhaCvhtS8-EHQ@mail.gmail.com>
[-- Attachment #1 --]
Hi,

This is the second version of the multiqueue bpf patch. I think I have
addressed the points you raised in your previous mail.

Here is the change list for the patch:

- Drop the functions added to struct ifnet
  (if_get_[rt]xqueue_len/if_get_[rt]xqueue_affinity).
  The number of hardware queues and the queue affinity information may be
  useful for some applications, but they are not directly related to
  multiqueue bpf. I think we should discuss them separately.

- Use BITSET for the queue mask.
  A BITSET seems a better fit for the queue mask structure than a boolean
  array.

- Drop the tcpdump/libpcap changes.
  These should also be discussed separately.

- M_QUEUEID/IFCAP_QUEUEID
  M_QUEUEID is the mbuf flag indicating that the mbuf carries a hardware
  queue id.
  IFCAP_QUEUEID is the capability flag indicating that the driver is able to
  set the queue id on mbufs.

2013/7/3 Luigi Rizzo <rizzo@iet.unipi.it>

>
> On Tue, Jul 2, 2013 at 5:56 PM, Takuya ASADA <syuu@dokukino.com> wrote:
>
>> Hi,
>>
>>> Do you have an updated URL for the diffs ? The link below from your
>>> original message
>>> seems not working now (NXDOMAIN)
>>>
>>> http://www.dokukino.com/mq_bpf_20110813.diff
>>>
>>
>> Changes with recent head is on my repository:
>> http://svnweb.freebsd.org/base/user/syuu/mq_bpf/
>> And I attached a diff file on this mail.
>>
>>
> thanks for the diffs (the URL to the repo is useful too,
> but a URL to generate diffs is more convenient for reviewing changes).
>
> I believe it still needs a bit of work before being merged.
>
> My comments (in order of the patch):
>
> === ifnet.9 (and related code in if.c, sockio.h) ===
> - if_get_rxqueue_len()/if_get_txqueue_len() is not a good name,
>   as to me at least it suggests that it returns the size of the
>   individual queue, rather than the number of queues.
>
> - cpu affinity (in userspace) is a bitmask, whereas in the BSD kernel
>   we almost never use the term "affinity", and favour "cpuid" or "oncpu"
>   (i.e. an individual CPU id).
>   I think you should either rename if_get_txqueue_affinity(), or make
>   the return type a cpuset (which seems more sensible as the return
>   value is passed to userspace)
>
> === bpf.4 (and related code) ===
>
> - the ioctl() to attach/detach/report queues attached to a specific
>   bpf descriptor talk about "mask bit" but neither the internal nor
>   the external implementation deal with bits.
>   I'd rather document those ioctl as "attaching queue to file descriptor".
>
> - the BPF ioctl names are generally inconsistent (using either S or SET
>   and G or GET for the setter and getter methods).
>   But you should pick one of the patterns and stick with it,
>   not introduce a third variant (GT/ST).
>   Given we are in 2013 we might go for the long form GET and SET
>   so i suggest the following (spaces for clarity)
>
> +#define BIOC ENA QMASK _IO('B', 133)
> +#define BIOC DIS QMASK _IO('B', 134)
> +#define BIOC SET RXQMASK _IOWR('B', 135, uint32_t)
> +#define BIOC CLR RXQMASK _IOWR('B', 136, uint32_t)
> +#define BIOC GET RXQMASK _IOR('B', 137, uint32_t)
> +#define BIOC SET TXQMASK _IOWR('B', 138, uint32_t)
> +#define BIOC CLR TXQMASK _IOWR('B', 139, uint32_t)
> +#define BIOC GET TXQMASK _IOR('B', 140, uint32_t)
> +#define BIOC SET OTHERMASK _IO('B', 141)
> +#define BIOC CLR OTHERMASK _IO('B', 142)
> +#define BIOC GET OTHERMASK _IOR('B', 143, uint32_t)
>
>   Also related: the existing ioctls() use u_int as arguments, rather
>   than uint32_t. I personally prefer the uint32_t form, but you
>   should at least add a comment to indicate that the choice is
>   deliberate.
>
> === if.c ===
>
> - you have a KASSERT to report if ifp->if_get_*xqueue_affinity() is not
>   set, but i'd rather run the function only if is set, so you can
>   have a multiqueue interface which does not bind queues to specific cores
>   (which i am not sure is always a great idea; too many processes
>   statically bound to the same queue mean you lose opportunity to
>   parallelize work.)
>
> === mbuf.h ===
>
>   as mentioned earlier, the modification to struct mbuf should
>   be avoided if possible at all. It seems that you need just one
>   direction bit (which maybe is available already from the context)
>   and one queue identifier, which in the rx path, at least in your
>   implementation is always a replica of the 'flowid' field.
>   Can you see if perhaps the flowid field can be (ab)used on the
>   tx path as well ?
>
> === if.h ===
>
> - in if.h, can you use individual variables instead of arrays
>   for ifr_queue_affinity_index and friends ?
>   The macros to map the fields of ifr_ifru one
>   level up are a necessary evil,
>   but there is no point in using the arrays.
>
> - SIOCGIFTXQAFFINITY seems to use the receive function (copy&paste typo)
>   Also, this function is probably something that should be coordinated
>   with work on generic multiqueue support
>
> === bpf.c ===
>
> - in linux (and hopefully in FreeBSD at some point) the number of queues
>   can be changed at runtime.
>   So i suggest that you cache the current number of queues when
>   you allocate the arrays (qm_*xq_qmask[] ) rather than invoking
>   ifp->if_get_*xqueue_len() everytime you need to do a boundary check.
>   This will save us from all sort of problems later.
>
> - in terms of code, the six BIOC*XQMASK are very similar, you are probably
>   better off having one single case in the switch
>
> - can you add some comments in the code for the chunk at
>   @@ -2117,6 +2391,42 @@
>   I do not completely understand why you are returning if the *queue tag
>   in the mbuf is out of range (my impression is that you should
>   just continue, or if you think the packet is incorrect it should
>   be filtered out before entering the LIST_FOREACH() ).
>   Secondly, you should use the cached value of *queue_len
>
> cheers
> luigi
>
> --
> -----------------------------------------+-------------------------------
>  Prof. Luigi RIZZO, rizzo@iet.unipi.it  . Dip. di Ing. dell'Informazione
>  http://www.iet.unipi.it/~luigi/        . Universita` di Pisa
>  TEL      +39-050-2211611               . via Diotisalvi 2
>  Mobile   +39-338-6809875               . 56122 PISA (Italy)
> -----------------------------------------+-------------------------------
>

[-- Attachment #2 --]
Index: share/man/man4/bpf.4
===================================================================
--- share/man/man4/bpf.4 (.../head) (revision 255180)
+++ share/man/man4/bpf.4 (.../user/syuu/mq_bpf) (revision 255200)
@@ -631,6 +631,36 @@
 .Vt bzh_kernel_gen
 against
 .Vt bzh_user_gen .
+.It Dv BIOCQMASKENABLE
+Enables the multiqueue filter on the descriptor.
+
+.It Dv BIOCQMASKDISABLE
+Disables the multiqueue filter on the descriptor.
+
+.It Dv BIOCGRXQMASK
+.Pq Li struct bpf_qmask_bits
+Get RX queue mask bits.
+
+.It Dv BIOCSRXQMASK
+.Pq Li struct bpf_qmask_bits
+Set RX queue mask bits.
+
+.It Dv BIOCGTXQMASK
+.Pq Li struct bpf_qmask_bits
+Get TX queue mask bits.
+
+.It Dv BIOCSTXQMASK
+.Pq Li struct bpf_qmask_bits
+Set TX queue mask bits.
+
+.It Dv BIOCGNOQMASK
+.Pq Li int
+Get the mask bit for packets that are not tied to any queue.
+
+.It Dv BIOCSNOQMASK
+.Pq Li int
+Set the mask bit for packets that are not tied to any queue.
+
 .El
 .Sh BPF HEADER
 One of the following structures is prepended to each packet returned by
@@ -1037,6 +1067,23 @@
 	BPF_STMT(BPF_RET+BPF_K, 0),
 };
 .Ed
+.Sh MULTIQUEUE SUPPORT
+Multiqueue network interface support provides interfaces for
+multithreaded packet processing using bpf.
+
+Normal bpf receives packets from a specified interface; multiqueue support
+can receive packets from a specified hardware queue.
+
+This distributes the bpf workload across multiple threads and reduces lock
+contention on bpf.
+
+To make your program multithreaded, open a bpf descriptor on each
+thread, enable multiqueue support with the BIOCQMASKENABLE ioctl, and set the queue mask with the BIOCSRXQMASK / BIOCSTXQMASK / BIOCSNOQMASK ioctls.
+
+Queue length and queue affinity information may be useful to optimize setting
+the queue mask on a bpf descriptor, see
+.Xr netintro 4 .
+
 .Sh SEE ALSO
 .Xr tcpdump 1 ,
 .Xr ioctl 2 ,
Index: sys/dev/ixgbe/ixgbe.c
===================================================================
--- sys/dev/ixgbe/ixgbe.c (.../head) (revision 255180)
+++ sys/dev/ixgbe/ixgbe.c (.../user/syuu/mq_bpf) (revision 255200)
@@ -751,6 +751,11 @@
 			IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
 			break;
 		}
+
+		m_head->m_flags |= M_QUEUEID;
+		m_head->m_pkthdr.queueid = txr->me;
+		m_head->m_pkthdr.queuetype = QUEUETYPE_TX;
+
 		/* Send a copy of the frame to the BPF listener */
 		ETHER_BPF_MTAP(ifp, m_head);
 
@@ -849,6 +854,11 @@
 		drbr_advance(ifp, txr->br);
 #endif
 		enqueued++;
+
+		next->m_flags |= M_QUEUEID;
+		next->m_pkthdr.queueid = txr->me;
+		next->m_pkthdr.queuetype = QUEUETYPE_TX;
+
 		/* Send a copy of the frame to the BPF listener */
 		ETHER_BPF_MTAP(ifp, next);
 		if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0)
@@ -2637,6 +2647,7 @@
 	ifp->if_capabilities |= IFCAP_VLAN_HWTAGGING
 			     |  IFCAP_VLAN_HWTSO
 			     |  IFCAP_VLAN_MTU;
+	ifp->if_capabilities |= IFCAP_QUEUEID;
 	ifp->if_capenable = ifp->if_capabilities;
 
 	/*
@@ -4546,8 +4557,10 @@
 			ixgbe_rx_checksum(staterr, sendmp, ptype);
 #if __FreeBSD_version >= 800000
 			sendmp->m_pkthdr.flowid = que->msix;
-			sendmp->m_flags |= M_FLOWID;
+			sendmp->m_flags |= (M_FLOWID | M_QUEUEID);
 #endif
+			sendmp->m_pkthdr.queueid = que->msix;
+			sendmp->m_pkthdr.queuetype = QUEUETYPE_RX;
 		}
 next_desc:
 		bus_dmamap_sync(rxr->rxdma.dma_tag, rxr->rxdma.dma_map,
Index: sys/dev/e1000/if_igb.c
===================================================================
--- sys/dev/e1000/if_igb.c (.../head) (revision 255180)
+++ sys/dev/e1000/if_igb.c (.../user/syuu/mq_bpf) (revision 255200)
@@ -920,6 +920,10 @@
 			break;
 		}
 
+		m_head->m_flags |= M_QUEUEID;
+		m_head->m_pkthdr.queueid = txr->me;
+		m_head->m_pkthdr.queuetype = QUEUETYPE_TX;
+
 		/* Send a copy of the frame to the BPF listener */
 		ETHER_BPF_MTAP(ifp, m_head);
 
@@ -1019,6 +1023,9 @@
 		ifp->if_obytes += next->m_pkthdr.len;
 		if (next->m_flags & M_MCAST)
 			ifp->if_omcasts++;
+		next->m_flags |= M_QUEUEID;
+		next->m_pkthdr.queueid = txr->me;
+		next->m_pkthdr.queuetype = QUEUETYPE_TX;
 		ETHER_BPF_MTAP(ifp, next);
 		if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0)
 			break;
@@ -3169,6 +3176,7 @@
 	** enable this and get full hardware tag filtering.
 	*/
 	ifp->if_capabilities |= IFCAP_VLAN_HWFILTER;
+	ifp->if_capabilities |= IFCAP_QUEUEID;
 
 	/*
 	 * Specify the media types supported by this adapter and register
@@ -4891,8 +4899,11 @@
 			}
 #ifndef IGB_LEGACY_TX
 			rxr->fmp->m_pkthdr.flowid = que->msix;
-			rxr->fmp->m_flags |= M_FLOWID;
+			rxr->fmp->m_flags |= (M_FLOWID | M_QUEUEID);
 #endif
+			rxr->fmp->m_pkthdr.queueid = que->msix;
+			rxr->fmp->m_pkthdr.queuetype = QUEUETYPE_RX;
+
 			sendmp = rxr->fmp;
 			/* Make sure to set M_PKTHDR. */
 			sendmp->m_flags |= M_PKTHDR;
Index: sys/dev/mxge/if_mxge.c
===================================================================
--- sys/dev/mxge/if_mxge.c (.../head) (revision 255180)
+++ sys/dev/mxge/if_mxge.c (.../user/syuu/mq_bpf) (revision 255200)
@@ -2272,6 +2272,10 @@
 		if (m == NULL) {
 			return;
 		}
+		m->m_flags |= M_QUEUEID;
+		m->m_pkthdr.queueid = (ss - sc->ss);
+		m->m_pkthdr.queuetype = QUEUETYPE_TX;
+
 		/* let BPF see it */
 		BPF_MTAP(ifp, m);
 
@@ -2306,6 +2310,10 @@
 
 	if (!drbr_needs_enqueue(ifp, tx->br) &&
 	    ((tx->mask - (tx->req - tx->done)) > tx->max_desc)) {
+		m->m_flags |= M_QUEUEID;
+		m->m_pkthdr.queueid = (ss - sc->ss);
+		m->m_pkthdr.queuetype = QUEUETYPE_TX;
+
 		/* let BPF see it */
 		BPF_MTAP(ifp, m);
 		/* give it to the nic */
@@ -2718,7 +2726,9 @@
 	/* flowid only valid if RSS hashing is enabled */
 	if (sc->num_slices > 1) {
 		m->m_pkthdr.flowid = (ss - sc->ss);
-		m->m_flags |= M_FLOWID;
+		m->m_flags |= (M_FLOWID | M_QUEUEID);
+		m->m_pkthdr.queueid = (ss - sc->ss);
+		m->m_pkthdr.queuetype = QUEUETYPE_RX;
 	}
 	/* pass the frame up the stack */
 	(*ifp->if_input)(ifp, m);
@@ -4893,6 +4903,7 @@
 #if defined(INET) || defined(INET6)
 	ifp->if_capabilities |= IFCAP_LRO;
 #endif
+	ifp->if_capabilities |= IFCAP_QUEUEID;
 
 #ifdef MXGE_NEW_VLAN_API
 	ifp->if_capabilities |= IFCAP_VLAN_HWTAGGING | IFCAP_VLAN_HWCSUM;
Index: sys/net/bpfq.h
===================================================================
--- sys/net/bpfq.h (.../head) (revision 0)
+++ sys/net/bpfq.h (.../user/syuu/mq_bpf) (revision 255200)
@@ -0,0 +1,32 @@
+#ifndef _NET_BPFQ_H_
+#define _NET_BPFQ_H_
+
+#include <sys/param.h>
+#include <sys/bitset.h>
+#include <sys/_bitset.h>
+
+#define BPFQ_BITS			256
+BITSET_DEFINE(bpf_qmask_bits, BPFQ_BITS);
+
+#define BPFQ_CLR(n, p)			BIT_CLR(BPFQ_BITS, n, p)
+#define BPFQ_COPY(f, t)			BIT_COPY(BPFQ_BITS, f, t)
+#define BPFQ_ISSET(n, p)		BIT_ISSET(BPFQ_BITS, n, p)
+#define BPFQ_SET(n, p)			BIT_SET(BPFQ_BITS, n, p)
+#define BPFQ_ZERO(p)			BIT_ZERO(BPFQ_BITS, p)
+#define BPFQ_FILL(p)			BIT_FILL(BPFQ_BITS, p)
+#define BPFQ_SETOF(n, p)		BIT_SETOF(BPFQ_BITS, n, p)
+#define BPFQ_EMPTY(p)			BIT_EMPTY(BPFQ_BITS, p)
+#define BPFQ_ISFULLSET(p)		BIT_ISFULLSET(BPFQ_BITS, p)
+#define BPFQ_SUBSET(p, c)		BIT_SUBSET(BPFQ_BITS, p, c)
+#define BPFQ_OVERLAP(p, c)		BIT_OVERLAP(BPFQ_BITS, p, c)
+#define BPFQ_CMP(p, c)			BIT_CMP(BPFQ_BITS, p, c)
+#define BPFQ_OR(d, s)			BIT_OR(BPFQ_BITS, d, s)
+#define BPFQ_AND(d, s)			BIT_AND(BPFQ_BITS, d, s)
+#define BPFQ_NAND(d, s)			BIT_NAND(BPFQ_BITS, d, s)
+#define BPFQ_CLR_ATOMIC(n, p)		BIT_CLR_ATOMIC(BPFQ_BITS, n, p)
+#define BPFQ_SET_ATOMIC(n, p)		BIT_SET_ATOMIC(BPFQ_BITS, n, p)
+#define BPFQ_OR_ATOMIC(d, s)		BIT_OR_ATOMIC(BPFQ_BITS, d, s)
+#define BPFQ_COPY_STORE_REL(f, t)	BIT_COPY_STORE_REL(BPFQ_BITS, f, t)
+#define BPFQ_FFS(p)			BIT_FFS(BPFQ_BITS, p)
+
+#endif
Index: sys/net/if.h
===================================================================
--- sys/net/if.h (.../head) (revision 255180)
+++ sys/net/if.h (.../user/syuu/mq_bpf) (revision 255200)
@@ -231,6 +231,7 @@
 #define	IFCAP_NETMAP		0x100000 /* netmap mode supported/enabled */
 #define	IFCAP_RXCSUM_IPV6	0x200000 /* can offload checksum on IPv6 RX */
 #define	IFCAP_TXCSUM_IPV6	0x400000 /* can offload checksum on IPv6 TX */
+#define	IFCAP_QUEUEID		0x800000 /* driver supports queueid notify */
 
 #define IFCAP_HWCSUM_IPV6	(IFCAP_RXCSUM_IPV6 | IFCAP_TXCSUM_IPV6)
 
Index: sys/net/bpf.c
===================================================================
--- sys/net/bpf.c (.../head) (revision 255180)
+++ sys/net/bpf.c (.../user/syuu/mq_bpf) (revision 255200)
@@ -819,6 +819,9 @@
 	size = d->bd_bufsize;
 	bpf_buffer_ioctl_sblen(d, &size);
 
+	d->bd_qmask.qm_enabled = FALSE;
+	BPFQ_LOCK_INIT(&d->bd_qmask);
+
 	return (0);
 }
 
@@ -1697,7 +1700,191 @@
 	case BIOCROTZBUF:
 		error = bpf_ioctl_rotzbuf(td, d, (struct bpf_zbuf *)addr);
 		break;
+
+	case BIOCQMASKENABLE:
+		{
+			struct ifnet *ifp;
+
+			if (d->bd_bif == NULL) {
+				/*
+				 * No interface attached yet.
+				 */
+				error = EINVAL;
+				break;
+			}
+			BPFQ_WLOCK(&d->bd_qmask);
+			if (d->bd_qmask.qm_enabled) {
+				BPFQ_WUNLOCK(&d->bd_qmask);
+				error = EINVAL;
+				break;
+			}
+			ifp = d->bd_bif->bif_ifp;
+			if (!(ifp->if_capabilities & IFCAP_QUEUEID)) {
+				BPFQ_WUNLOCK(&d->bd_qmask);
+				error = EINVAL;
+				break;
+			}
+			BPFQ_ZERO(&d->bd_qmask.qm_rxqmask);
+			BPFQ_ZERO(&d->bd_qmask.qm_txqmask);
+			d->bd_qmask.qm_noqmask = FALSE;
+			d->bd_qmask.qm_enabled = TRUE;
+			BPFQ_WUNLOCK(&d->bd_qmask);
+			break;
+		}
+
+	case BIOCQMASKDISABLE:
+		{
+			if (d->bd_bif == NULL) {
+				/*
+				 * No interface attached yet.
+				 */
+				error = EINVAL;
+				break;
+			}
+			BPFQ_WLOCK(&d->bd_qmask);
+			if (!d->bd_qmask.qm_enabled) {
+				BPFQ_WUNLOCK(&d->bd_qmask);
+				error = EINVAL;
+				break;
+			}
+			d->bd_qmask.qm_enabled = FALSE;
+			BPFQ_WUNLOCK(&d->bd_qmask);
+			break;
+		}
+
+	case BIOCGRXQMASK:
+		{
+			struct bpf_qmask_bits *qmask = (struct bpf_qmask_bits *)addr;
+
+			if (d->bd_bif == NULL) {
+				/*
+				 * No interface attached yet.
+				 */
+				error = EINVAL;
+				break;
+			}
+			BPFQ_WLOCK(&d->bd_qmask);
+			if (!d->bd_qmask.qm_enabled) {
+				BPFQ_WUNLOCK(&d->bd_qmask);
+				error = EINVAL;
+				break;
+			}
+			BPFQ_COPY(&d->bd_qmask.qm_rxqmask, qmask);
+			BPFQ_WUNLOCK(&d->bd_qmask);
+			break;
+
+		}
+
+	case BIOCSRXQMASK:
+		{
+			struct bpf_qmask_bits *qmask = (struct bpf_qmask_bits *)addr;
+
+			if (d->bd_bif == NULL) {
+				/*
+				 * No interface attached yet.
+				 */
+				error = EINVAL;
+				break;
+			}
+			BPFQ_WLOCK(&d->bd_qmask);
+			if (!d->bd_qmask.qm_enabled) {
+				BPFQ_WUNLOCK(&d->bd_qmask);
+				error = EINVAL;
+				break;
+			}
+			BPFQ_COPY(qmask, &d->bd_qmask.qm_rxqmask);
+			BPFQ_WUNLOCK(&d->bd_qmask);
+			break;
+		}
+
+	case BIOCGTXQMASK:
+		{
+			struct bpf_qmask_bits *qmask = (struct bpf_qmask_bits *)addr;
+
+			if (d->bd_bif == NULL) {
+				/*
+				 * No interface attached yet.
+				 */
+				error = EINVAL;
+				break;
+			}
+			BPFQ_WLOCK(&d->bd_qmask);
+			if (!d->bd_qmask.qm_enabled) {
+				BPFQ_WUNLOCK(&d->bd_qmask);
+				error = EINVAL;
+				break;
+			}
+			BPFQ_COPY(&d->bd_qmask.qm_txqmask, qmask);
+			BPFQ_WUNLOCK(&d->bd_qmask);
+			break;
+		}
+
+	case BIOCSTXQMASK:
+		{
+			struct bpf_qmask_bits *qmask = (struct bpf_qmask_bits *)addr;
+
+			if (d->bd_bif == NULL) {
+				/*
+				 * No interface attached yet.
+				 */
+				error = EINVAL;
+				break;
+			}
+			BPFQ_WLOCK(&d->bd_qmask);
+			if (!d->bd_qmask.qm_enabled) {
+				BPFQ_WUNLOCK(&d->bd_qmask);
+				error = EINVAL;
+				break;
+			}
+			BPFQ_COPY(qmask, &d->bd_qmask.qm_txqmask);
+			BPFQ_WUNLOCK(&d->bd_qmask);
+			break;
+		}
+
+	case BIOCGNOQMASK:
+		{
+			boolean_t *noqmask = (boolean_t *)addr;
+			if (d->bd_bif == NULL) {
+				/*
+				 * No interface attached yet.
+				 */
+				error = EINVAL;
+				break;
+			}
+			BPFQ_WLOCK(&d->bd_qmask);
+			if (!d->bd_qmask.qm_enabled) {
+				BPFQ_WUNLOCK(&d->bd_qmask);
+				error = EINVAL;
+				break;
+			}
+			*noqmask = d->bd_qmask.qm_noqmask;
+			BPFQ_WUNLOCK(&d->bd_qmask);
+			break;
+		}
+
+	case BIOCSNOQMASK:
+		{
+			boolean_t *noqmask = (boolean_t *)addr;
+
+			if (d->bd_bif == NULL) {
+				/*
+				 * No interface attached yet.
+				 */
+				error = EINVAL;
+				break;
+			}
+			BPFQ_WLOCK(&d->bd_qmask);
+			if (!d->bd_qmask.qm_enabled) {
+				BPFQ_WUNLOCK(&d->bd_qmask);
+				error = EINVAL;
+				break;
+			}
+			d->bd_qmask.qm_noqmask = *noqmask;
+			BPFQ_WUNLOCK(&d->bd_qmask);
+			break;
+		}
 	}
+
 	CURVNET_RESTORE();
 	return (error);
 }
@@ -2043,6 +2230,15 @@
 	BPFIF_RLOCK(bp);
 
 	LIST_FOREACH(d, &bp->bif_dlist, bd_next) {
+		BPFQ_RLOCK(&d->bd_qmask);
+		if (d->bd_qmask.qm_enabled) {
+			if (!d->bd_qmask.qm_noqmask) {
+				BPFQ_RUNLOCK(&d->bd_qmask);
+				continue;
+			}
+		}
+		BPFQ_RUNLOCK(&d->bd_qmask);
+
 		/*
 		 * We are not using any locks for d here because:
 		 * 1) any filter change is protected by interface
@@ -2117,6 +2313,40 @@
 	BPFIF_RLOCK(bp);
 
 	LIST_FOREACH(d, &bp->bif_dlist, bd_next) {
+		BPFQ_RLOCK(&d->bd_qmask);
+		if (d->bd_qmask.qm_enabled) {
+			M_ASSERTPKTHDR(m);
+			if (m->m_flags & M_QUEUEID) {
+				switch (m->m_pkthdr.queuetype) {
+				case QUEUETYPE_RX:
+					if (!BPFQ_ISSET(m->m_pkthdr.queueid,
+					    &d->bd_qmask.qm_rxqmask)) {
+						BPFQ_RUNLOCK(&d->bd_qmask);
+						continue;
+					}
+					break;
+				case QUEUETYPE_TX:
+					if (!BPFQ_ISSET(m->m_pkthdr.queueid,
+					    &d->bd_qmask.qm_txqmask)) {
+						BPFQ_RUNLOCK(&d->bd_qmask);
+						continue;
+					}
+					break;
+				default:
+					if (!d->bd_qmask.qm_noqmask) {
+						BPFQ_RUNLOCK(&d->bd_qmask);
+						continue;
+					}
+				}
+			} else {
+				if (!d->bd_qmask.qm_noqmask) {
+					BPFQ_RUNLOCK(&d->bd_qmask);
+					continue;
+				}
+			}
+		}
+		BPFQ_RUNLOCK(&d->bd_qmask);
+
 		if (BPF_CHECK_DIRECTION(d, m->m_pkthdr.rcvif, bp->bif_ifp))
 			continue;
 		++d->bd_rcount;
@@ -2180,6 +2410,40 @@
 	BPFIF_RLOCK(bp);
 
 	LIST_FOREACH(d, &bp->bif_dlist, bd_next) {
+		BPFQ_RLOCK(&d->bd_qmask);
+		if (d->bd_qmask.qm_enabled) {
+			M_ASSERTPKTHDR(m);
+			if (m->m_flags & M_QUEUEID) {
+				switch (m->m_pkthdr.queuetype) {
+				case QUEUETYPE_RX:
+					if (!BPFQ_ISSET(m->m_pkthdr.queueid,
+					    &d->bd_qmask.qm_rxqmask)) {
+						BPFQ_RUNLOCK(&d->bd_qmask);
+						continue;
+					}
+					break;
+				case QUEUETYPE_TX:
+					if (!BPFQ_ISSET(m->m_pkthdr.queueid,
+					    &d->bd_qmask.qm_txqmask)) {
+						BPFQ_RUNLOCK(&d->bd_qmask);
+						continue;
+					}
+					break;
+				default:
+					if (!d->bd_qmask.qm_noqmask) {
+						BPFQ_RUNLOCK(&d->bd_qmask);
+						continue;
+					}
+				}
+			} else {
+				if (!d->bd_qmask.qm_noqmask) {
+					BPFQ_RUNLOCK(&d->bd_qmask);
+					continue;
+				}
+			}
+		}
+		BPFQ_RUNLOCK(&d->bd_qmask);
+
 		if (BPF_CHECK_DIRECTION(d, m->m_pkthdr.rcvif, bp->bif_ifp))
 			continue;
 		++d->bd_rcount;
Index: sys/net/bpfdesc.h
===================================================================
--- sys/net/bpfdesc.h (.../head) (revision 255180)
+++ sys/net/bpfdesc.h (.../user/syuu/mq_bpf) (revision 255200)
@@ -44,7 +44,23 @@
 #include <sys/queue.h>
 #include <sys/conf.h>
 #include <net/if.h>
+#include <net/bpfq.h>
 
+struct bpf_qmask {
+	int			qm_enabled;
+	struct bpf_qmask_bits	qm_rxqmask;
+	struct bpf_qmask_bits	qm_txqmask;
+	int			qm_noqmask;
+	struct rwlock		qm_lock;
+};
+
+#define BPFQ_LOCK_INIT(qm)	rw_init(&(qm)->qm_lock, "qmask lock")
+#define BPFQ_LOCK_DESTROY(qm)	rw_destroy(&(qm)->qm_lock)
+#define BPFQ_RLOCK(qm)		rw_rlock(&(qm)->qm_lock)
+#define BPFQ_RUNLOCK(qm)	rw_runlock(&(qm)->qm_lock)
+#define BPFQ_WLOCK(qm)		rw_wlock(&(qm)->qm_lock)
+#define BPFQ_WUNLOCK(qm)	rw_wunlock(&(qm)->qm_lock)
+
 /*
  * Descriptor associated with each open bpf file.
  */
@@ -101,6 +117,7 @@
 	u_int64_t	bd_wdcount;	/* number of packets dropped during a write */
 	u_int64_t	bd_zcopy;	/* number of zero copy operations */
 	u_char		bd_compat32;	/* 32-bit stream on LP64 system */
+	struct bpf_qmask bd_qmask;
 };
 
 /* Values for bd_state */
Index: sys/net/bpf.h
===================================================================
--- sys/net/bpf.h (.../head) (revision 255180)
+++ sys/net/bpf.h (.../user/syuu/mq_bpf) (revision 255200)
@@ -40,6 +40,8 @@
 #ifndef _NET_BPF_H_
 #define _NET_BPF_H_
 
+#include <net/bpfq.h>
+
 /* BSD style release date */
 #define	BPF_RELEASE 199606
 
@@ -147,6 +149,14 @@
 #define	BIOCSETFNR	_IOW('B', 130, struct bpf_program)
 #define	BIOCGTSTAMP	_IOR('B', 131, u_int)
 #define	BIOCSTSTAMP	_IOW('B', 132, u_int)
+#define	BIOCQMASKENABLE	_IO('B', 133)
+#define	BIOCQMASKDISABLE	_IO('B', 134)
+#define	BIOCGRXQMASK	_IOR('B', 135, struct bpf_qmask_bits)
+#define	BIOCSRXQMASK	_IOW('B', 136, struct bpf_qmask_bits)
+#define	BIOCGTXQMASK	_IOR('B', 137, struct bpf_qmask_bits)
+#define	BIOCSTXQMASK	_IOW('B', 138, struct bpf_qmask_bits)
+#define	BIOCGNOQMASK	_IOR('B', 139, int)
+#define	BIOCSNOQMASK	_IOW('B', 140, int)
 
 /* Obsolete */
 #define	BIOCGSEESENT	BIOCGDIRECTION
Index: sys/sys/mbuf.h
===================================================================
--- sys/sys/mbuf.h (.../head) (revision 255180)
+++ sys/sys/mbuf.h (.../user/syuu/mq_bpf) (revision 255200)
@@ -114,6 +114,11 @@
 	void	(*m_tag_free)(struct m_tag *);
 };
 
+enum queuetype {
+	QUEUETYPE_RX,
+	QUEUETYPE_TX
+};
+
 /*
  * Record/packet header in first mbuf of chain; valid only if M_PKTHDR is set.
  * Size ILP32: 48
@@ -126,6 +131,8 @@
 
 	/* Layer crossing persistent information. */
 	uint32_t	 flowid;	/* packet's 4-tuple system */
+	uint32_t	 queueid;	/* hw queue id */
+	uint32_t	 queuetype;	/* hw queue type */
 	uint64_t	 csum_flags;	/* checksum and offload features */
 	uint16_t	 fibnum;	/* this packet should use this fib */
 	uint8_t		 cosqos;	/* class/quality of service */
@@ -223,6 +230,7 @@
 #define	M_VLANTAG	0x00000080 /* ether_vtag is valid */
 #define	M_FLOWID	0x00000100 /* deprecated: flowid is valid */
 #define	M_NOFREE	0x00000200 /* do not free mbuf, embedded in cluster */
+#define	M_QUEUEID	0x00000400 /* packet has hw queue id */
 
 #define	M_PROTO1	0x00001000 /* protocol-specific */
 #define	M_PROTO2	0x00002000 /* protocol-specific */
Index: sbin/ifconfig/ifconfig.c
===================================================================
--- sbin/ifconfig/ifconfig.c (.../head) (revision 255180)
+++ sbin/ifconfig/ifconfig.c (.../user/syuu/mq_bpf) (revision 255200)
@@ -917,7 +917,7 @@
 "\020\1RXCSUM\2TXCSUM\3NETCONS\4VLAN_MTU\5VLAN_HWTAGGING\6JUMBO_MTU\7POLLING" \
 "\10VLAN_HWCSUM\11TSO4\12TSO6\13LRO\14WOL_UCAST\15WOL_MCAST\16WOL_MAGIC" \
 "\17TOE4\20TOE6\21VLAN_HWFILTER\23VLAN_HWTSO\24LINKSTATE\25NETMAP" \
-"\26RXCSUM_IPV6\27TXCSUM_IPV6"
+"\26RXCSUM_IPV6\27TXCSUM_IPV6\30QUEUEID"
 
 /*
  * Print the status of the interface.  If an address family was
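
For reviewers who want to check the filtering rules without building a kernel, here is a minimal user-space sketch of the per-descriptor decision that the bpf.c hunks make for each packet. It is not part of the patch: `qmask_model`, `pkt_model`, `qbits_set`, `qbits_isset` and `qmask_accepts` are hypothetical names, the bitset is a plain array rather than the kernel's BITSET, locking is omitted, and it assumes that TX packets are meant to be checked against the TX mask.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define QMASK_BITS	256	/* mirrors BPFQ_BITS in the patch */
#define WORD_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define QMASK_WORDS	((QMASK_BITS + WORD_BITS - 1) / WORD_BITS)

enum queuetype { QUEUETYPE_RX, QUEUETYPE_TX };

/* Hypothetical stand-in for struct bpf_qmask_bits. */
struct qbits {
	unsigned long w[QMASK_WORDS];
};

/* Hypothetical stand-in for struct bpf_qmask, without the rwlock. */
struct qmask_model {
	bool		enabled;	/* qm_enabled */
	bool		noqmask;	/* qm_noqmask */
	struct qbits	rx;		/* qm_rxqmask */
	struct qbits	tx;		/* qm_txqmask */
};

/* The three mbuf fields the filter looks at. */
struct pkt_model {
	bool		has_queueid;	/* M_QUEUEID set in m_flags */
	enum queuetype	type;		/* m_pkthdr.queuetype */
	uint32_t	queueid;	/* m_pkthdr.queueid */
};

static void
qbits_set(struct qbits *b, uint32_t n)
{
	assert(n < QMASK_BITS);
	b->w[n / WORD_BITS] |= 1UL << (n % WORD_BITS);
}

static bool
qbits_isset(const struct qbits *b, uint32_t n)
{
	/* Out-of-range ids never match (an assumption of this sketch). */
	if (n >= QMASK_BITS)
		return (false);
	return ((b->w[n / WORD_BITS] >> (n % WORD_BITS)) & 1UL);
}

/* Returns true if the descriptor should see the packet. */
static bool
qmask_accepts(const struct qmask_model *qm, const struct pkt_model *p)
{
	if (!qm->enabled)
		return (true);		/* filter off: deliver everything */
	if (!p->has_queueid)
		return (qm->noqmask);	/* packet not tied to any queue */
	switch (p->type) {
	case QUEUETYPE_RX:
		return (qbits_isset(&qm->rx, p->queueid));
	case QUEUETYPE_TX:
		return (qbits_isset(&qm->tx, p->queueid));
	default:
		return (qm->noqmask);
	}
}
```

With the filter enabled and only RX queue 3 set, a packet tagged RX/3 is delivered while TX/3 and untagged packets are dropped; setting `noqmask` lets the untagged packets through again.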
