Date: Wed, 1 Jun 2011 18:02:00 +0900
From: Takuya ASADA <syuu@dokukino.com>
To: George Neville-Neil <gnn@freebsd.org>
Cc: "Robert N. M. Watson" <rwatson@freebsd.org>, "soc-status@freebsd.org" <soc-status@freebsd.org>, Kazuya Goda <gockzy@gmail.com>
Subject: Re: Weekly status report (27th May)
Message-ID: <5054184174934880962@unknownmsgid>
In-Reply-To: <8259CBF7-B2E6-49C6-A7C4-6682ECBDBB9F@freebsd.org>
References: <BANLkTim=zeRhwGajksbX2fBY9snkcj1h0g@mail.gmail.com> <8259CBF7-B2E6-49C6-A7C4-6682ECBDBB9F@freebsd.org>
On 2011/05/31, at 22:52, George Neville-Neil <gnn@freebsd.org> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>
> On May 31, 2011, at 07:07 , Takuya ASADA wrote:
>
>> Sorry for the delay in sending the weekly status report.
>>
>> * Overview
>> Here is the progress of the project:
>> - Implement set affinity ioctl on BPF
>>     Experimental code is implemented and works
>> - Implement affinity support on bpf_tap/bpf_mtap/bpf_mtap2
>>     Experimental code is implemented and works
>> - Implement sample application
>>     Quick hack for tcpdump/libpcap, works
>> - Implement multi-queue tap driver
>>     Experimental code is implemented, not tested
>> - Implement interface to deliver queue information on network device driver
>>     Partially implemented on igb(4), not tested
>> - Reduce lock granularity on bpf_tap/bpf_mtap/bpf_mtap2
>>     Not yet
>> - Implement test case
>>     Not yet
>> - Update man document, write description of sample code
>>     Not yet
>>
>> * Detail
>> On an Ethernet card, bpf_mtap is called whenever RX/TX is performed.
>> If the card supports multiqueue, every packet passing through bpf_mtap
>> belongs to either an RX queue id or a TX queue id.
>> To handle this, I defined new members in the mbuf pkthdr.
>>
>> In the if_start function of igb(4), I added the following lines:
>>   m->m_pkthdr.rxqid = (uint32_t)-1;
>>   m->m_pkthdr.txqid = [tx queue id];
>> And in the receive function:
>>   m->m_pkthdr.rxqid = [rx queue id];
>>   m->m_pkthdr.txqid = (uint32_t)-1;
>>
>> Then I defined the following members in the bpf descriptor:
>>   d->bd_qmask.qm_enabled
>>   d->bd_qmask.qm_rxq_mask[]
>>   d->bd_qmask.qm_txq_mask[]
>>
>> Since the sizes of qm_rxq_mask[] and qm_txq_mask[] may differ from card
>> to card, we need to pass the number of queues from the driver to bpf and
>> allocate the arrays with that size.
>> I added them to struct ifnet:
>>   d->bd_bif->bif_ifp->if_rxq_num
>>   d->bd_bif->bif_ifp->if_txq_num
>>
>> Now we can filter out unwanted packets in bpf_mtap like this:
>>
>>   LIST_FOREACH(d, &bp->bif_dlist, bd_next) {
>>       if (d->bd_qmask.qm_enabled) {
>>           if (m->m_pkthdr.rxqid != (uint32_t)-1 &&
>>               !d->bd_qmask.qm_rxq_mask[m->m_pkthdr.rxqid])
>>               continue;
>>           if (m->m_pkthdr.txqid != (uint32_t)-1 &&
>>               !d->bd_qmask.qm_txq_mask[m->m_pkthdr.txqid])
>>               continue;
>>       }
>> d->bd_qmask.qm_enabled should be FALSE by default to keep compatibility
>> with existing applications.
>>
>> And here are the ioctls to set/get the queue mask:
>>   #define BIOCENAQMASK    _IO('B', 137)
>>     This does d->bd_qmask.qm_enabled = TRUE
>>   #define BIOCDISQMASK    _IO('B', 138)
>>     This does d->bd_qmask.qm_enabled = FALSE
>>   #define BIOCRXQLEN      _IOR('B', 133, int)
>>     Returns ifp->if_rxq_num
>>   #define BIOCTXQLEN      _IOR('B', 134, int)
>>     Returns ifp->if_txq_num
>>   #define BIOCSTRXQMASK   _IOWR('B', 139, uint32_t)
>>     This does d->bd_qmask.qm_rxq_mask[*addr] = TRUE
>>   #define BIOCGTRXQMASK   _IOR('B', 140, uint32_t)
>>     Returns d->bd_qmask.qm_rxq_mask[*addr]
>>   /* XXX: We should have rxq_mask[*addr] = FALSE ioctl too */
>>   #define BIOCSTTXQMASK   _IOWR('B', 141, uint32_t)
>>     This does d->bd_qmask.qm_txq_mask[*addr] = TRUE
>>   /* XXX: We should have txq_mask[*addr] = FALSE ioctl too */
>>   #define BIOCGTTXQMASK   _IOR('B', 142, uint32_t)
>>     Returns d->bd_qmask.qm_txq_mask[*addr]
>>
>> However, packets that come through bpf_tap have no mbuf, so we are not
>> able to determine a queue id for them.
>> So I added d->bd_qmask.qm_other_mask and BIOSTOTHERMASK/BIOGTOTHERMASK for them.
>> If d->bd_qmask.qm_enabled && !d->bd_qmask.qm_other_mask, all packets
>> through bpf_tap will be ignored.
>>
>> If we only care about the CPU affinity of packet / thread (= bpf
>> descriptor), checking PCPU_GET(cpuid) is enough.
>> But if we want to take queue affinity into account, we probably need
>> the structures described above.
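
The queue-mask check described in the Detail section can be tried outside the kernel. What follows is only a minimal userland sketch, not the project's code: struct pkt_qinfo, struct qmask and the qmask_* helpers are hypothetical stand-ins for the proposed m_pkthdr fields and d->bd_qmask, and the bounds check against the queue count is an added assumption that the report does not mention.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define QID_NONE ((uint32_t)-1)	/* "no queue" marker, as in the report */

/* Hypothetical stand-in for the proposed m_pkthdr.rxqid / txqid fields. */
struct pkt_qinfo {
	uint32_t rxqid;
	uint32_t txqid;
};

/* Hypothetical stand-in for d->bd_qmask plus the per-interface sizes. */
struct qmask {
	bool	 qm_enabled;
	bool	 qm_other_mask;	/* governs packets that carry no queue id */
	bool	*qm_rxq_mask;
	bool	*qm_txq_mask;
	uint32_t rxq_num;	/* would come from ifp->if_rxq_num */
	uint32_t txq_num;	/* would come from ifp->if_txq_num */
};

void
qmask_free(struct qmask *qm)
{
	free(qm->qm_rxq_mask);
	free(qm->qm_txq_mask);
	qm->qm_rxq_mask = qm->qm_txq_mask = NULL;
}

/* Allocate both mask arrays sized by the driver-reported queue counts. */
int
qmask_init(struct qmask *qm, uint32_t rxq_num, uint32_t txq_num)
{
	assert(qm != NULL);
	qm->qm_enabled = false;	/* off by default for compatibility */
	qm->qm_other_mask = false;
	qm->rxq_num = rxq_num;
	qm->txq_num = txq_num;
	qm->qm_rxq_mask = calloc(rxq_num > 0 ? rxq_num : 1, sizeof(bool));
	qm->qm_txq_mask = calloc(txq_num > 0 ? txq_num : 1, sizeof(bool));
	if (qm->qm_rxq_mask == NULL || qm->qm_txq_mask == NULL) {
		qmask_free(qm);
		return (-1);
	}
	return (0);
}

/* bpf_mtap-style check: true if this descriptor should see the packet. */
bool
qmask_accept(const struct qmask *qm, const struct pkt_qinfo *p)
{
	if (!qm->qm_enabled)
		return (true);
	/* Assumption: out-of-range queue ids are rejected, not indexed. */
	if (p->rxqid != QID_NONE &&
	    (p->rxqid >= qm->rxq_num || !qm->qm_rxq_mask[p->rxqid]))
		return (false);
	if (p->txqid != QID_NONE &&
	    (p->txqid >= qm->txq_num || !qm->qm_txq_mask[p->txqid]))
		return (false);
	return (true);
}

/* bpf_tap-style check: there is no mbuf, so only qm_other_mask applies. */
bool
qmask_accept_tap(const struct qmask *qm)
{
	return (!qm->qm_enabled || qm->qm_other_mask);
}
```

With the mask disabled every packet is accepted, which matches the compatibility requirement; once it is enabled, a packet tagged with RX queue 1 is accepted only after qm_rxq_mask[1] has been set.
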
>>
>> * Argument
>> I discussed this project with some Japanese BSD hackers. They
>> debated this plan and raised two points:
>>
>> - Isn't it possible to filter by queue id in the BPF filter language by extending it?
>>
>
> That's an interesting question, but it might be outside the scope of the project,
> because you'd have to change both libpcap and tcpdump and we don't want to fork those.
>
>> - Do we really need to expose queue information and threads to user
>> applications?
>
> There are applications that will want this information.
>
>> Probably most BPF applications need to merge the packet streams from
>> the threads in the end.
>> For example, a sniffer app such as tcpdump or wireshark needs to output
>> the packet dump on a screen; before outputting it we need to
>> merge the per-queue packet streams into one stream.
>> If so, isn't it better to merge the streams in the kernel, not in userland?
>>
>>
>> I'm not really sure about the use cases of BPF; maybe there are use cases
>> that can benefit from multithreaded BPF?
>
> Certainly there is a case for it, but perhaps not yet.  Let's get through the
> work you've already planned first.

Okay.

> I see the test case isn't written yet, so
> how are you testing these changes?

I modified libpcap/tcpdump just for the test - it can take an extra
argument for filtering queues.
I'll send more details when I get home.
# That's too much work to do on my iPhone

> When I get some time, probably next week,
> I'll want to run some of this code myself.
>
> Also, though it's probably required, the changes to the mbuf mean that you cannot
> MFC (merge from current) this code to any older FreeBSD release.  If and when the work
> is done it would only be able to go forwards.

Does that mean it could be merged into the next release, but cannot be
backported to older releases? Am I correct?
# Is it usual to backport new features to older releases anyway?
Probably I don't understand FreeBSD's development cycle yet.

>
> Oh, and the work looks good to me so far.  Good work.

Thanks.

> Best,
> George
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (Darwin)
>
> iEYEARECAAYFAk3k8qAACgkQYdh2wUQKM9LpPQCgiZxxPJN6BDGPLJAUdAxjgzSJ
> oaoAn27jCAFPeQdYU4AJvBWZaF1eqt1F
> =S11+
> -----END PGP SIGNATURE-----
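
For reference, the userland merge step raised in the Argument section (combining per-queue packet streams into one stream before display) could look roughly like the sketch below. It is an illustration only: struct pkt_rec, struct pkt_stream and merge_streams are hypothetical names, and it assumes each per-queue stream is already ordered by timestamp, which the thread does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical captured-packet record: a timestamp plus its queue id. */
struct pkt_rec {
	uint64_t ts_usec;
	uint32_t qid;
};

/* One per-queue stream; assumed to be sorted by ts_usec already. */
struct pkt_stream {
	const struct pkt_rec *recs;
	size_t len;
	size_t pos;	/* index of the next unread record */
};

/*
 * Merge n per-queue streams into out[] in timestamp order.
 * Writes at most out_cap records and returns the number written.
 * A linear scan over the stream heads is used because the number
 * of hardware queues is expected to be small.
 */
size_t
merge_streams(struct pkt_stream *s, size_t n, struct pkt_rec *out,
    size_t out_cap)
{
	size_t written = 0;

	assert(s != NULL && out != NULL);
	while (written < out_cap) {
		size_t best = n;	/* n means "no stream has data" */

		for (size_t i = 0; i < n; i++) {
			if (s[i].pos >= s[i].len)
				continue;
			if (best == n || s[i].recs[s[i].pos].ts_usec <
			    s[best].recs[s[best].pos].ts_usec)
				best = i;
		}
		if (best == n)
			break;		/* every stream is drained */
		out[written++] = s[best].recs[s[best].pos++];
	}
	return (written);
}
```

Records with equal timestamps are taken from the lower-numbered stream first, so the output order is deterministic for a given input.
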