Date:      Wed, 1 Jun 2011 18:02:00 +0900
From:      Takuya ASADA <syuu@dokukino.com>
To:        George Neville-Neil <gnn@freebsd.org>
Cc:        "Robert N. M. Watson" <rwatson@freebsd.org>, "soc-status@freebsd.org" <soc-status@freebsd.org>, Kazuya Goda <gockzy@gmail.com>
Subject:   Re: Weekly status report (27th May)
Message-ID:  <5054184174934880962@unknownmsgid>
In-Reply-To: <8259CBF7-B2E6-49C6-A7C4-6682ECBDBB9F@freebsd.org>
References:  <BANLkTim=zeRhwGajksbX2fBY9snkcj1h0g@mail.gmail.com> <8259CBF7-B2E6-49C6-A7C4-6682ECBDBB9F@freebsd.org>

On 2011/05/31, at 22:52, George Neville-Neil <gnn@freebsd.org> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>
> On May 31, 2011, at 07:07 , Takuya ASADA wrote:
>
>> Sorry for the delay in sending the weekly status report.
>>
>> * Overview
>> Here is the progress of the project:
>> - Implement set affinity ioctl on BPF
>> Experimental code is implemented and works
>> - Implement affinity support on bpf_tap/bpf_mtap/bpf_mtap2
>> Experimental code is implemented and works
>> - Implement sample application
>> Quick hack for tcpdump/libpcap, works
>> - Implement multi-queue tap driver
>> Experimental code is implemented, not tested
>> - Implement interface to deliver queue information on network device driver
>> Partially implemented on igb(4), not tested
>> - Reduce lock granularity on bpf_tap/bpf_mtap/bpf_mtap2
>> Not yet
>> - Implement test case
>> Not yet
>> - Update man document, write description of sample code
>> Not yet
>>
>> * Detail
>> On an Ethernet card, bpf_mtap is called when RX/TX is performed.
>> If the card supports multiqueue, every packet passing through bpf_mtap
>> belongs to either an RX queue id or a TX queue id.
>> To handle this, I defined new members in the mbuf pkthdr.
>>
>> In the if_start function of igb(4), I added the following lines:
>> m->m_pkthdr.rxqid = (uint32_t)-1;
>> m->m_pkthdr.txqid = [tx queue id];
>> And likewise in the receive function:
>> m->m_pkthdr.rxqid = [rx queue id];
>> m->m_pkthdr.txqid = (uint32_t)-1;
>>
>> Then I defined the following members in the bpf descriptor:
>> d->bd_qmask.qm_enabled
>> d->bd_qmask.qm_rxq_mask[]
>> d->bd_qmask.qm_txq_mask[]
>>
>> Since the sizes of qm_rxq_mask[] and qm_txq_mask[] may differ between cards,
>> we need to pass the number of queues from the driver to bpf and allocate
>> the arrays with that size.
>> I added these members to struct ifnet:
>> d->bd_bif->bif_ifp->if_rxq_num
>> d->bd_bif->bif_ifp->if_txq_num
>>
>> Now we can filter out unwanted packets in bpf_mtap like this:
>>
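>> LIST_FOREACH(d, &bp->bif_dlist, bd_next) {
>>   if (d->bd_qmask.qm_enabled) {
>>     /* Skip this descriptor if its mask excludes the packet's RX queue. */
>>     if (m->m_pkthdr.rxqid != (uint32_t)-1 &&
>>         !d->bd_qmask.qm_rxq_mask[m->m_pkthdr.rxqid])
>>       continue;
>>     /* Likewise for the TX queue. */
>>     if (m->m_pkthdr.txqid != (uint32_t)-1 &&
>>         !d->bd_qmask.qm_txq_mask[m->m_pkthdr.txqid])
>>       continue;
>>   }
>>   /* ... existing bpf_mtap processing for d ... */
>> }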
>> d->bd_qmask.qm_enabled should be FALSE by default to keep compatibility
>> with existing applications.
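
A minimal userland sketch of the check quoted above, assuming the masks are plain boolean arrays sized from the driver's queue counts. The names `struct pkt_qinfo`, `struct qmask`, `qmask_init`, `qmask_free` and `qmask_accepts` are hypothetical stand-ins, not the kernel structures from the patch, and the bounds check on the queue id is an extra assumption that the quoted loop does not have.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define QID_NONE ((uint32_t)-1)  /* "no queue" marker, as in the mail */

/* Hypothetical stand-in for the two new mbuf pkthdr members. */
struct pkt_qinfo {
    uint32_t rxqid;
    uint32_t txqid;
};

/* Hypothetical stand-in for d->bd_qmask plus the per-interface counts. */
struct qmask {
    bool     qm_enabled;
    uint32_t rxq_num;      /* would come from if_rxq_num */
    uint32_t txq_num;      /* would come from if_txq_num */
    bool    *qm_rxq_mask;
    bool    *qm_txq_mask;
};

/* Allocate both masks, all entries FALSE, filtering disabled. */
int qmask_init(struct qmask *qm, uint32_t rxq_num, uint32_t txq_num)
{
    qm->qm_enabled = false;
    qm->rxq_num = rxq_num;
    qm->txq_num = txq_num;
    qm->qm_rxq_mask = calloc(rxq_num ? rxq_num : 1, sizeof(bool));
    qm->qm_txq_mask = calloc(txq_num ? txq_num : 1, sizeof(bool));
    if (qm->qm_rxq_mask == NULL || qm->qm_txq_mask == NULL) {
        free(qm->qm_rxq_mask);
        free(qm->qm_txq_mask);
        return -1;
    }
    return 0;
}

void qmask_free(struct qmask *qm)
{
    free(qm->qm_rxq_mask);
    free(qm->qm_txq_mask);
}

/* True if a descriptor with this mask should see the packet. */
bool qmask_accepts(const struct qmask *qm, const struct pkt_qinfo *p)
{
    if (!qm->qm_enabled)
        return true;  /* default: behave like an unmodified bpf */
    if (p->rxqid != QID_NONE &&
        (p->rxqid >= qm->rxq_num || !qm->qm_rxq_mask[p->rxqid]))
        return false; /* assumption: out-of-range ids are rejected */
    if (p->txqid != QID_NONE &&
        (p->txqid >= qm->txq_num || !qm->qm_txq_mask[p->txqid]))
        return false;
    return true;
}
```

With four RX queues and only `qm_rxq_mask[2]` set, a packet tagged `rxqid = 2` is accepted, while any packet on another RX queue or on any TX queue is skipped.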
>>
>> And here are the ioctls to set/get the queue mask:
>> #define BIOCENAQMASK    _IO('B', 137)
>>   This does d->bd_qmask.qm_enabled = TRUE
>> #define BIOCDISQMASK    _IO('B', 138)
>>   This does d->bd_qmask.qm_enabled = FALSE
>> #define BIOCRXQLEN      _IOR('B', 133, int)
>>   Returns ifp->if_rxq_num
>> #define BIOCTXQLEN      _IOR('B', 134, int)
>>   Returns ifp->if_txq_num
>> #define BIOCSTRXQMASK   _IOWR('B', 139, uint32_t)
>>   This does d->bd_qmask.qm_rxq_mask[*addr] = TRUE
>> #define BIOCGTRXQMASK   _IOR('B', 140, uint32_t)
>>   Returns d->bd_qmask.qm_rxq_mask[*addr]
>> /* XXX: We should have rxq_mask[*addr] = FALSE ioctl too */
>> #define BIOCSTTXQMASK   _IOWR('B', 141, uint32_t)
>>   This does d->bd_qmask.qm_txq_mask[*addr] = TRUE
>> /* XXX: We should have txq_mask[*addr] = FALSE ioctl too */
>> #define BIOCGTTXQMASK   _IOR('B', 142, uint32_t)
>>   Returns d->bd_qmask.qm_txq_mask[*addr]
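
A rough sketch of how an application might use these ioctls to restrict an already opened bpf descriptor to a single RX queue. The request codes are copied from the list above and exist only in this experimental patch, not in stock headers; the helper `bpf_select_rxq` and the order of the calls (query the queue count, set the mask entry, then enable masking) are assumptions for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Request codes as proposed in the mail; not part of stock bpf headers. */
#define BIOCENAQMASK    _IO('B', 137)
#define BIOCRXQLEN      _IOR('B', 133, int)
#define BIOCSTRXQMASK   _IOWR('B', 139, uint32_t)

/*
 * Hypothetical helper: let descriptor fd see only RX queue qid.
 * Returns 0 on success, -1 with errno set on failure.
 */
int bpf_select_rxq(int fd, uint32_t qid)
{
    int nrxq;

    /* Ask how many RX queues the attached interface reports. */
    if (ioctl(fd, BIOCRXQLEN, &nrxq) == -1)
        return -1;
    if (nrxq < 0 || qid >= (uint32_t)nrxq) {
        errno = EINVAL;
        return -1;
    }
    /* Mark the wanted queue in the RX mask ... */
    if (ioctl(fd, BIOCSTRXQMASK, &qid) == -1)
        return -1;
    /* ... then switch queue filtering on for this descriptor. */
    return ioctl(fd, BIOCENAQMASK);
}
```

Setting the mask entry before enabling filtering avoids a window in which masking is on but every queue is still masked out.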
>>
>> However, packets that come through bpf_tap don't have an mbuf, so we won't
>> be able to classify them by queue id.
>> So I added d->bd_qmask.qm_other_mask and BIOSTOTHERMASK/BIOGTOTHERMASK for them.
>> If d->bd_qmask.qm_enabled && !d->bd_qmask.qm_other_mask, all packets
>> passing through bpf_tap will be ignored.
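
The bpf_tap rule above reduces to a two-flag predicate. A tiny sketch, using a hypothetical `struct tap_qmask` that carries only the two flags involved:

```c
#include <stdbool.h>

/* Hypothetical stand-in holding only the flags relevant to bpf_tap. */
struct tap_qmask {
    bool qm_enabled;     /* queue filtering switched on */
    bool qm_other_mask;  /* accept packets that carry no queue id */
};

/* True if a packet arriving without an mbuf should be delivered. */
bool tap_accepts(const struct tap_qmask *qm)
{
    /* Dropped only when filtering is on and "other" packets are masked out. */
    return !qm->qm_enabled || qm->qm_other_mask;
}
```

Because qm_enabled is FALSE by default, existing applications keep receiving bpf_tap packets unchanged.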
>>
>> If we only care about the CPU affinity of a packet / thread (= bpf
>> descriptor), checking PCPU_GET(cpuid) is enough.
>> But if we want to take queue affinity into account, we probably need
>> the structures described above.
>>
>> * Argument
>> I discussed this project with some Japanese BSD hackers. They
>> questioned this plan and raised two points:
>>
>> - Isn't it possible to filter by queue id in the BPF filter language by extending it?
>>
>
> That's an interesting question, but it might be outside the scope of the project,
> because you'd have to change both libpcap and tcpdump and we don't want to fork those.
>
>> - Do we really need to expose queue information and threads to user
>> applications?
>
> There are applications that will want this information.
>
>> Probably most BPF applications need to merge the packet streams from
>> their threads in the end.
>> For example, a sniffer app such as tcpdump or wireshark needs to output
>> the packet dump on a screen, and before doing so it has to
>> merge the per-queue packet streams into one stream.
>> If so, isn't it better to merge the streams in the kernel, not in userland?
>>
>>
>> I'm not really sure about the use cases of BPF; maybe there is a use case
>> that can benefit from multithreaded BPF?
>
> Certainly there is a case for it, but perhaps not yet.  Let's get through the
> work you've already planned first.

Okay.

> I see the test case isn't written yet, so
> how are you testing these changes?

I modified libpcap/tcpdump just for testing; it can take an extra
argument for filtering queues.
I'll send more details when I get home.
# That's too much work to do on my iPhone

>  When I get some time, probably next week,
> I'll want to run some of this code myself.
>
> Also, though it's probably required, the changes to the mbuf mean that you cannot
> MFC (merge from current) this code to any older FreeBSD release.  If and when the work
> is done it would only be able to go forwards.

Does that mean it could be merged into the next release, but cannot be
backported to older releases? Am I correct?

# Is it usual to backport new features to older releases
anyway? I probably don't understand FreeBSD's development cycle yet.

>
> Oh, and the work looks good to me so far. Good work.

Thanks.

> Best,
> George
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (Darwin)
>
> iEYEARECAAYFAk3k8qAACgkQYdh2wUQKM9LpPQCgiZxxPJN6BDGPLJAUdAxjgzSJ
> oaoAn27jCAFPeQdYU4AJvBWZaF1eqt1F
> =S11+
> -----END PGP SIGNATURE-----
