Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 31 May 2011 09:52:32 -0400
From:      George Neville-Neil <gnn@freebsd.org>
To:        Takuya ASADA <syuu@dokukino.com>
Cc:        "Robert N. M. Watson" <rwatson@freebsd.org>, soc-status@freebsd.org, Kazuya Goda <gockzy@gmail.com>
Subject:   Re: Weekly status report (27th May)
Message-ID:  <8259CBF7-B2E6-49C6-A7C4-6682ECBDBB9F@freebsd.org>
In-Reply-To: <BANLkTim=zeRhwGajksbX2fBY9snkcj1h0g@mail.gmail.com>
References:  <BANLkTim=zeRhwGajksbX2fBY9snkcj1h0g@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On May 31, 2011, at 07:07 , Takuya ASADA wrote:

> Sorry for delaying weekly status report,
>=20
> * Overview
> Here are progress of the project:
> - Implement set affinity ioctl on BPF
>  Experimental code are implemented, worked
> - Implement affinity support on bpf_tap/bpf_mtap/bpf_mtap2
>  Experimental code are implemented, worked
> - Implement sample application
>  Quick hack for tcpdump/libpcap, worked
> - Implement multi-queue tap driver
>  Experimental core are implemented, not tested
> - Implement interface to deliver queue information on network device =
driver
>  Partially implemented on igb(4), not tested
> - Reduce lock granularity on bpf_tap/bpf_mtap/bpf_mtap2
>  Not yet
> - Implement test case
>  Not yet
> - Update man document, write description of sample code
>  Not yet
>=20
> * Detail
> On an ethernet card, bpf_mtap is called when RX/TX are performing.
> If the card supports multiqueue, every packets through bpf_mtap should
> belong to RX queue id or TX queue id.
> To handle this, I defined new members on mbuf pkthdr.
>=20
> In if_start function on igb(4), I added following line:
>  m->m_pkthdr.rxqid =3D (uint32_t)-1;
>  m->m_pkthdr.txqid =3D [tx queue id];
> And also receive function:
>  m->m_pkthdr.rxqid =3D [rx queue id];
>  m->m_pkthdr.txqid =3D (uint32_t)-1;
>=20
> Then I define following members on bpf descriptor:
>  d->bd_qmask.qm_enabled
>  d->bd_qmask.qm_rxq_mask[]
>  d->bd_qmask.qm_txq_mask[]
>=20
> Since qm_rxq_mask[] and qm_txq_mask[] size may differ on each cards,
> we need to pass size of queue from driver to bpf and allocate arrays
> by the size.
> I added them on struct ifnet:
>  d->bd_bif->bif_ifp->if_rxq_num
>  d->bd_bif->bif_ifp->if_txq_num
>=20
> Now we can filter unwanted packet on bpf_mtap like this:
>=20
> LIST_FOREACH(d, &bp->bif_dlist, bd_next) {
>  if (d->bd_qmask.qm_enabled) {
>    if (m->m_pkthdr.rxqid !=3D (uint32_t)-1 &&
> !d->bd_qmask.qm_rxq_mask[m->m_pkthdr.rxqid])
>      continue;
>    if (m->m_pkthdr.txqid !=3D (uint32_t)-1 &&
> !d->bd_qmask.qm_txq_mask[m->m_pkthdr.txqid])
>      continue;
> }
> d->bd_qmask.qm_enabled should FALSE by default to keep compatibility
> with existing applications.
>=20
> And here are ioctls for set/get queue mask:
>  #define BIOCENAQMASK    _IO('B', 137)
>    This does d->bd_qmask.qm_enabled =3D TRUE
>  #define BIOCDISQMASK    _IO('B', 138)
>    This does d->bd_qmask.qm_enabled =3D FALSE
>  #define BIOCRXQLEN      _IOR('B', 133, int)
>    Returns ifp->if_rxq_num
>  #define BIOCTXQLEN      _IOR('B', 134, int)
>    Returns ifp->if_txq_num
>  #define BIOCSTRXQMASK   _IOWR('B', 139, uint32_t)
>    This does d->bd_qmask.qm_rxq_mask[*addr] =3D TRUE
>  #define BIOCGTRXQMASK   _IOR('B', 140, uint32_t)
>    Returns d->bd_qmask.qm_rxq_mask[*addr]
>  /* XXX: We should have rxq_mask[*addr] =3D FALSE ioctl too */
>  #define BIOCSTTXQMASK   _IOWR('B', 141, uint32_t)
>    This does d->bd_qmask.qm_txq_mask[*addr] =3D TRUE
>  /* XXX: We should have txq_mask[*addr] =3D FALSE ioctl too */
>  #define BIOCGTTXQMASK   _IOR('B', 142, uint32_t)
>    Returns d->bd_qmask.qm_rxq_mask[*addr]
>=20
> However, the packet which comes bpf_tap doesn't have mbuf, we won't
> able to classify queue id for it.
> So I added d->bd_qmask.qm_other_mask and BIOSTOTHERMASK/BIOGTOTHERMASK =
for them.
> If d->bd_qmask.qm_enabled && !d->bd_qmask.qm_other_mask, all packets
> through bpf_tap will be ignored.
>=20
> If we only care about CPU affinity of packet / thread(=3D bpf
> descriptor), checking PCPU_GET(cpuid) is enough.
> But if we want to take care queue affinity, we probably need
> structures as referred to above.
>=20
> * Argument
> I discussed about this project with some Japanese BSD hackers, they
> argue this plan, suggested me two things:
>=20
> - Isn't it possible to filter by queue id in BPF filter language by =
extend it?
>=20

That's an interesting question, but it might be outside the scope of the =
project,
because you'd have to change both libpcap and tcpdump and we don't want =
to fork those.

> - Do we really need to expose queue information and threads to user
> applications?

There are applications that will want this information.

> Probably most of BPF application requires to merge packet streams from
> threads at last.
> For example, sniffer app such as tcpdump and wireshark need to output
> packet dump on a screen, before output it on the screen we need to
> merge packet streams for each queues into one stream.
> If so, isn't it better to merge stream in kernel, not userland?
>=20
>=20
> I'm not really sure about use case of BPF, maybe there's use case can
> get benefit from multithreaded BPF?

Certainly there is a case for it, but perhaps not yet.  Let's get =
through the
work you've already planned first.  I see the test case isn't written =
yet, so
how are you testing these changes?  When I get some time, probably next =
week,
I'll want to run some of this code myself.

Also, though it's probably required, the changes to the mbuf mean that =
you cannot
MFC (merge from current) this code to any older FreeBSD release.  If and =
when the work
is done it would only be able to go forwards.

Oh, and the work looks good to me so far. Good work.

Best,
George

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (Darwin)

iEYEARECAAYFAk3k8qAACgkQYdh2wUQKM9LpPQCgiZxxPJN6BDGPLJAUdAxjgzSJ
oaoAn27jCAFPeQdYU4AJvBWZaF1eqt1F
=3DS11+
-----END PGP SIGNATURE-----



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?8259CBF7-B2E6-49C6-A7C4-6682ECBDBB9F>