From owner-soc-status@FreeBSD.ORG Tue May 31 19:01:40 2011 Return-Path: Delivered-To: soc-status@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 58D3F10656A6 for ; Tue, 31 May 2011 19:01:40 +0000 (UTC) (envelope-from gnn@freebsd.org) Received: from vps.hungerhost.com (vps.hungerhost.com [216.38.53.176]) by mx1.freebsd.org (Postfix) with ESMTP id 1A0588FC15 for ; Tue, 31 May 2011 19:01:39 +0000 (UTC) Received: from [209.249.190.124] (helo=gnnmac.hudson-trading.com) by vps.hungerhost.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.69) (envelope-from ) id 1QRPN3-0004AP-4z; Tue, 31 May 2011 09:52:33 -0400 Mime-Version: 1.0 (Apple Message framework v1084) Content-Type: text/plain; charset=us-ascii From: George Neville-Neil In-Reply-To: Date: Tue, 31 May 2011 09:52:32 -0400 Content-Transfer-Encoding: quoted-printable Message-Id: <8259CBF7-B2E6-49C6-A7C4-6682ECBDBB9F@freebsd.org> References: To: Takuya ASADA X-Pgp-Agent: GPGMail 1.3.3 X-Mailer: Apple Mail (2.1084) X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - vps.hungerhost.com X-AntiAbuse: Original Domain - freebsd.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - freebsd.org Cc: "Robert N. M. Watson" , soc-status@freebsd.org, Kazuya Goda Subject: Re: Weekly status report (27th May) X-BeenThere: soc-status@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Summer of Code Status Reports and Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 31 May 2011 19:01:40 -0000 -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On May 31, 2011, at 07:07 , Takuya ASADA wrote: > Sorry for delaying weekly status report, >=20 > * Overview > Here are progress of the project: > - Implement set affinity ioctl on BPF > Experimental code are implemented, worked > - Implement affinity support on bpf_tap/bpf_mtap/bpf_mtap2 > Experimental code are implemented, worked > - Implement sample application > Quick hack for tcpdump/libpcap, worked > - Implement multi-queue tap driver > Experimental core are implemented, not tested > - Implement interface to deliver queue information on network device = driver > Partially implemented on igb(4), not tested > - Reduce lock granularity on bpf_tap/bpf_mtap/bpf_mtap2 > Not yet > - Implement test case > Not yet > - Update man document, write description of sample code > Not yet >=20 > * Detail > On an ethernet card, bpf_mtap is called when RX/TX are performing. > If the card supports multiqueue, every packets through bpf_mtap should > belong to RX queue id or TX queue id. > To handle this, I defined new members on mbuf pkthdr. >=20 > In if_start function on igb(4), I added following line: > m->m_pkthdr.rxqid =3D (uint32_t)-1; > m->m_pkthdr.txqid =3D [tx queue id]; > And also receive function: > m->m_pkthdr.rxqid =3D [rx queue id]; > m->m_pkthdr.txqid =3D (uint32_t)-1; >=20 > Then I define following members on bpf descriptor: > d->bd_qmask.qm_enabled > d->bd_qmask.qm_rxq_mask[] > d->bd_qmask.qm_txq_mask[] >=20 > Since qm_rxq_mask[] and qm_txq_mask[] size may differ on each cards, > we need to pass size of queue from driver to bpf and allocate arrays > by the size. > I added them on struct ifnet: > d->bd_bif->bif_ifp->if_rxq_num > d->bd_bif->bif_ifp->if_txq_num >=20 > Now we can filter unwanted packet on bpf_mtap like this: >=20 > LIST_FOREACH(d, &bp->bif_dlist, bd_next) { > if (d->bd_qmask.qm_enabled) { > if (m->m_pkthdr.rxqid !=3D (uint32_t)-1 && > !d->bd_qmask.qm_rxq_mask[m->m_pkthdr.rxqid]) > continue; > if (m->m_pkthdr.txqid !=3D (uint32_t)-1 && > !d->bd_qmask.qm_txq_mask[m->m_pkthdr.txqid]) > continue; > } > d->bd_qmask.qm_enabled should FALSE by default to keep compatibility > with existing applications. >=20 > And here are ioctls for set/get queue mask: > #define BIOCENAQMASK _IO('B', 137) > This does d->bd_qmask.qm_enabled =3D TRUE > #define BIOCDISQMASK _IO('B', 138) > This does d->bd_qmask.qm_enabled =3D FALSE > #define BIOCRXQLEN _IOR('B', 133, int) > Returns ifp->if_rxq_num > #define BIOCTXQLEN _IOR('B', 134, int) > Returns ifp->if_txq_num > #define BIOCSTRXQMASK _IOWR('B', 139, uint32_t) > This does d->bd_qmask.qm_rxq_mask[*addr] =3D TRUE > #define BIOCGTRXQMASK _IOR('B', 140, uint32_t) > Returns d->bd_qmask.qm_rxq_mask[*addr] > /* XXX: We should have rxq_mask[*addr] =3D FALSE ioctl too */ > #define BIOCSTTXQMASK _IOWR('B', 141, uint32_t) > This does d->bd_qmask.qm_txq_mask[*addr] =3D TRUE > /* XXX: We should have txq_mask[*addr] =3D FALSE ioctl too */ > #define BIOCGTTXQMASK _IOR('B', 142, uint32_t) > Returns d->bd_qmask.qm_rxq_mask[*addr] >=20 > However, the packet which comes bpf_tap doesn't have mbuf, we won't > able to classify queue id for it. > So I added d->bd_qmask.qm_other_mask and BIOSTOTHERMASK/BIOGTOTHERMASK = for them. > If d->bd_qmask.qm_enabled && !d->bd_qmask.qm_other_mask, all packets > through bpf_tap will be ignored. >=20 > If we only care about CPU affinity of packet / thread(=3D bpf > descriptor), checking PCPU_GET(cpuid) is enough. > But if we want to take care queue affinity, we probably need > structures as referred to above. >=20 > * Argument > I discussed about this project with some Japanese BSD hackers, they > argue this plan, suggested me two things: >=20 > - Isn't it possible to filter by queue id in BPF filter language by = extend it? >=20 That's an interesting question, but it might be outside the scope of the = project, because you'd have to change both libpcap and tcpdump and we don't want = to fork those. > - Do we really need to expose queue information and threads to user > applications? There are applications that will want this information. > Probably most of BPF application requires to merge packet streams from > threads at last. > For example, sniffer app such as tcpdump and wireshark need to output > packet dump on a screen, before output it on the screen we need to > merge packet streams for each queues into one stream. > If so, isn't it better to merge stream in kernel, not userland? >=20 >=20 > I'm not really sure about use case of BPF, maybe there's use case can > get benefit from multithreaded BPF? Certainly there is a case for it, but perhaps not yet. Let's get = through the work you've already planned first. I see the test case isn't written = yet, so how are you testing these changes? When I get some time, probably next = week, I'll want to run some of this code myself. Also, though it's probably required, the changes to the mbuf mean that = you cannot MFC (merge from current) this code to any older FreeBSD release. If and = when the work is done it would only be able to go forwards. Oh, and the work looks good to me so far. Good work. Best, George -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (Darwin) iEYEARECAAYFAk3k8qAACgkQYdh2wUQKM9LpPQCgiZxxPJN6BDGPLJAUdAxjgzSJ oaoAn27jCAFPeQdYU4AJvBWZaF1eqt1F =3DS11+ -----END PGP SIGNATURE-----