Date:      Tue, 31 May 2011 15:52:12 +0100
From:      "Robert N. M. Watson" <rwatson@freebsd.org>
To:        George Neville-Neil <gnn@FreeBSD.org>
Cc:        Takuya ASADA <syuu@dokukino.com>, soc-status@freebsd.org, Kazuya Goda <gockzy@gmail.com>
Subject:   Re: Weekly status report (27th May)
Message-ID:  <2EF14D0B-A3A1-4835-B07F-728BAFA5B0CB@freebsd.org>
In-Reply-To: <8259CBF7-B2E6-49C6-A7C4-6682ECBDBB9F@freebsd.org>
References:  <BANLkTim=zeRhwGajksbX2fBY9snkcj1h0g@mail.gmail.com> <8259CBF7-B2E6-49C6-A7C4-6682ECBDBB9F@freebsd.org>


On 31 May 2011, at 14:52, George Neville-Neil wrote:

>> - Do we really need to expose queue information and threads to user
>> applications?
>
> There are applications that will want this information.
>
>> Probably most BPF applications need to merge the packet streams from
>> the threads in the end.
>> For example, sniffer apps such as tcpdump and wireshark need to output
>> the packet dump on a screen; before outputting it on the screen we need
>> to merge the packet streams for each queue into one stream.
>> If so, isn't it better to merge the streams in the kernel, not in userland?
>>
>>
>> I'm not really sure about the use cases of BPF; maybe there are use cases
>> that can benefit from multithreaded BPF?
>
> Certainly there is a case for it, but perhaps not yet.  Let's get through the
> work you've already planned first.  I see the test case isn't written yet, so
> how are you testing these changes?  When I get some time, probably next week,
> I'll want to run some of this code myself.

The rationale for exposing queues to userspace explicitly is the same as the
rationale for exposing queues to the OS: it's not just packet data that has
cache issues, but also program data to do with processing packet data.

The reason for having each BPF device carry input and output queue masks, in
my initial thinking, was that we would set them to 0xffffffffffffffff by
default, meaning that a particular BPF device would merge and collect all
packets, picking an arbitrary ordering for interlacing the streams (as it
does today). However, an application might decide to open multiple devices,
each having a particular bit set, in order to receive packets for a
particular queue. The application could then have particular threads use
particular BPF devices, and, provided the hardware has gotten the flow
assignment right, it could avoid cache line contention for statistics and
even stateful processing of flows in its different threads. That actually
gives you a spectrum between today's behaviour and greater levels of
granularity, and lets the application decide "how early" to blend the
different queues of data. It can ask the kernel to do it, or it can do it
itself in userspace.
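
To make that concrete, here is a minimal userspace sketch of the intended
usage. It assumes a hypothetical per-descriptor receive queue mask ioctl,
called BIOCSRXQMASK below purely for illustration (the real interface from
Takuya's work may look different), the em0 interface, and four hardware
receive queues; everything else is the existing BPF API. Each worker thread
opens its own /dev/bpf descriptor and narrows the default all-ones mask to a
single queue bit, so per-queue packet streams and per-thread state never
cross threads:

    /*
     * Sketch only: one BPF descriptor per hardware receive queue.
     * BIOCSRXQMASK is a hypothetical ioctl standing in for whatever
     * interface the multiqueue BPF work finally exposes.
     */
    #include <sys/types.h>
    #include <sys/ioctl.h>
    #include <net/bpf.h>
    #include <net/if.h>

    #include <fcntl.h>
    #include <pthread.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    #define NQUEUES 4                               /* assumed RX queue count */
    #ifndef BIOCSRXQMASK
    #define BIOCSRXQMASK _IOW('B', 133, uint64_t)   /* hypothetical */
    #endif

    static void *
    queue_worker(void *arg)
    {
            int q = (int)(uintptr_t)arg;
            struct ifreq ifr;
            uint64_t mask;
            char buf[65536];
            ssize_t n;
            int fd;

            if ((fd = open("/dev/bpf", O_RDONLY)) < 0) {
                    perror("open");
                    return (NULL);
            }
            memset(&ifr, 0, sizeof(ifr));
            strlcpy(ifr.ifr_name, "em0", sizeof(ifr.ifr_name)); /* assumed NIC */
            if (ioctl(fd, BIOCSETIF, &ifr) < 0) {
                    perror("BIOCSETIF");
                    close(fd);
                    return (NULL);
            }

            /*
             * The default would be all ones (merge every queue, as BPF
             * behaves today); this thread keeps only the bit for its queue.
             */
            mask = 1ULL << q;
            if (ioctl(fd, BIOCSRXQMASK, &mask) < 0)     /* hypothetical */
                    perror("BIOCSRXQMASK");

            /* Per-thread stats/flow state stays local: no sharing across queues. */
            while ((n = read(fd, buf, sizeof(buf))) > 0)
                    printf("queue %d: read %zd bytes of BPF records\n", q, n);

            close(fd);
            return (NULL);
    }

    int
    main(void)
    {
            pthread_t tid[NQUEUES];
            int i;

            for (i = 0; i < NQUEUES; i++)
                    pthread_create(&tid[i], NULL, queue_worker,
                        (void *)(uintptr_t)i);
            for (i = 0; i < NQUEUES; i++)
                    pthread_join(tid[i], NULL);
            return (0);
    }

An application that prefers today's behaviour would simply leave the mask at
its all-ones default on a single descriptor and let the kernel interlace the
queues for it.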

Robert


