From: Sergey Lyubka <devnull@oasis.uptsoft.com>
To: freebsd-hackers@freebsd.org
Date: Mon, 14 Jun 2004 12:47:08 +0300
Subject: memory mapped packet capturing - bpf replacement ?
Message-ID: <20040614124708.A22679@oasis.uptsoft.com>

Traffic analysis applications running on high-speed networks use BPF to
capture packets. Under heavy traffic, however, the frequent context
switches caused by read(2) on a BPF device become significant overhead.
I implemented a prototype module that stores captured packets in a
memory area the user can mmap(2). I want to describe the prototype and
discuss benchmark results.

The module is a netgraph node called ng_mmq ("mmq" stands for
memory-mapped queue). The node has one hook, called "input". When this
hook is connected:

 o a memory buffer is allocated; its size is controlled by the
   debug.mmq_size sysctl
 o a device /dev/mmqX is created, where X is the node ID
 o /dev/mmqX is mmap-able by the user; mmap() returns the allocated
   buffer
 o when a packet arrives on the hook, it is copied into the buffer,
   which is actually a ring buffer, and the ring buffer's head is
   advanced
 o the user spins until tail != head, which means new data has arrived;
   it then reads from the ring buffer and advances the tail
 o no mutexes are used

The code is at http://oasis.uptsoft.com/~devnull/ng_mmq/

So this is the basic idea. I connected an ng_mmq node to my rl0:
ethernet node via ng_hub and benchmarked it against pcap, using the
same pcap callback function. Packet processing was simulated by a
delay() function that just burns some CPU cycles. What I found is:

1. bpf seems to be faster, i.e. it drops fewer packets than mmq
2. mmq seems to capture more packets

This is sample output from the benchmark utility:

# ./benchmark rl0 /dev/mmq5 1000
pcap: rcvd: 15061, dropped: 14047, seen: 1000
mmq:  rcvd: 23172, dropped: 21789, seen: 1000

Now, the questions:

1. is my interpretation of the benchmark results correct?
2. if it is, why is bpf faster?
3. is it OK to have no mutexes for the ring buffer operations?

The ng_mmq code, as well as the benchmark code, is at
http://oasis.uptsoft.com/~devnull/ng_mmq/
Setup instructions are at http://oasis.uptsoft.com/~devnull/ng_mmq/README
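
To make the ring buffer protocol concrete, here is a rough sketch of
the userland consumer loop. For simplicity it assumes fixed-size packet
slots and a made-up header layout (head/tail indices in front of the
slot array); the real module packs variable-length packets, so take the
structures below as illustration only and see the code at the URL above
for the actual layout.

/*
 * Sketch of a userland consumer for /dev/mmqX.  The ring layout below
 * (head/tail indices followed by fixed-size packet slots) is simplified
 * for illustration.
 */
#include <sys/mman.h>

#include <err.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>

#define MMQ_SIZE  (1024 * 1024)         /* must match debug.mmq_size */
#define SLOT_SIZE 2048                  /* illustrative: one slot per packet */

struct slot {
        uint32_t len;                   /* captured packet length */
        unsigned char data[SLOT_SIZE - sizeof(uint32_t)];
};

struct ring {
        volatile uint32_t head;         /* advanced by the kernel producer */
        volatile uint32_t tail;         /* advanced by this consumer */
        struct slot slots[];
};

int
main(int argc, char *argv[])
{
        struct ring *r;
        struct slot *s;
        uint32_t nslots;
        unsigned long npkts = 0;
        int fd;

        if (argc != 2)
                errx(1, "usage: consumer /dev/mmqX");
        if ((fd = open(argv[1], O_RDWR)) == -1)
                err(1, "open");
        r = mmap(NULL, MMQ_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (r == MAP_FAILED)
                err(1, "mmap");
        nslots = (MMQ_SIZE - sizeof(*r)) / sizeof(struct slot);

        for (;;) {
                while (r->tail == r->head)
                        ;               /* spin: tail != head means new data */
                s = &r->slots[r->tail];
                /* a real tool would run its pcap-style callback here */
                if (++npkts % 10000 == 0)
                        printf("%lu packets, last len %u\n",
                            npkts, (unsigned)s->len);
                r->tail = (r->tail + 1) % nslots;       /* release the slot */
        }
}

Note that the lock-free scheme rests on there being exactly one
producer and one consumer, each writing only its own index; whether the
head/tail stores also need memory barriers on SMP is exactly what
question 3 above is asking.

-- 
Sergey Lyubka, Network Security Consultant
NetFort Technologies Ltd, Galway, Ireland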