From: Sergey Lyubka <devnull@oasis.uptsoft.com>
To: freebsd-hackers@freebsd.org
Date: Mon, 14 Jun 2004 12:47:08 +0300
Subject: memory mapped packet capturing - bpf replacement ?
Message-ID: <20040614124708.A22679@oasis.uptsoft.com>

Traffic analysis applications running on high-speed networks use BPF to
capture packets. Under heavy traffic, however, the frequent context
switches caused by read(2) on a BPF device become significant overhead.
I implemented a prototype module that stores captured packets in a
memory area the user can mmap(2). I want to describe the prototype and
discuss benchmark results.

The module is a netgraph node called ng_mmq ("mmq" stands for
memory-mapped queue). The node has one hook, called "input". When this
hook is connected:

 o a memory buffer is allocated; its size is controlled by the
   debug.mmq_size sysctl
 o a device /dev/mmqX is created, where X is the node ID
 o /dev/mmqX is mmap-able by the user; mmap() returns the allocated
   buffer
 o when a packet arrives on the hook, it is copied into the buffer,
   which is actually a ring buffer, and the ring buffer's head is
   advanced
 o the user spins until tail != head, which means new data has arrived;
   it then reads from the ring buffer and advances the tail
 o no mutexes are used

The code is at http://oasis.uptsoft.com/~devnull/ng_mmq/

So this is the basic idea. I connected an ng_mmq node to my rl0:
ethernet node via ng_hub and benchmarked it against pcap, using the
same pcap callback function. Packet processing was simulated by a
delay() function that just burns some CPU cycles. What I found is:

1. bpf seems to be faster, i.e. it drops fewer packets than mmq
2. mmq seems to capture more packets

This is sample output from the benchmark utility:

# ./benchmark rl0 /dev/mmq5 1000
pcap: rcvd: 15061, dropped: 14047, seen: 1000
mmq:  rcvd: 23172, dropped: 21789, seen: 1000

Now, the questions:

1. is my interpretation of the benchmark results correct?
2. if it is, why is bpf faster?
3. is it OK to have no mutexes for the ring buffer operations?

The ng_mmq code, as well as the benchmark code, is at
http://oasis.uptsoft.com/~devnull/ng_mmq/
Setup instructions are at http://oasis.uptsoft.com/~devnull/ng_mmq/README
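
To make the ring buffer protocol concrete, here is a rough sketch of
the userland consumer loop. For simplicity it assumes fixed-size packet
slots and a made-up header layout (head/tail indices in front of the
slot array); the real module packs variable-length packets, so take the
structures below as illustration only and see the code at the URL above
for the actual layout.

/*
 * Sketch of a userland consumer for /dev/mmqX.  The ring layout below
 * (head/tail indices followed by fixed-size packet slots) is simplified
 * for illustration.
 */
#include <sys/mman.h>

#include <err.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>

#define MMQ_SIZE  (1024 * 1024)         /* must match debug.mmq_size */
#define SLOT_SIZE 2048                  /* illustrative: one slot per packet */

struct slot {
        uint32_t len;                   /* captured packet length */
        unsigned char data[SLOT_SIZE - sizeof(uint32_t)];
};

struct ring {
        volatile uint32_t head;         /* advanced by the kernel producer */
        volatile uint32_t tail;         /* advanced by this consumer */
        struct slot slots[];
};

int
main(int argc, char *argv[])
{
        struct ring *r;
        struct slot *s;
        uint32_t nslots;
        unsigned long npkts = 0;
        int fd;

        if (argc != 2)
                errx(1, "usage: consumer /dev/mmqX");
        if ((fd = open(argv[1], O_RDWR)) == -1)
                err(1, "open");
        r = mmap(NULL, MMQ_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (r == MAP_FAILED)
                err(1, "mmap");
        nslots = (MMQ_SIZE - sizeof(*r)) / sizeof(struct slot);

        for (;;) {
                while (r->tail == r->head)
                        ;               /* spin: tail != head means new data */
                s = &r->slots[r->tail];
                /* a real tool would run its pcap-style callback here */
                if (++npkts % 10000 == 0)
                        printf("%lu packets, last len %u\n",
                            npkts, (unsigned)s->len);
                r->tail = (r->tail + 1) % nslots;       /* release the slot */
        }
}

Note that the lock-free scheme rests on there being exactly one
producer and one consumer, each writing only its own index; whether the
head/tail stores also need memory barriers on SMP is exactly what
question 3 above is asking.

-- 
Sergey Lyubka, Network Security Consultant
NetFort Technologies Ltd, Galway, Ireland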