From: Robert Watson
Date: Tue, 6 Feb 2007 14:04:49 +0000 (GMT)
To: Mike Silbersack
Cc: freebsd-net@freebsd.org, Ricardo Nabinger Sanchez
Subject: Re: network related benchmark
Message-ID: <20070206135638.Y32369@fledge.watson.org>
In-Reply-To: <20070206013506.M25997@odysseus.silby.com>

On Tue, 6 Feb 2007, Mike Silbersack wrote:

> On Wed, 17 Jan 2007, Ricardo Nabinger Sanchez wrote:
>
>> Accidentally I got into this PDF:
>>
>> http://www.dcs.qmul.ac.uk/~awm/slides/masterclass2006/monitor-hardware.pdf
>>
>> Quite interesting results, and nice future work. Has anybody seen it
>> already?
>
> I believe that there is a FreeBSD developer working on an updated version of
> BPF that will perform even better than it does already. I'll let him reveal
> himself if he wishes. :)

Heh. Indeed.
I've created a zero-copy (well, one-copy) BPF implementation under contract to a customer. We're still refining the implementation and measuring its performance, but we're already seeing marked performance improvement as a result of reduced memory copies. We don't currently have a CVS merge ETA, but I think there's a reasonable chance the implementation will appear in FreeBSD 6.3.

Those interested in perusing the WIP can find it in the FreeBSD Perforce repository:

    //depot/projects/zcopybpf/...

It includes a modified version of libpcap. We're still working on improving the event model; in particular, we are working on improving the ioctl set to allow "timeouts" to be more easily supported. Be warned that this code is very much "under development" and further changes to the ioctl API are expected.

Just to clarify the copying point: right now the FreeBSD BPF implementation performs a minimum of two memory copies per packet sniffed: one copy from the mbufs/mbuf clusters to the BPF buffer, and then one copy from the BPF buffer to user space. This implementation allows user space to register user memory buffers with the kernel, which are mapped into the kernel address space, pinned into memory, and then used in place of kernel memory buffers and written to directly during capture. This does not eliminate the in-kernel copy, just the kernel->userspace copy. For a variety of reasons, eliminating the in-kernel copy is difficult and possibly undesirable.

As the in-kernel memory layout for BPF buffers is identical to the layout of the memory copied out to userspace, libpcap (and other programs that use BPF directly) require modifications to their event handling and ioctl handling, but not to their packet parsing routines, so the changes are pretty minimal. Consumers accessing BPF through libpcap will not require any modification.

I hope to start looking at 10 Gbps performance in the next week or two; until now we've been doing 1 Gbps measurement, as the testing is occurring at the customer site.
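To make the copy counting above concrete, here is a toy Python model of the two capture paths. It is only a sketch under stated assumptions: it is not FreeBSD code, it does not use the BPF ioctl API (which is still changing), and every name in it (copy_into, capture_two_copy, capture_one_copy) is hypothetical. Packets stand in for mbuf data, and a bytearray owned by the caller stands in for a user buffer registered with the kernel.

```python
# Toy model only: hypothetical names, no real BPF or kernel interfaces involved.

def copy_into(dst, offset, src, stats):
    """Copy src into dst at offset and account for the bytes moved."""
    n = len(src)
    dst[offset:offset + n] = src
    stats["bytes_copied"] += n
    return offset + n


def capture_two_copy(packets, bufsize=4096):
    """Classic path: mbuf -> kernel BPF buffer, then kernel buffer -> user buffer."""
    stats = {"bytes_copied": 0}
    kernel_buf = bytearray(bufsize)
    end = 0
    for pkt in packets:
        end = copy_into(kernel_buf, end, pkt, stats)   # in-kernel copy
    user_buf = bytearray(bufsize)
    # Stand-in for the read(2)-style copy out to user space.
    copy_into(user_buf, 0, memoryview(kernel_buf)[:end], stats)
    return bytes(user_buf[:end]), stats["bytes_copied"]


def capture_one_copy(packets, user_buf):
    """Shared-buffer path: the 'kernel' writes straight into the caller's buffer."""
    stats = {"bytes_copied": 0}
    end = 0
    for pkt in packets:
        end = copy_into(user_buf, end, pkt, stats)     # the only copy
    return end, stats["bytes_copied"]


# Example run with three fake packets (sizes chosen arbitrarily).
packets = [b"\x01" * 60, b"\x02" * 1500, b"\x03" * 64]
classic_data, classic_bytes = capture_two_copy(packets)
registered = bytearray(4096)          # plays the role of the registered user buffer
used, shared_bytes = capture_one_copy(packets, registered)
print(classic_bytes, shared_bytes)
```

Both paths leave the same bytes in the user-visible buffer, which is why packet parsing does not need to change in this model; only the number of bytes moved differs, since the in-kernel copy remains in both.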
Robert N M Watson
Computer Laboratory
University of Cambridge