From owner-freebsd-net@FreeBSD.ORG  Wed Feb 15 21:00:42 2006
Message-ID: <43F39692.7A3228BA@freebsd.org>
Date: Wed, 15 Feb 2006 22:01:06 +0100
From: Andre Oppermann <andre@freebsd.org>
X-Mailer: Mozilla 4.8 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Luigi Rizzo <rizzo@icir.org>
References: <7bb8f24157080b6aaacb897a99259df9@madhaus.cns.utoronto.ca>
	<711b7ec873f31bc5be50ce477313fac3@madhaus.cns.utoronto.ca>
	<200602110002.21275.max@love2party.net>
	<43F38CF5.71C326C1@freebsd.org>
	<20060215123043.A29559@xorpc.icir.org>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: Marcos Bedinelli <bedinelli@madhaus.cns.utoronto.ca>,
	Max Laier <max@love2party.net>,
	Julian Elischer <julian@elischer.org>, freebsd-net@freebsd.org
Subject: Re: Network performance in a dual CPU system

Luigi Rizzo wrote:
> 
> On Wed, Feb 15, 2006 at 09:20:05PM +0100, Andre Oppermann wrote:
> ...
> > From my profiling with the Agilent tester there seem to be two areas where
> > the packet filters (ipfw in my test case) burn a lot of CPU per packet.
> > That is a) setup of lots of packet variables unconditionally at the entry
> > of ip_fw_chk() no matter whether they get looked at later or not, and b)
> > the switch() going through all the packet inspection options is for some
> > reason not optimized by the compiler and burns even more CPU.  Some sort
> > of JIT (as in the new bpf code) which replaces the case testing and jumps
> > directly to the proper place in the switch statement would go a long way
> > toward making it much more performant.
> 
> i was expecting some overhead in the initial setting of
> variables but the cost of the switch() surprises me a bit.
> did you look at the assembly code produced, or otherwise
> could you explain a bit more how you think the switch
> affects performance ?
> Maybe one could make it cheaper through an indirect function call ?
> (in the end, instructions are already indexes for a jump table).

I didn't look at the assembly code, as I can't read assembly.

In my testing (on UP) the peak forwarding rate on this particular hardware
with fastforwarding enabled dropped from 580 kpps without ipfw to 476 kpps
(ipfw allow all) and to 357 kpps (30 non-matching rules on IP address).

The number of CPU instructions and branches per packet is as follows:

                  maxkpps  instr.  branch  mispred  dcache  icfetch  icmiss
fastfwd               580    2238     300      3.8    1429      812    0.06
fastfwd+ipfw          476    2573     329     17.2    1721     1005    4.31
fastfwd+ipfw30        357    3493     508     15.2    2129     1500    3.35
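
To put the table in per-rule terms, here is a small sketch of the
arithmetic. The helper names are made up for illustration, and the time
figure assumes the single CPU is fully busy at the peak rate, which the
measurements above do not prove by themselves.

```c
#include <assert.h>

/* Average extra cost per rule: (loaded - base) spread over nrules rules. */
double per_rule_delta(double base, double loaded, int nrules)
{
    assert(nrules > 0);
    return (loaded - base) / nrules;
}

/*
 * Per-packet time budget in nanoseconds at a peak rate given in kpps.
 * Assumption: the CPU is saturated at the peak rate, so 1/rate is the
 * time spent on each packet.
 */
double ns_per_packet(double kpps)
{
    assert(kpps > 0.0);
    return 1e6 / kpps;   /* 1 kpps = 1000 packets/s, 1 s = 1e9 ns */
}
```

Feeding in the fastfwd+ipfw and fastfwd+ipfw30 rows, per_rule_delta(2573,
3493, 30) gives roughly 31 instructions per rule, per_rule_delta(329, 508,
30) gives roughly 6 branches per rule, and the difference between
ns_per_packet(357) and ns_per_packet(476) works out to about 23 ns per rule.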

The setup of the packet variables happens only once per packet, so it cannot
explain the difference between the allow-all and the 30-rule case. That
overhead must therefore come from the micro-op evaluation.
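
For reference, the two dispatch styles under discussion look roughly like
this. It is only a sketch: the opcodes, structs and function names are
invented for illustration and are not the real ip_fw_chk() code or the ipfw
opcode set. Whether the table variant is actually cheaper would have to be
measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified packet and micro-op definitions. */
struct pkt { uint32_t src, dst, proto; };

enum opcode { OP_SRC_EQ, OP_DST_EQ, OP_PROTO_EQ, OP_MAX };

struct insn { enum opcode op; uint32_t arg; };
struct rule { const struct insn *insns; size_t n; };

typedef int (*match_fn)(const struct insn *, const struct pkt *);

/* Variant A: one switch() over the opcode, evaluated for every micro-op. */
int match_switch(const struct insn *i, const struct pkt *p)
{
    switch (i->op) {
    case OP_SRC_EQ:   return p->src == i->arg;
    case OP_DST_EQ:   return p->dst == i->arg;
    case OP_PROTO_EQ: return p->proto == i->arg;
    default:          return 0;
    }
}

/* Variant B: the opcode indexes a table of handlers (indirect call). */
static int m_src(const struct insn *i, const struct pkt *p)   { return p->src == i->arg; }
static int m_dst(const struct insn *i, const struct pkt *p)   { return p->dst == i->arg; }
static int m_proto(const struct insn *i, const struct pkt *p) { return p->proto == i->arg; }

static const match_fn handlers[OP_MAX] = {
    [OP_SRC_EQ]   = m_src,
    [OP_DST_EQ]   = m_dst,
    [OP_PROTO_EQ] = m_proto,
};

int match_table(const struct insn *i, const struct pkt *p)
{
    assert((unsigned)i->op < (unsigned)OP_MAX);
    return handlers[i->op](i, p);
}

/*
 * Walk the rules in order; a rule matches when all of its micro-ops match.
 * Returns the index of the first matching rule, or -1 if none matches.
 */
int first_match(const struct rule *rules, size_t nrules,
                const struct pkt *p, match_fn eval)
{
    for (size_t r = 0; r < nrules; r++) {
        size_t k = 0;
        while (k < rules[r].n && eval(&rules[r].insns[k], p))
            k++;
        if (k == rules[r].n)
            return (int)r;
    }
    return -1;
}
```

Passing either match_switch or match_table as the eval argument to
first_match() gives the same result; only the dispatch mechanism per
micro-op differs, which is the part that runs once per rule rather than
once per packet.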

-- 
Andre