Date:      Mon, 07 Jul 2008 13:13:19 +0200
From:      Andre Oppermann <andre@freebsd.org>
To:        Robert Watson <rwatson@FreeBSD.org>
Cc:        FreeBSD Net <freebsd-net@freebsd.org>, Paul <paul@gtcomm.net>
Subject:   Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]
Message-ID:  <4871FA4F.40206@freebsd.org>
In-Reply-To: <20080707114538.K63144@fledge.watson.org>
References:  <4867420D.7090406@gtcomm.net> <200806301944.m5UJifJD081781@lava.sentex.ca> <20080701004346.GA3898@stlux503.dsto.defence.gov.au> <alpine.LFD.1.10.0807010257570.19444@filebunker.xip.at> <20080701010716.GF3898@stlux503.dsto.defence.gov.au> <alpine.LFD.1.10.0807010308320.19444@filebunker.xip.at> <486986D9.3000607@monkeybrains.net> <48699960.9070100@gtcomm.net> <ea7b9c170806302005n2a66f592h2127f87a0ba2c6d2@mail.gmail.com> <20080701033117.GH83626@cdnetworks.co.kr> <ea7b9c170806302050p2a3a5480t29923a4ac2d7c852@mail.gmail.com> <4869ACFC.5020205@gtcomm.net> <4869B025.9080006@gtcomm.net> <486A7E45.3030902@gtcomm.net> <486A8F24.5010000@gtcomm.net> <486A9A0E.6060308@elischer.org> <486B41D5.3060609@gtcomm.net> <4871E85C.8090907@freebsd.org> <20080707114538.K63144@fledge.watson.org>

Robert Watson wrote:
> 
> On Mon, 7 Jul 2008, Andre Oppermann wrote:
> 
>> Distributing the interrupts and taskqueues among the available CPUs 
>> gives concurrent forwarding with bi- or multi-directional traffic. All 
>> incoming traffic from any particular interface is still serialized 
>> though.
> 
> ... although not on multiple input queue-enabled hardware and drivers.  
> While I've really only focused on local traffic performance with my 
> 10gbps Chelsio setup, it should be possible to do packet forwarding from 
> multiple input queues using that hardware and driver today.
> 
> I'll update the netisr2 patches, which allow work to be pushed to 
> multiple CPUs from a single input queue.  However, these necessarily 
> take a cache miss or two on packet header data in order to break out the 
> packets from the input queue into flows that can be processed 
> independently without ordering constraints, so if those cache misses on 
> header data are a big part of the performance of a configuration, load 
> balancing in this manner may not help. What would be neat is if the 
> cards without multiple input queues could still tag receive descriptors 
> with a flow identifier generated from the IP/TCP/etc layers that could 
> be used for work placement.

The cache miss is really the elephant in the room.  If the network card
supports multiple RX rings with separate interrupts and a stable
hash-based distribution (one that includes IP+port src+dst), the rings
can be bound to different CPUs.  It is very important to maintain the
packet order for flows that go through the router; otherwise TCP and
VoIP will suffer.
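
To illustrate what "stable hash based distribution" means here, a minimal
sketch in C follows.  It is not taken from any driver: the struct
flow_key layout and the flow_hash()/flow_to_queue() helpers are
hypothetical names made up for this example, and FNV-1a is only a
stand-in for whatever hash the hardware really computes (RSS-capable
NICs typically use a Toeplitz hash).  The point is that every packet of
a flow yields the same hash, so it always lands on the same RX ring and
CPU, and per-flow ordering is kept.

```c
#include <stdint.h>

/* Hypothetical flow key: IP+port src+dst plus the protocol number.
 * Values are assumed to be in host byte order for simplicity. */
struct flow_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t  proto;
};

/* One FNV-1a round: XOR in a byte, then multiply by the FNV prime. */
static uint32_t fnv1a_byte(uint32_t h, uint8_t b)
{
    h ^= b;
    h *= 16777619u;
    return h;
}

/* Feed a 32-bit value into the hash, least significant byte first. */
static uint32_t fnv1a_u32(uint32_t h, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        h = fnv1a_byte(h, (uint8_t)(v >> (8 * i)));
    return h;
}

/* Hash the key field by field so struct padding never enters the hash.
 * The result depends only on the key, so it is stable per flow. */
static uint32_t flow_hash(const struct flow_key *k)
{
    uint32_t h = 2166136261u;   /* FNV-1a 32-bit offset basis */

    h = fnv1a_u32(h, k->src_ip);
    h = fnv1a_u32(h, k->dst_ip);
    h = fnv1a_u32(h, ((uint32_t)k->src_port << 16) | k->dst_port);
    h = fnv1a_byte(h, k->proto);
    return h;
}

/* Pick an RX ring (and thus a CPU) for the flow.  nqueues must be > 0. */
static unsigned flow_to_queue(const struct flow_key *k, unsigned nqueues)
{
    return flow_hash(k) % nqueues;
}
```

Note that this sketch hashes each direction of a connection separately,
so A->B and B->A may end up on different rings.  That is fine for
forwarding, where only the order within one direction matters.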

-- 
Andre


