Date:      Mon, 7 Jul 2008 11:48:37 +0100 (BST)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        Andre Oppermann <andre@freebsd.org>
Cc:        FreeBSD Net <freebsd-net@freebsd.org>, Paul <paul@gtcomm.net>
Subject:   Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]
Message-ID:  <20080707114538.K63144@fledge.watson.org>
In-Reply-To: <4871E85C.8090907@freebsd.org>
References:  <4867420D.7090406@gtcomm.net> <200806301944.m5UJifJD081781@lava.sentex.ca> <20080701004346.GA3898@stlux503.dsto.defence.gov.au> <alpine.LFD.1.10.0807010257570.19444@filebunker.xip.at> <20080701010716.GF3898@stlux503.dsto.defence.gov.au> <alpine.LFD.1.10.0807010308320.19444@filebunker.xip.at> <486986D9.3000607@monkeybrains.net> <48699960.9070100@gtcomm.net> <ea7b9c170806302005n2a66f592h2127f87a0ba2c6d2@mail.gmail.com> <20080701033117.GH83626@cdnetworks.co.kr> <ea7b9c170806302050p2a3a5480t29923a4ac2d7c852@mail.gmail.com> <4869ACFC.5020205@gtcomm.net> <4869B025.9080006@gtcomm.net> <486A7E45.3030902@gtcomm.net> <486A8F24.5010000@gtcomm.net> <486A9A0E.6060308@elischer.org> <486B41D5.3060609@gtcomm.net> <4871E85C.8090907@freebsd.org>


On Mon, 7 Jul 2008, Andre Oppermann wrote:

> Distributing the interrupts and taskqueues among the available CPUs gives 
> concurrent forwarding with bi- or multi-directional traffic. All incoming 
> traffic from any particular interface is still serialized though.

... although not on multiple input queue-enabled hardware and drivers.  While 
I've really only focused on local traffic performance with my 10Gbps Chelsio 
setup, it should be possible to do packet forwarding from multiple input 
queues using that hardware and driver today.

I'll update the netisr2 patches, which allow work to be pushed to multiple 
CPUs from a single input queue.  However, these necessarily take a cache miss 
or two on packet header data in order to break out the packets from the input 
queue into flows that can be processed independently without ordering 
constraints, so if those cache misses on header data are a big part of the 
performance of a configuration, load balancing in this manner may not help. 
What would be neat is if the cards without multiple input queues could still 
tag receive descriptors with a flow identifier generated from the IP/TCP/etc 
layers that could be used for work placement.
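
As a rough illustration of the flow-splitting idea above (this is not the netisr2 code, only a user-space sketch), the dispatcher reads the header fields that identify a flow, hashes them, and uses the hash to pick a CPU. Packets of one flow always land on the same per-CPU queue, so their order is preserved, while different flows can be processed in parallel. The names flow_key, pick_cpu and dispatch, the tuple layout of a "packet", and the use of CRC32 as the hash are all assumptions made for the example. Reading the header fields in dispatch is the step that would cost the cache miss in a real kernel.

```python
import socket
import struct
import zlib


def flow_key(src_ip, dst_ip, proto, src_port, dst_port):
    """Pack the header fields that identify a flow into a byte string."""
    return struct.pack(
        "!4s4sBHH",
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
        proto,
        src_port,
        dst_port,
    )


def pick_cpu(key, ncpus):
    """Map a flow key to a CPU index. CRC32 is an arbitrary choice here."""
    return zlib.crc32(key) % ncpus


def dispatch(packets, ncpus):
    """Split one input queue into per-CPU queues, keeping per-flow order.

    Each packet is a hypothetical tuple:
    (src_ip, dst_ip, proto, src_port, dst_port, payload).
    """
    queues = [[] for _ in range(ncpus)]
    for pkt in packets:
        # Touching the header fields is the part that would miss in cache.
        src_ip, dst_ip, proto, src_port, dst_port = pkt[:5]
        cpu = pick_cpu(flow_key(src_ip, dst_ip, proto, src_port, dst_port), ncpus)
        queues[cpu].append(pkt)
    return queues
```

A hardware-supplied flow identifier in the receive descriptor, as wished for above, would replace the flow_key step, so the dispatcher would not need to read the packet headers at all.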

Robert N M Watson
Computer Laboratory
University of Cambridge

>
> -- 
> Andre
>
>> I would be willing to set up test equipment (several servers plugged into a 
>> switch) with ipkvm and power port access
>> if someone or a group of people want to figure out ways to improve the 
>> routing process, ipfw, and lagg.
>> 
>> Maximum PPS with one ipfw rule on UP:
>> tops out about 570Kpps.. almost 200kpps lower ? (frown)
>> 
>> I'm going to drop in a 3ghz opteron instead of the 2ghz 2212 that's in here 
>> and see how that scales, using UP same kernel etc I have now.
>> 
>> 
>> 
>> 
>> 
>> Julian Elischer wrote:
>>> Paul wrote:
>>>> ULE without PREEMPTION is now yielding better results.
>>>>         input          (em0)           output
>>>>   packets  errs      bytes    packets  errs      bytes colls
>>>>    571595 40639   34564108          1     0        226     0
>>>>    577892 48865   34941908          1     0        178     0
>>>>    545240 84744   32966404          1     0        178     0
>>>>    587661 44691   35534512          1     0        178     0
>>>>    587839 38073   35544904          1     0        178     0
>>>>    587787 43556   35540360          1     0        178     0
>>>>    540786 39492   32712746          1     0        178     0
>>>>    572071 55797   34595650          1     0        178     0
>>>>  *OUCH, IPFW HURTS..
>>>> loading ipfw, and adding one ipfw rule allow ip from any to any drops 
>>>> 100Kpps off :/ what's up with THAT?
>>>> unloaded ipfw module and back 100kpps more again, that's not right with 
>>>> ONE rule.. :/
>>> 
>>> ipfw needs to gain a lock on the firewall before running,
>>> and is quite complex..  I can believe it..
>>> 
>>> in FreeBSD 4.8 I was able to use ipfw and filter 1Gb between two 
>>> interfaces (bridged) but I think it has slowed down since then due to the 
>>> SMP locking.
>>> 
>>> 
>>>> 
>>>> em0 taskq is still jumping cpus.. is there any way to lock it to one cpu 
>>>> or is this just a function of ULE
>>>> 
>>>> running a tar czpvf all.tgz *  and seeing if pps changes..
>>>> negligible.. guess scheduler is doing its job at least..
>>>> 
>>>> Hmm. even when it's getting 50-60k errors per second on the interface I 
>>>> can still SCP a file through that interface although it's not fast.. 
>>>> 3-4MB/s..
>>>> 
>>>> You know, I wouldn't care if it added 5ms latency to the packets when it 
>>>> was doing 1mpps as long as it didn't drop any.. Why can't it do that? 
>>>> Queue them up and do them in bigggg chunks so none are 
>>>> dropped........hmm?
>>>> 
>>>> 32 bit system is compiling now..  won't do > 400kpps with GENERIC kernel, 
>>>> whereas 64 bit did 450k with GENERIC, although that could be
>>>> the difference between opteron 270 and opteron 2212..
>>>> 
>>>> Paul
>>>> 
>>>> _______________________________________________
>>>> freebsd-net@freebsd.org mailing list
>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>> 
>>> 
>> 
>> 
>> 
>
>


