Date: Sun, 30 Dec 2007 17:49:33 +0000 (GMT)
From: Robert Watson <rwatson@FreeBSD.org>
To: Tiffany Snyder <tiffany.snyder@gmail.com>
Cc: freebsd-net@freebsd.org, Andre Oppermann <andre@freebsd.org>
Subject: Re: Routing SMP benefit
Message-ID: <20071230172808.J7115@fledge.watson.org>
In-Reply-To: <b63e753b0712281551u52894ed9mb0dd55a988bc9c7a@mail.gmail.com>
References: <43B45EEF.6060800@x-trader.de> <43B47CB5.3C0F1632@freebsd.org> <b63e753b0712281551u52894ed9mb0dd55a988bc9c7a@mail.gmail.com>

On Fri, 28 Dec 2007, Tiffany Snyder wrote:

> What piqued my attention was the note that our forwarding performance
> doesn't scale with multiple CPUs. Which means there's a lot of work to be
> done :-) Have we taken a look at OpenSolaris' Surya
> (http://www.opensolaris.org/os/community/networking/surya-design.pdf)
> project? They allow multiple readers/single writer on the radix_node_head
> (and not a mutex as we do) and we may be able to do the same to gain some
> parallelism. There are other things in Surya that exploit multiple CPUs.
> It's definitely worth a read. DragonFlyBSD seems to achieve parallelism by
> classifying packets into flows and then redirecting the flows to different
> CPUs. OpenSolaris also does something similar. We can definitely think along
> those lines.
>
> NOTE: 1) I said multiple instead of dual CPUs on purpose. 2) I mentioned
> OpenSolaris and DragonFlyBSD as examples and to acknowledge the work they
> are doing and to show that FreeBSD is far behind and is losing its lustre
> on continuing to be the networking platform of choice.

Thanks for your kind comments, we always appreciate it when people tell us
we've lost our lustre and are far behind. A few more substantive comments:

Kip Macy has been floating a patch to move to reader-writer locking on the
routing code, but I'm not sure how mature it is. Certainly, we are aware
that routing lock contention is a key issue we need to look at for
forwarding performance. If he doesn't follow up to this thread, give him a
ping for a copy of the patch.

I have a prototype work dispatch system for TCP/IP on FreeBSD based on
packet flow classification, such as the ones you point at, which can be
found in the FreeBSD Perforce server. However, life is a bit more
complicated than just identifying flows and assigning them to CPUs.

In FreeBSD 7, we perform direct dispatch of the network stack from the
interrupt thread in order to avoid context switch overhead, which means
there isn't a natural deferral point, such as the netisr dispatch. In
practice, for high-performance devices, this is highly desirable, both to
avoid dispatch overheads and because 10 Gbps cards almost always support
in-hardware dispatch to multiple input queues. This is much preferable to
pulling packet headers to a single CPU in one interrupt thread, and then
bouncing the cache lines for the packets to other CPUs for actual
processing. Our Chelsio 10 Gbps driver, for example, supports work dispatch
to multiple input threads today, so you will already see parallel
processing of input flows from that hardware via multiple ithreads.

For other devices, where either the hardware or the device driver doesn't
support in-hardware demux, software processing is a good idea, but in
order for that to pay off, routing lock granularity and TCP input lock
granularity both need attention. Per the FreeBSD 8 TCP projects page, I've
begun work on TCP locking and affinity, but I won't have much substantive
progress to show for a couple of months. I can provide netisr parallel
dispatch patches, a bit dated but probably not hard to update, on request.

Suffice it to say, we are aware that we have more work to do in the area
of parallel work distribution and, by implication, lock granularity in some
of our network stack subsystems. However, I think we're quite well placed
to do it, as we already ship the OS with parallel input queue processing
due to direct dispatch, and have a solid and generally fairly granular
locking implementation.

Robert N M Watson
Computer Laboratory
University of Cambridge

>
> Thanks,
>
> Tiffany.
>
>
> On 12/29/05, Andre Oppermann <andre@freebsd.org> wrote:
>
>> Markus Oestreicher wrote:
>>>
>>> Currently running a few routers on 5-STABLE I have read the
>>> recent changes in the network stack with interest.
>>
>> You should run 6.0R. It contains many improvements over 5-STABLE.
>>
>>> A few questions come to my mind:
>>>
>>> - Can a machine that mainly routes packets between two em(4)
>>> interfaces benefit from a second CPU and SMP kernel? Can both
>>> CPUs process packets from the same interface in parallel?
>>
>> My testing has shown that a machine can benefit from it but not
>> much in the forwarding performance. The main benefit is the
>> prevention of livelock if you have very high packet loads. The
>> second CPU on SMP keeps on doing all userland tasks and running
>> routing protocols. Otherwise your BGP sessions or OSPF hellos
>> would stop and remove you from the routing cloud.
>>
>>> - From reading the lists it appears that net.isr.direct
>>> and net.ip.fastforwarding are doing similar things. Should
>>> they be used together or rather not?
>>
>> net.inet.ip.fastforwarding has precedence over net.isr.direct and
>> enabling both at the same time doesn't gain you anything. Fastforwarding
>> is about 30% faster than all other methods available, including
>> polling. On my test machine with two em(4) and an AMD Opteron 852
>> (2.6GHz) I can route 580'000 pps with zero packet loss on -CURRENT.
>> An upcoming optimization that will go into -CURRENT in the next
>> few days pushes that to 714'000 pps. Further optimizations are
>> underway to make a stock kernel do close to or above 1'000'000 pps
>> on the same hardware.
>>
>> --
>> Andre
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>
>