Date: Sun, 04 Feb 1996 22:25:48 +0000
From: Matt Thomas <matt@lkg.dec.com>
To: Peter Berger <peterb@telerama.lm.com>
Cc: hackers@freebsd.org
Subject: High speed Routing (was Re: Multi-Port Async Cards)
Message-ID: <199602042225.WAA06856@whydos.lkg.dec.com>
In-Reply-To: Your message of "Fri, 02 Feb 1996 12:11:55 EST." <Pine.BSI.3.91.960202115948.8158A-100000@ivory.lm.com>

In <Pine.BSI.3.91.960202115948.8158A-100000@ivory.lm.com>, you wrote:

> That's right; and you can't add a 100Mbp/s port to a PC that will
> actually route that many packets for $134, or for any price.  Reference
> the very interesting TCP performance tests at Usenix which showed that at
> Ethernet MTUs, Pentium boxes running TCP/IP over the loopback interface
> could only reach about 40Mb/s (this number went up if you increased the
> MTU ... the cost is in the packet processing, not the raw byte speed).

# ifconfig lo0 mtu 1536
# ./ttcp -f m -t -s -n 8192 0
ttcp-t: buflen=8192, nbuf=8192, align=16384/0, port=5001  tcp  -> 0
ttcp-t: socket
ttcp-t: connect
ttcp-t: 67108864 bytes in 7.61 real seconds = 67.24 Mbit/sec +++
ttcp-t: 8192 I/O calls, msec/call = 0.95, calls/sec = 1075.83
ttcp-t: 0.0user 4.1sys 0:07real 55% 26i+263d 250maxrss 0+2pf 1330+4920csw

I just did that on my P90 (running 2.1.0-RELEASE) using the loopback
device and an Ethernet MTU.  That seems to be a bit more than 40Mb/s.

However, I do agree that FreeBSD as it is today is not suitable as a
high speed router.  While the system does have the raw processing power
to do high-speed routing, the network infrastructure is not up to the
task.  Not surprisingly (given its history), the IP networking code is
heavily oriented to being a host, not a router.

Just consider for a moment that you want to be a 100baseT router between
2 LANs.  That means you should be able to gracefully receive ~150,000
packets per interface per second (if you are being flooded with
tinygrams).  No matter what you do, you will not be able to handle
300,000 interrupts per second.  You need to switch to polling (even
better would be to switch to polling when the interrupt rate reaches
some threshold of interrupts per clock tick); however, this does pose a
problem in getting user space applications time to execute.

Let's assume you can deal with 1ms of delay, so you queue, say, 3 ms of
buffers at each interface (1000 buffers).
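
[Editorial illustration, not part of the original message.] The ~150,000 packets per second figure above can be checked with a little arithmetic. The sketch below assumes minimum-size 64-byte Ethernet frames plus an 8-byte preamble and a 12-byte inter-frame gap per frame; none of these constants are stated in the message, and the function name is hypothetical.

```python
# Worst-case packet arrival rate on a 100 Mb/s Ethernet link.
# Assumptions (not stated in the message): 64-byte minimum frames,
# 8 bytes of preamble/SFD and a 12-byte inter-frame gap per frame.

LINK_BITS_PER_SEC = 100_000_000
MIN_FRAME_BYTES = 64
PREAMBLE_BYTES = 8
INTERFRAME_GAP_BYTES = 12


def max_packets_per_second(link_bps: int, frame_bytes: int) -> float:
    """Frames per second when the wire is saturated with frames of one size."""
    bits_on_wire = (frame_bytes + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bps / bits_on_wire


per_interface = max_packets_per_second(LINK_BITS_PER_SEC, MIN_FRAME_BYTES)
both_interfaces = 2 * per_interface

print(f"per interface:   {per_interface:,.0f} packets/sec")
print(f"two interfaces:  {both_interfaces:,.0f} packets/sec")
```

Under these assumptions the rate comes out just under 150,000 packets per second per interface, so one interrupt per packet on two interfaces would mean roughly 300,000 interrupts per second.
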
You don't want to queue 1.5MB of buffer space since if they really are
full size packets, they only consume 30 buffers (45KB).  So you need to
queue smaller buffers and use intelligent buffer management and
scatter/gather to bring down the amount of space used.  Let's use 320
byte buffers (since 5*320 > 1518) and that will bring the amount of
memory to 320KB per interface.

Now you have to process those 300 packets.  You get 3us per packet to do
all the processing for that packet.  Good luck.

One of the major changes I would make would be to move the netisr's to
the drivers and make almost all packet receive and transmit processing
in device drivers run at splnet.  Only in the few places where you have
to twiddle the device would you need to run at splimp.  ether_input (or
ppp_input or ...) would call the protocol input routine directly (one
could even have an inline version of ether_input which just checked for
IPv4 and quickly called ip_input).  The only receive queue would be the
receive buffers on the device itself.

Another change would be to add an mbuf flag which tells the driver
whether the mbuf should be immediately freed or whether it can "linger"
around.  This would allow the transmit code to not be interrupt driven
(unless you are dealing with, for example, NFS buffers).  When you start
a transmit (and there's no room), call your transmit done routine to
free any stale buffers.

Another is to code a "fast" ip_input routine that is specially tuned for
the 99% case of forwarding a packet.

I think I've rambled on enough for now...

Matt Thomas               Internet:   matt@3am-software.com
3am Software Foundry      WWW URL:    http://www.3am-software.com/bio/matt.html
Westford, MA              Disclaimer: I disavow all knowledge of this message
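
[Editorial illustration, not part of the original message.] The buffer sizing and per-packet time budget discussed in the message can be reproduced roughly as follows. The figures (1000 buffers, 1518-byte frames, 320-byte buffers) come from the message; reading "300 packets" as one millisecond of traffic at a combined 300,000 packets per second is an assumption, and the helper names are hypothetical.

```python
# Buffer memory and per-packet time budget, using the figures from the message.
# Interpreting "300 packets" as 1 ms of traffic at 300,000 packets/sec
# (both interfaces combined) is an assumption.

import math

MAX_FRAME_BYTES = 1518       # full-size Ethernet frame, as used in the message
BUFFERS_PER_INTERFACE = 1000
SMALL_BUFFER_BYTES = 320


def queue_bytes(buffer_count: int, buffer_bytes: int) -> int:
    """Total memory for a receive queue of fixed-size buffers."""
    return buffer_count * buffer_bytes


def buffers_per_frame(frame_bytes: int, buffer_bytes: int) -> int:
    """Number of small buffers chained (scatter/gather) to hold one frame."""
    return math.ceil(frame_bytes / buffer_bytes)


def per_packet_budget_us(packets_per_sec: float) -> float:
    """Microseconds available per packet at a given arrival rate."""
    return 1_000_000 / packets_per_sec


full_size = queue_bytes(BUFFERS_PER_INTERFACE, MAX_FRAME_BYTES)
small = queue_bytes(BUFFERS_PER_INTERFACE, SMALL_BUFFER_BYTES)

print(f"1000 full-size buffers: {full_size / 1e6:.2f} MB")
print(f"1000 320-byte buffers:  {small / 1e3:.0f} KB")
print(f"buffers per full frame: {buffers_per_frame(MAX_FRAME_BYTES, SMALL_BUFFER_BYTES)}")
print(f"budget at 300,000 pps:  {per_packet_budget_us(300_000):.2f} us/packet")
```

The numbers line up with the message: about 1.5MB for full-size buffers, 320KB for 320-byte buffers, five small buffers per full-size frame, and a little over 3us of processing time per packet.
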