Date: Thu, 20 Oct 2005 11:25:25 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
To: Michael VInce <mv@roq.com>
Cc: freebsd-net@freebsd.org, stable@freebsd.org
Subject: Re: Network performance 6.0 with netperf
Message-ID: <20051020110409.A24208@fledge.watson.org>
In-Reply-To: <435754CB.3060501@roq.com>
References: <434FABCC.2060709@roq.com> <20051014205434.C66245@fledge.watson.org> <43564800.3010309@roq.com> <435754CB.3060501@roq.com>
On Thu, 20 Oct 2005, Michael VInce wrote:

> Interestingly, when testing from the gateway itself (B) direct to server
> (C), having 'net.isr.direct=1' slowed down performance to 583mbits/sec.

net.isr.direct works to improve performance in many cases because it (a)
reduces latency, and (b) reduces CPU usage.  However, there are cases where
it can effectively reduce performance, because it reduces the opportunity
for parallelism in those cases.  Specifically, by constraining computation
in the in-bound IP path to occurring in a single thread rather than two
(ithread alone vs. ithread plus netisr), it prevents that computation from
being executed on more than one CPU at a time.  Understanding these cases
is complicated by the fact that there may be multiple ithreads involved.
Let me propose a scenario, which we may be able to confirm by looking at
the output of top -S on the system involved:

In the two-host test case, your experimental host is using three threads
to process packets: the network interface ithread, the netisr thread, and
the netserver thread.  In the three-host test case, where your
experimental host is the forwarding system, you are also using three
threads: the two interface ithreads, and the netisr.

For the two-host case without net.isr.direct, work is split over these
threads usefully, such that they form an execution pipeline passing data
from CPU to CPU, giving useful parallelism.  Specifically, you are likely
seeing significant parallelism between the ithread and the netisr.  By
turning on net.isr.direct, the in-bound IP stack processing occurs
entirely in the ithread, with no work in the netisr, so parallelism is
reduced, reducing the rate of work performed due to more synchronous
waiting for CPU resources.  Another possible issue here is increased delay
in responding to interrupts due to the high level of work occurring in the
ithread, and therefore more packets dropped by the card.  In the
three-host case with net.isr.direct, all work occurs in the two ithreads,
so IP processing in both directions can occur in parallel, whereas without
net.isr.direct, all the IP processing happens in a single thread, limiting
parallelism.

The test to run is to have top -S running on the boxes, and see how much
CPU is used by the various threads in the various test scenarios, and what
the constraining resource is on each box.  For example, if in the
net.isr.direct scenario with two hosts the ithread for your ethernet
interface is between 95% and 100% busy, but with net.isr.direct=0 the work
is better split over threads, that would tend to confirm the above
description.  On the other hand, if in both scenarios the CPUs and threads
aren't maxed out, it might suggest a problem with responsiveness to
interrupts and packets dropped by the card, in which case card statistics
might be worth looking at.  (A rough sketch of the commands for this
experiment appears after the quoted text at the end of this message.)

Robert N M Watson

> B> /usr/local/netperf/netperf -l 10 -H server-C -t TCP_STREAM -i 10,2 -I
> 99,5 -- -m 4096 -s 57344 -S 57344
> Elapsed Throughput - 10^6bits/sec: 583.57
>
> Same test with 'net.isr.direct=0':
> Elapsed Throughput - 10^6bits/sec: 868.94
>
> I have to ask how this can be possible if, when it's being used as a
> router with net.isr.direct=1, it passes traffic at over 900mbits/sec.
>
> Having net.inet.ip.fastforwarding=1 doesn't affect the performance in
> these B to C tests.
>
> I believe faster performance may still be possible, as another rack of
> gear I have, with another AMD64 6.0 RC1 Dell 2850 (Kes), gives me up to
> 930mbits/sec in Apache fetch tests.  I believe it's even faster here
> because it's an AMD64 Apache server, or possibly it just has slightly
> better quality ether cables; as I mentioned before, the Apache server
> for box "C" in the above tests is i386 on 6.0RC1.
>
> This fetch test is only on a switch with no router between them.
>
> spin> fetch -o - > /dev/null http://kes/500megs.zip
> -                  100% of  610 MB   93 MBps
>
> So far from this casual testing I have discovered these things on my
> servers:
> Using 6.0 on SMP servers gives a big boost in network performance over
> 5.x SMP, using i386 or AMD64.
> FreeBSD as a router on gigabit ethernet with the use of polling gives
> over 2x performance with the right sysctls.
> Needs more testing, but it appears using AMD64 FreeBSD might be better
> than i386 for Apache2 network performance on SMP kernels.
> In single-interface speed tests from the router with polling enabled,
> 'net.isr.direct=1' appears to affect performance.
>
> Regards,
> Mike
>
> Michael VInce wrote:
>
>> Robert Watson wrote:
>>
>>>
>>> On Fri, 14 Oct 2005, Michael VInce wrote:
>>>
>>>> I've been doing some network benchmarking using netperf and just a
>>>> simple 'fetch' on a new network setup, to make sure I am getting the
>>>> most out of the router and servers.  I thought I would post some
>>>> results in case someone can help me with my problems, or if others
>>>> are just interested to see the results.
>>>
>>> Until recently (or maybe still), netperf was compiled with -DHISTOGRAM
>>> by our port/package, which resulted in a significant performance drop.
>>> I believe that the port maintainer and others have agreed to change
>>> it, but I'm not sure if it's been committed yet, or which packages
>>> have been rebuilt.  You may want to manually rebuild it to make sure
>>> -DHISTOGRAM isn't set.
>>>
>>> You may want to try setting net.isr.direct=1 and see what performance
>>> impact that has for you.
>>>
>>> Robert N M Watson
>>
>> I reinstalled netperf to make sure it's the latest.
>>
>> I have also decided to upgrade Server-C (the i386 5.4 box) to 6.0RC1
>> and noticed it gave a large improvement in network performance with an
>> SMP kernel.
>>
>> As with the network setup ( A --- B --- C ), with server B being the
>> gateway, doing a basic 'fetch' from the gateway (B) to the Apache
>> server (C) gives up to 700mbits/sec transfer performance; doing a fetch
>> from server A, thus going through the gateway, gives slower but still
>> decent performance of up to 400mbits/sec.
>>
>> B> fetch -o - > /dev/null http://server-c/file1gig.iso
>> -                 100% of 1055 MB   69 MBps 00m00s
>>
>> A> fetch -o - > /dev/null http://server-c/file1gig.iso
>> -                 100% of 1055 MB   39 MBps 00m00s
>>
>> Netperf from the gateway directly to the Apache server (C): 916mbits/sec
>> B> /usr/local/netperf/netperf -l 20 -H server-C -t TCP_STREAM -i 10,2 -I
>> 99,5 -- -m 4096 -s 57344 -S 57344
>> Elapsed Throughput - 10^6bits/sec: 916.50
>>
>> Netperf from the client machine through the gateway to the Apache
>> server (C): 315mbits/sec
>> A> /usr/local/netperf/netperf -l 10 -H server-C -t TCP_STREAM -i 10,2 -I
>> 99,5 -- -m 4096 -s 57344 -S 57344
>> Elapsed Throughput - 10^6bits/sec: 315.89
>>
>> Client to gateway netperf test shows the direct connection between
>> these machines is fast:
>> 912mbits/sec
>> A> /usr/local/netperf/netperf -l 30 -H server-B -t TCP_STREAM -i 10,2 -I
>> 99,5 -- -m 4096 -s 57344 -S 5734
>> Elapsed Throughput - 10^6bits/sec: 912.11
>>
>> The strange thing now is that in my last post I was able to get faster
>> speeds from server A to C with 'fetch' tests on non-SMP kernels, and
>> slower speeds with netperf tests.  Now I get slightly slower fetch
>> tests but faster netperf tests, with or without SMP on server-C.
>>
>> I was going to test with 'net.isr.dispatch' but the sysctl doesn't
>> appear to exist; doing this returns nothing:
>> sysctl -a | grep 'net.isr.dispatch'
>>
>> I also tried polling, but it's as if that doesn't exist either:
>> ifconfig em3 inet 192.168.1.1 netmask 255.255.255.224 polling
>> ifconfig: polling: Invalid argument
>>
>> When doing netperf tests there was high interrupt usage.
>> CPU states:  0.7% user,  0.0% nice, 13.5% system, 70.0% interrupt,
>> 15.7% idle
>>
>> Also, server B is using its last 2 gigabit ethernet ports, which are
>> listed by pciconf -lv as '82547EI Gigabit Ethernet Controller', while
>> the first 2 are listed as 'PRO/1000 P'.
>> Does anyone know if the PRO/1000 P would be better?
>>
>> em0@pci5:4:0: class=0x020000 card=0x118a8086 chip=0x108a8086 rev=0x03
>> hdr=0x00
>>     vendor   = 'Intel Corporation'
>>     device   = 'PRO/1000 P'
>>
>> em3@pci9:8:0: class=0x020000 card=0x016d1028 chip=0x10768086 rev=0x05
>> hdr=0x00
>>     vendor   = 'Intel Corporation'
>>     device   = '82547EI Gigabit Ethernet Controller'
>>
>> Cheers,
>> Mike
>>
>>>
>>>> The network is currently like this, where machines A and B are the
>>>> Dell 1850s and C is the 2850 x 2 CPU (server C has the Apache2 worker
>>>> MPM on it); server B is the gateway and A is acting as a client for
>>>> fetch and netperf tests.
>>>>
>>>> A --- B --- C
>>>>
>>>> The 2 1850s are running AMD64 FreeBSD 6.0RC1 (A and B), while C is
>>>> running 5.4-stable i386 from Oct 12.
>>>>
>>>> My main problem is that if I compile SMP into machine C (5.4-stable),
>>>> the network speed goes down to a range between 6mbytes/sec and
>>>> 15mbytes/sec on SMP.
>>>> If I use the GENERIC kernel the performance goes up to what I have
>>>> shown below, which is around 65megabytes/sec for a 'fetch' get test
>>>> from the Apache server and 933mbits/sec for netperf.
>>>> Does anyone know why network performance would be so bad on SMP?
>>>>
>>>> Does anyone think that if I upgrade the i386 SMP server to 6.0RC1 the
>>>> SMP network performance would improve?  This server will be running
>>>> Java, so I need it to be stable, which is the reason I am using i386
>>>> and Java 1.4.
>>>>
>>>> I am happy with the performance of direct machine-to-machine
>>>> (non-SMP), which is pretty much full 1gigabit/sec speeds.
>>>> Going through the gateway server-B seems to drop the speed down a
>>>> bit; for in- and out-direction TCP speed tests using netperf, I get
>>>> around 266mbits/sec from server A through gateway server-B to
>>>> server-C, which is quite adequate for the link I currently have for
>>>> it.
>>>>
>>>> Doing a 'fetch' get for a 1gig file from the Apache server gives good
>>>> speeds of close to 600mbits/sec, but netperf shows its weakness with
>>>> 266mbits/sec.
>>>> This is as fast as I need it to be, but does anyone know the weak
>>>> points on the router gateway to make it faster?  Is this the
>>>> performance I should expect for FreeBSD as a router with gigabit
>>>> ethers?
>>>>
>>>> I have seen 'net.inet.ip.fastforwarding' in some people's router
>>>> setups on the list, but nothing about what it does or what it can
>>>> affect.
>>>> I haven't done any testing with polling yet, but if I can get over
>>>> 900mbits/sec on the interfaces, does polling help with passing
>>>> packets from one interface to the other?
>>>> All machines have PF running; other than that they don't really have
>>>> any sysctls or special kernel options.
>>>>
>>>> Here are some speed benchmarks using netperf and 'fetch' gets.
>>>>
>>>> Server A to server C, with server C using an SMP kernel (and the
>>>> GENERIC kernel further below):
>>>>
>>>> B# /usr/local/netperf/netperf -l 10 -H server-C -t TCP_STREAM -i 10,2
>>>> -I 99,5 -- -m 4096 -s 57344 -S 57344
>>>> TCP STREAM TEST to server-C : +/-2.5% @ 99% conf. : histogram
>>>> Recv   Send    Send
>>>> Socket Socket  Message  Elapsed
>>>> Size   Size    Size     Time     Throughput
>>>> bytes  bytes   bytes    secs.    10^6bits/sec
>>>>
>>>> 57344  57344   4096     10.06      155.99
>>>>
>>>> tank# fetch -o - > /dev/null http://server-C/file1gig.iso
>>>> -                 100% of 1055 MB   13 MBps 00m00s
>>>>
>>>> ##### Using generic non-SMP kernel
>>>> Server A to server C, with server C using the GENERIC kernel.
>>>> A# fetch -o - > /dev/null http://server-C/file1gig.iso
>>>> -                 100% of 1055 MB   59 MBps 00m00s
>>>>
>>>> A# ./tcp_stream_script server-C
>>>>
>>>> /usr/local/netperf/netperf -l 60 -H server-C -t TCP_STREAM -i 10,2 -I
>>>> 99,5 -- -m 4096 -s 57344 -S 57344
>>>>
>>>> Recv   Send    Send
>>>> Socket Socket  Message  Elapsed
>>>> Size   Size    Size     Time     Throughput
>>>> bytes  bytes   bytes    secs.    10^6bits/sec
>>>>
>>>> 57344  57344   4096     60.43      266.92
>>>>
>>>> ------------------------------------
>>>> ###############################################
>>>> Connecting from server-A to B (gateway)
>>>> A# ./tcp_stream_script server-B
>>>>
>>>> ------------------------------------
>>>>
>>>> /usr/local/netperf/netperf -l 60 -H server-B -t TCP_STREAM -i 10,2 -I
>>>> 99,5 -- -m 4096 -s 57344 -S 57344
>>>>
>>>> TCP STREAM TEST to server-B : +/-2.5% @ 99% conf. : histogram
>>>> Recv   Send    Send
>>>> Socket Socket  Message  Elapsed
>>>> Size   Size    Size     Time     Throughput
>>>> bytes  bytes   bytes    secs.    10^6bits/sec
>>>>
>>>> 57344  57344   4096     61.80      926.82
>>>>
>>>> ------------------------------------
>>>> ##########################################
>>>> Connecting from server B (gateway) to server C
>>>> Fetch and Apache2 test
>>>> B# fetch -o - > /dev/null http://server-C/file1gig.iso
>>>> -                 100% of 1055 MB   74 MBps 00m00s
>>>>
>>>> Netperf test
>>>> B# /usr/local/netperf/tcp_stream_script server-C
>>>>
>>>> /usr/local/netperf/netperf -l 60 -H server-C -t TCP_STREAM -i 10,2 -I
>>>> 99,5 -- -m 4096 -s 57344 -S 57344
>>>>
>>>> TCP STREAM TEST to server-C : +/-2.5% @ 99% conf. : histogram
>>>> Recv   Send    Send
>>>> Socket Socket  Message  Elapsed
>>>> Size   Size    Size     Time     Throughput
>>>> bytes  bytes   bytes    secs.    10^6bits/sec
>>>>
>>>> 57344  57344   4096     62.20      933.94
>>>>
>>>> ------------------------------------
>>>>
>>>> Cheers,
>>>> Mike
>>>>
>>>> _______________________________________________
>>>> freebsd-net@freebsd.org mailing list
>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>>>
>>> _______________________________________________
>>> freebsd-stable@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>>> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>>
>> _______________________________________________
>> freebsd-stable@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>
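A minimal sketch of the experiment described above, run on the gateway (B).
The host name server-C and the netperf options are taken from the thread;
the interface name em0 and the exact thread names shown by top are
assumptions and will vary by machine and release, so treat this as an
illustration rather than a recipe:

    # Baseline: queued netisr dispatch.
    sysctl net.isr.direct=0

    # In one terminal, watch how CPU time is split between the interface
    # ithread (typically shown as something like "irqNN: em0"), the netisr
    # (typically "swi1: net"), and netserver:
    top -S

    # In another terminal, drive traffic as in the earlier tests:
    /usr/local/netperf/netperf -l 30 -H server-C -t TCP_STREAM \
        -- -m 4096 -s 57344 -S 57344

    # Repeat with direct dispatch and compare which thread saturates:
    sysctl net.isr.direct=1
    /usr/local/netperf/netperf -l 30 -H server-C -t TCP_STREAM \
        -- -m 4096 -s 57344 -S 57344

    # Between runs, look for input errors/drops on the card:
    netstat -i
    netstat -w 1 -I em0

If the ithread sits near 100% with net.isr.direct=1 while other CPUs are
idle, that matches the pipeline explanation above; if nothing is saturated
in either case, the interface error counters are the next place to look.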