Date: Sat, 23 Sep 2006 00:09:33 +0200
From: Andre Oppermann <andre@freebsd.org>
To: Andrew Gallatin <gallatin@cs.duke.edu>
Cc: alc@freebsd.org, freebsd-net@freebsd.org, freebsd-current@freebsd.org, tegge@freebsd.org
Subject: Re: Much improved sendfile(2) kernel implementation
Message-ID: <45145F1D.8020005@freebsd.org>
In-Reply-To: <17683.63162.919620.114649@grasshopper.cs.duke.edu>
References: <4511B9B1.2000903@freebsd.org> <17683.63162.919620.114649@grasshopper.cs.duke.edu>

Andrew Gallatin wrote:
>
> Between TSO and your sendfile changes, things are looking up!
>
> Here are some Myri10GbE 1500 byte results from a 1.8GHz UP
> FreeBSD/amd64 machine (AMD Athlon(tm) 64 Processor 3000+) sending to a
> 2.0GHz SMP Linux/x86_64 machine (AMD Athlon(tm) 64 X2 Dual Core Processor
> 3800+) running 26.17.7smp and our 1.1.0 Myri10GE driver (with LRO).
> I used a linux receiver because LRO is the only way to receive
> standard frames at line rate (without a TOE).
>
> These tests are all for sendfile of a 10MB file in /var/tmp:
>
> % netperf242 -Hrome-my -tTCP_SENDFILE -F /var//tmp/zot -T,1 -c -C -- -s393216

You should use -m5M as well.  netperf is kinda dumb and only does socket
buffer sized sendfile calls, whereas sendfile really works best (especially
new-sendfile) when it can chew on a really big chunk of the file without
having to return to userland for every ~380k, as in this case.

> The -T,1 is required to force the netserver to use a different core
> than the interrupt handler is bound to on the linux machine.  BTW,
> it would be really nice if FreeBSD supported CPU affinity for processes
> and interrupt handlers..

I have a gross version of that in my tree.  The kernel itself supports it
but it's not yet exposed to userland for manual intervention.

> I did a number of runs with TSO and the patch applied and found that
> setting the send-side socket buffer size to 393216 gave the best
> performance in that case.  I used this size for all tests, but it is
> possible there is a different sweet spot for other configurations.
> Note that linux auto-tunes socket buffer sizes, so I omitted the --
> -s393216 for linux.

We're getting there too.  First for the send buffer.  Again some gross
code in my tree.  Not really tested yet though.

>
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
>
> Without patch:
>  87380 393216 393216    10.00      2163.08   100.00   19.35    3.787   1.466
> Without patch + TSO:
>  87380 393216 393216    10.00      4367.18   71.54    42.07    1.342   1.578
> With patch:
>  87380 393216 393216    10.01      1882.73   86.15    18.43    3.749   1.604
> With patch + TSO:
>  87380 393216 393216    10.00      6961.08   47.69    60.11    0.561   1.415

Be a bit careful with the CPU usage figures.  The numbers netperf reports
are quite a bit higher than those reported by time(1).  And there are some
differences in how FreeBSD and Linux do their statistical measurements of
user and system time.  This doesn't change the throughput number though.
But see the -m5M option.  New sendfile is really optimized to chew on a
large file (larger than the socket buffer size), as normally happens in
reality.

> For comparison, if I reboot the sender into RHEL (Linux 2.6.9-11.EL x86_64):
>  87380  65536  65536    10.01      9333.00   28.98    75.23    0.254   1.321
>
>
> The above results are the median result for 5 runs at each setting.

How large is the variance between the runs?

-- 
Andre
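
For readers following the numbers in this message, here is a rough Python sketch of two back-of-envelope checks. It is not taken from netperf or from the sendfile patch. It assumes that the "us/KB" service demand column means CPU microseconds per 1024 bytes transferred, summed over all CPUs; that "10MB" means 10 * 2^20 bytes; and that -m5M means 5 * 2^20 bytes. The function names are made up for illustration.

```python
import math

KB = 1024  # assumption: the "KB" in netperf's us/KB column is 1024 bytes


def service_demand_us_per_kb(cpu_percent, throughput_mbit, ncpus=1):
    """CPU microseconds spent per KB transferred (assumed definition)."""
    kb_per_sec = throughput_mbit * 1e6 / 8 / KB
    cpu_us_per_sec = cpu_percent / 100.0 * ncpus * 1e6
    return cpu_us_per_sec / kb_per_sec


def min_sendfile_calls(file_size, bytes_per_call):
    """Fewest sendfile(2) calls needed if each call sends at most
    bytes_per_call bytes (short writes would only add more calls)."""
    return math.ceil(file_size / bytes_per_call)


FILE_SIZE = 10 * 2**20  # assumption: "10MB" test file is 10 * 2^20 bytes
SOCKBUF = 393216        # from -s393216 on the netperf command line
BIG_SEND = 5 * 2**20    # assumption: -m5M means 5 * 2^20 bytes

if __name__ == "__main__":
    # Sender (UP machine, 1 CPU), "Without patch" and "With patch + TSO" rows.
    print(service_demand_us_per_kb(100.00, 2163.08))
    print(service_demand_us_per_kb(47.69, 6961.08))
    # Receiver (dual core, 2 CPUs), "Without patch" row.
    print(service_demand_us_per_kb(19.35, 2163.08, ncpus=2))
    # Trips into the kernel per pass over the file for each send size.
    print(min_sendfile_calls(FILE_SIZE, SOCKBUF))
    print(min_sendfile_calls(FILE_SIZE, BIG_SEND))
```

Under those assumptions the computed service demand agrees with the table to within rounding (3.787 and 0.561 us/KB on the sender, 1.466 us/KB on the receiver), and one pass over the file needs at least 27 socket-buffer-sized calls but only 2 calls at 5M each. The second figure only counts system calls; it says nothing about how much time each call costs.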
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?45145F1D.8020005>