Date: Fri, 17 Feb 2006 10:01:01 -0500 (EST)
From: Andrew Gallatin <gallatin@cs.duke.edu>
To: freebsd-amd64@freebsd.org
Subject: non-temporal copyin/copyout?
Message-ID: <17397.58669.457047.277510@grasshopper.cs.duke.edu>
Has anybody considered using non-temporal copies for the in-kernel bcopy on amd64? A quick test in userspace shows that for large copies, an adapted pagecopy (from amd64/amd64/support.S) more than doubles bcopy bandwidth, from 1.2GB/s to 2.5GB/s, on my Athlon64 X2 3800+.

I'm bringing this up because I've noticed that FreeBSD 10GbE performance is far below Solaris/amd64 and Linux/x86_64 when using the PCI-e 10GbE adaptor that I'm writing drivers for. For example, Solaris can receive a netperf TCP stream at 9.75Gb/sec while using only 47% CPU as measured by vmstat (i.e., a little less than a single core). In contrast, FreeBSD is limited to 7.7Gb/sec and uses nearly 90% CPU. When profiling with hwpmc, I see up to 70% of the time spent in copyout.

Thanks,

Drew
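For anyone who wants to try the idea in userspace, here is a rough sketch of a non-temporal copy. It is not the adapted pagecopy from support.S mentioned above; it swaps in SSE2 intrinsics with streaming stores instead. The name nt_copy and the 64-byte unrolling are assumptions chosen for illustration, and it assumes an x86 CPU with SSE2 (always present on amd64).

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics: _mm_loadu_si128, _mm_stream_si128 */
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * nt_copy: hypothetical userspace helper, NOT a FreeBSD kernel API.
 * Copies len bytes from src to dst, using non-temporal (streaming)
 * stores for the bulk so the destination does not displace useful
 * data in the cache. The buffers must not overlap.
 */
static void
nt_copy(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	/* Streaming stores need a 16-byte aligned destination, so copy
	 * a short head normally until dst reaches that alignment. */
	size_t head = (16 - ((uintptr_t)d & 15)) & 15;
	if (head > len)
		head = len;
	memcpy(d, s, head);
	d += head;
	s += head;
	len -= head;

	/* Bulk: 64 bytes per iteration; unaligned loads, streaming stores. */
	while (len >= 64) {
		__m128i a = _mm_loadu_si128((const __m128i *)(s + 0));
		__m128i b = _mm_loadu_si128((const __m128i *)(s + 16));
		__m128i c = _mm_loadu_si128((const __m128i *)(s + 32));
		__m128i e = _mm_loadu_si128((const __m128i *)(s + 48));
		_mm_stream_si128((__m128i *)(d + 0), a);
		_mm_stream_si128((__m128i *)(d + 16), b);
		_mm_stream_si128((__m128i *)(d + 32), c);
		_mm_stream_si128((__m128i *)(d + 48), e);
		d += 64;
		s += 64;
		len -= 64;
	}

	/* Streaming stores are weakly ordered; fence before the data is
	 * relied upon by other code. */
	_mm_sfence();

	/* Copy whatever is left (fewer than 64 bytes) normally. */
	memcpy(d, s, len);
}
```

Timing this against memcpy on buffers much larger than the last-level cache should show whether the effect reproduces on a given machine. For small copies that the receiver reads right away, bypassing the cache may well be a loss, so a real copyout would probably want a size threshold.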