Date: Wed, 14 Apr 2010 18:26:44 +0300
From: Andriy Gapon <avg@freebsd.org>
To: Maho NAKATA <chat95@mac.com>
Cc: alc@freebsd.org, alan.l.cox@gmail.com, freebsd-stable@freebsd.org, als@modulus.org
Subject: Re: How to reproduce: Re: Only 70% of theoretical peak performance on FreeBSD 8/amd64, Corei7 920
Message-ID: <4BC5DEB4.1090208@freebsd.org>
In-Reply-To: <20100414.082109.29593248145846106.chat95@mac.com>
References: <h2yca3526251004122230l909bc93ey916d7fe0dd24fd33@mail.gmail.com> <4BC402B7.5000400@modulus.org> <v2gca3526251004122322i709c523ct4f93bcf75a778a8e@mail.gmail.com> <20100414.082109.29593248145846106.chat95@mac.com>

on 14/04/2010 02:21 Maho NAKATA said the following:
> 4. run dgemm.
> % ./dgemm
> n: 3000
> time : 134.648208 or 16.910525
> Mflops : 31943.419695
> n: 3100
> time : 148.122279 or 18.615284
> Mflops : 32017.357408
> n: 3200
> time : 162.488885 or 20.430651
> Mflops : 32087.318295
> n: 3300
> time : 178.497079 or 22.446093
> Mflops : 32030.420499
> n: 3400
> time : 195.550715 or 24.586152
> Mflops : 31981.873273
> n: 3500
> time : 213.403379 or 26.825058
> Mflops : 31975.513363
> n: 3600
> ...
> above output is on Core i7 920 (2.66GHz; TurboBoost on)

My results:
$ ./dgemm
n: 3000
time : 54.151302 or 28.189781
Mflops : 19162.263125
n: 3100
time : 60.157449 or 32.214141
Mflops : 18501.570537
n: 3200
time : 65.753191 or 34.114872
Mflops : 19216.393378

CPU:
CPU: Intel(R) Core(TM)2 Duo CPU E7300 @ 2.66GHz (2653.35-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x10676  Stepping = 6
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x8e39d<SSE3,DTES64,MON,DS_CPL,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant
⋮
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)

FreeBSD:
FreeBSD 8.0-STABLE r205070 amd64

Please note that the system was not dedicated to the test: I had
Xorg+KDE3+thunderbird+skype+kopete+konsole(s) plus a bunch of daemons running.
That probably explains the irregularities in the results.

I am not sure exactly how the theoretical maximum should be calculated; I used
2 * 2.66G * 4 ≈ 21.3G.
And so 19.2G / 21.3G ≈ 90%.

Not as bad as what you get, although not as good as what you report for Linux.
But given the impurity and imprecision of my test…

P.S. The machine is two-core, obviously :-)
I don't have anything with more CPUs/cores handy.

P.P.S.
Having _only glanced_ at the source, I think that there are some things that
GotoBLAS tries to do on Linux but doesn't try to do on FreeBSD, such as setting
CPU affinity for the threads or avoiding HTT pseudo-cores.
Those things are possible on FreeBSD.
Perhaps there are more things like that.

--
Andriy Gapon
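
For reference, the peak estimate used in the message above (2 * 2.66G * 4) can be written out as a small calculation. This is only a sketch: the function names are made up for illustration, and reading the factor of 4 as "double-precision floating-point operations per cycle per core" is an assumption, since the message does not say what the 4 stands for. The measured figure is the n = 3200 result from the dgemm output above.

```python
def peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Estimated peak in Gflops: cores * clock (GHz) * flops per cycle per core."""
    return cores * clock_ghz * flops_per_cycle


def efficiency(measured_mflops, peak):
    """Fraction of the estimated peak reached by a measured Mflops figure."""
    return (measured_mflops / 1000.0) / peak


# Core 2 Duo E7300 from the message: 2 cores at 2.66 GHz.
# flops_per_cycle=4 is the assumed meaning of the "* 4" factor.
peak = peak_gflops(cores=2, clock_ghz=2.66, flops_per_cycle=4)

# Best measured result in the message (n = 3200), in Mflops.
best_mflops = 19216.393378

print(f"estimated peak: {peak:.2f} Gflops")
print(f"efficiency:     {efficiency(best_mflops, peak):.1%}")
```

With the numbers from the message this gives an estimated peak of about 21.3 Gflops and an efficiency of roughly 90%, matching the figures quoted above. A different flops-per-cycle assumption or a different effective clock (for example with TurboBoost active on another CPU) would change the result proportionally.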