Date:      Wed, 14 Apr 2010 11:34:45 -0500
From:      Adam Vande More <amvandemore@gmail.com>
To:        Andriy Gapon <avg@freebsd.org>
Cc:        alc@freebsd.org, Maho NAKATA <chat95@mac.com>, alan.l.cox@gmail.com, freebsd-stable@freebsd.org, als@modulus.org
Subject:   Re: How to reproduce: Re: Only 70% of theoretical peak performance on FreeBSD 8/amd64, Corei7 920
Message-ID:  <x2k6201873e1004140934z6f7518b9j72ffd9e1adc1ad49@mail.gmail.com>
In-Reply-To: <4BC5DEB4.1090208@freebsd.org>
References:  <h2yca3526251004122230l909bc93ey916d7fe0dd24fd33@mail.gmail.com> <4BC402B7.5000400@modulus.org> <v2gca3526251004122322i709c523ct4f93bcf75a778a8e@mail.gmail.com> <20100414.082109.29593248145846106.chat95@mac.com> <4BC5DEB4.1090208@freebsd.org>

On Wed, Apr 14, 2010 at 10:26 AM, Andriy Gapon <avg@freebsd.org> wrote:

> on 14/04/2010 02:21 Maho NAKATA said the following:
> > 4. run dgemm.
> > % ./dgemm
> > n: 3000
> > time : 134.648208 or 16.910525
> > Mflops : 31943.419695
> > n: 3100
> > time : 148.122279 or 18.615284
> > Mflops : 32017.357408
> > n: 3200
> > time : 162.488885 or 20.430651
> > Mflops : 32087.318295
> > n: 3300
> > time : 178.497079 or 22.446093
> > Mflops : 32030.420499
> > n: 3400
> > time : 195.550715 or 24.586152
> > Mflops : 31981.873273
> > n: 3500
> > time : 213.403379 or 26.825058
> > Mflops : 31975.513363
> > n: 3600
> > ...
> > above output is on Core i7 920 (2.66GHz; TurboBoost on)
>
> My results:
> $ ./dgemm
> n: 3000
> time : 54.151302 or 28.189781
> Mflops : 19162.263125
> n: 3100
> time : 60.157449 or 32.214141
> Mflops : 18501.570537
> n: 3200
> time : 65.753191 or 34.114872
> Mflops : 19216.393378
>
> CPU:
> CPU: Intel(R) Core(TM)2 Duo CPU     E7300  @ 2.66GHz (2653.35-MHz K8-class CPU)
>  Origin = "GenuineIntel"  Id = 0x10676  Stepping = 6
>  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>  Features2=0x8e39d<SSE3,DTES64,MON,DS_CPL,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1>
>  AMD Features=0x20100800<SYSCALL,NX,LM>
>  AMD Features2=0x1<LAHF>
>  TSC: P-state invariant
> ⋮
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> FreeBSD/SMP: 1 package(s) x 2 core(s)
>
> FreeBSD:
> FreeBSD 8.0-STABLE r205070 amd64
>
> Please note that the system was not dedicated to the test, I had
> Xorg+KDE3+thunderbird+skype+kopete+konsole(s) plus a bunch of daemons
> running.
> That probably explains irregularities in the results.
>
> I am not sure how exactly theoretical maximum should be calculated, I used
> 2 * 2.66G * 4 ≈ 21.3G.
> And so 19.2G / 21.3G ≈ 90%.
>
> Not as bad as what you get.
> Although not as good as what you report for Linux.
> But given the impurity and imprecision of my test…
>
> P.S. the machine is two-core obviously :-)
> Don't have anything with more cpus/cores handy.
>
> P.P.S. Having _only glimpsed_ at the source I think that there are some things
> that GotoBLAS doesn't try to do on FreeBSD that it tries to do on Linux.
> Like setting CPU-affinity for the threads, or avoiding HTT pseudo-cores.
> Those things are possible on FreeBSD.
> Perhaps, there are more things like that.
>
>
Mine is also a live desktop environment (KDE4+):

n: 3000
time : 116.377609 or 16.696066
Mflops : 32353.729042
n: 3100
time : 127.230336 or 17.274867
Mflops : 34501.695325
n: 3200
time : 139.018175 or 18.342056
Mflops : 35741.074976
n: 3300
time : 152.519365 or 20.154714
Mflops : 35671.942364
n: 3400
time : 166.248145 or 21.952426
Mflops : 35818.874941
n: 3500
time : 182.565385 or 24.492597
Mflops : 35020.581786
n: 3600
time : 198.551018 or 26.906992
Mflops : 34689.094992
n: 3700
time : 215.428919 or 28.574964
Mflops : 35462.294838
n: 3800
^C

CPU: Intel(R) Core(TM) i7 CPU         870  @ 2.93GHz (3313.71-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x106e5  Family = 6  Model = 1e  Stepping = 5
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x98e3fd<SSE3,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT>
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant

That's about 67% utilization, and turning off HTT drops it further.  HTT on
the newer cores is good, not bad.
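
For reference, the utilization figures in this thread appear to follow from
peak = cores * clock * flops per cycle per core. Below is a minimal Python
sketch of that arithmetic. It assumes 4 double-precision flops per cycle per
core (the factor in the quoted "2 * 2.66G * 4") and, for the i7 870, 4
physical cores at the 3313.71 MHz clock from the CPU line above. The core
count and clock for the i7 870 are assumptions; which values were actually
used for the 67% figure is not stated. The function names are illustrative
only.

```python
def peak_mflops(cores, clock_mhz, flops_per_cycle=4):
    """Theoretical peak in Mflops: cores * clock (MHz) * flops per cycle.

    flops_per_cycle=4 is an assumption taken from the quoted estimate.
    """
    return cores * clock_mhz * flops_per_cycle


def utilization(measured_mflops, cores, clock_mhz, flops_per_cycle=4):
    """Fraction of theoretical peak reached by a measured Mflops value."""
    return measured_mflops / peak_mflops(cores, clock_mhz, flops_per_cycle)


# Core 2 Duo E7300: 2 cores at the nominal 2.66 GHz, n = 3200 result.
e7300 = utilization(19216.393378, cores=2, clock_mhz=2660.0)

# Core i7 870: assumed 4 physical cores at 3313.71 MHz, n = 3700 result.
i7_870 = utilization(35462.294838, cores=4, clock_mhz=3313.71)

print(f"E7300:  {e7300:.1%} of peak")
print(f"i7 870: {i7_870:.1%} of peak")
```

Under these assumptions the two ratios come out at roughly 90% and 67%,
matching the numbers quoted in the thread. Using the nominal 2.93 GHz clock
for the i7 870 instead would give a higher utilization.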





-- 
Adam Vande More


