Date: Thu, 19 Mar 1998 10:22:02 -0800
From: Amancio Hasty <hasty@rah.star-gate.com>
To: lamaster@george.arc.nasa.gov
Cc: freebsd-current@FreeBSD.ORG
Subject: Re: Stream_d benchmark... Wow, there really are differences in
Message-ID: <199803191822.KAA01912@rah.star-gate.com>
In-Reply-To: Your message of "Thu, 19 Mar 1998 10:00:03 PST." <199803191800.KAA01635@george.arc.nasa.gov>

Hi

My PCI chipset is:
chip0: <Intel 82440FX (Natoma) PCI and memory controller>

I am using 60ns EDO.

My compilation line:
gcc -o stream_d stream_d.c second_cpu.c -malign-double -O3

Would be nice to have SDRAM four-way interleaved 8)

        Cheers,
        Amancio

>
> For the discussion of the chipsets, I refer to my previous post.
> This is just to put the numbers in one place so that they can be
> compared to previous numbers.  This discussion should probably be
> elsewhere - in - hardware perhaps?
>
>
> > This is an Asus motherboard dying to double or more my memory system.
> >
> > My PPro200 is about 1.5 years old and I hope that the new 100MHz bus
> > based systems fare better than my system.
>
> > -------------------------------------------------------------
> > Function      Rate (MB/s)   RMS time     Min time     Max time
> > Copy:         113.7778       0.1557       0.1406       0.1719
> > Scale:        107.7895       0.1565       0.1484       0.1719
> > Add:          118.1538       0.2158       0.2031       0.2344
> > Triad:        118.1538       0.2213       0.2031       0.2344
>
> These numbers are quite good.  Hard to tell how much of the
> differences seen are due to board design, aggressive BIOS settings,
> memory technology, and chipset.  And, perhaps, compilers, although
> the compiler can't do anything about the poor PPro200/Natoma write
> bandwidth.  These numbers seem high to me based on what I have read
> previously for generic EDO.  Anybody using BEDO out there?
>
> > > Soeren Schmidt (sos@FreeBSD.org) wrote:
> > > > In reply to Jaye Mathisen who wrote:
> > > >
> > > > Hmm, Then I should be proud of my noname system (p6/200/128MB 72pEDO):
> > > >
> > > > Function      Rate (MB/s)   RMS time     Min time     Max time
> > > > Copy:         117.0286       0.2758       0.2734       0.2812
> > > :
> > > > Triad:        125.3878       0.3917       0.3828       0.4219
>
> Higher yet for generic EDO.
>
> > > > > All boxes are P6-200's, 256MB RAM (all RAM is 60ns FP as far as I know).
> > > > >
> > > > > Box 1 is a SuperMicro P6DNE:
> > > > > Function      Rate (MB/s)   RMS time     Min time     Max time
> > > > > Copy:          60.7395       0.2704       0.2634       0.2832
> > > > > Triad:         71.1647       0.3494       0.3372       0.3565
>
> > > > > Box 2 is a Digital Prioris HX6000
> > > > > Copy:          73.3551       0.2197       0.2181       0.2249
> > > > > Triad:         77.4268       0.3108       0.3100       0.3122
>
> > > > > Box 3 is a Digital Prioris ZX6000
> > > > > Function      Rate (MB/s)   RMS time     Min time     Max time
> > > > > Copy:          84.8807       0.2018       0.1885       0.2834
> > > > > Scale:         97.5461       0.1661       0.1640       0.1720
> > > > > Add:          111.6549       0.2179       0.2149       0.2247
> > > > > Triad:        100.9468       0.2659       0.2377       0.4237
>
> > > > > Box 3 uses 256bit interleaved memory, rather than whatever the
> > > > > "standard" is.
>
>
> The web site for stream is http://www.cs.virginia.edu/stream
> and down in ../standard/Bandwidth.html we see the following
> for x86 boards tested.  Note that some people have complained
> of the difficulty approaching Intel's "Alder" numbers, for the
> Orion chipset.  That board presumably had a very aggressive
> memory design, and used Orion with full memory interleaving.
> Various magazines have reported on what bandwidth the consumer
> actually gets in a typical system with typical software,
> and the picture has usually been unpleasant.  So --
>
> Interesting that some of the numbers above seem to almost
> reach the Alder numbers using Natoma w/ EDO.  I admit I am
> surprised.  Here are a few numbers, with the big systems
> for reference and entertainment, and the PC's at the bottom.
> Note that the highest Intel board tested is a Dell PII_300;
> unfortunately, chipset is not specified.  Note that the
> way this benchmark counts bandwidth (in and out), a copy
> shows twice the bandwidth that, e.g., the *rate* of bcopy()
> would show.
>
>
>
> All results are in MB/s --- 1 MB=10^6 B, *not* 2^20 B
>
> ------------------------------------------------------------------
> Machine ID             ncpus      COPY     SCALE       ADD     TRIAD
> ------------------------------------------------------------------
>
>
> [Big Iron - now that's memory bandwidth.  About 100X the
> bandwidth per CPU of the PCs.  Too bad the CPUs are so
> expensive.]
>
> NEC_SX_4                  32  434784.0  432886.0  437358.0  436954.0
> NEC_SX_4                   1   15983.0   15984.0   15989.0   15898.0
> Cray_T932_321024-3E       32  310721.0  302182.0  359841.0  359270.0
> Cray_T932_321024-3E        1   10653.0   10221.0   13014.0   13682.0
> Cray_C90                   1    6965.4    6965.4    9378.7    9500.7
>
>
>
> [Interesting workstation-server numbers, but, not all up to
> date or the latest models.]
>
> SGI_Origin_2000_2        128   21857.6   23351.7   24459.5   22913.6
> SGI_Origin_2000_1         32    8556.0    8670.0    9733.0    9435.0
> SGI_Origin_2000_1          1     296.0     300.0     315.0     317.0
> IBM_RS6000-591             1     711.1     695.7     750.0     800.0
> DEC_600au_600              1     227.7     223.0     243.5     248.2
> Sun_Ultra2-2200            1     228.5     227.5     258.9     189.9
> HP_C180                    1     262.3     262.3     244.9     242.4
>
>
>
> [PC numbers, unfortunately without the chipset and memory
> technology info which would help sort this out.]
>
> Compaq_Proliant_5000       1     123.1     114.3     141.2     126.3
> Dell_P166s                 1     119.5     102.4     107.5     104.1
> Dell_Pentium_133           1      88.0     125.7     132.0     120.0
> Dell_486_DX-2-66           1      33.3      16.5      22.0      18.8
> Dell_P6_200                1     102.4     102.4     112.9     112.9
> Dell_PII_300               1     188.2     173.0     213.3     188.2
> Gateway_2000_P6-200        1     107.9      89.5     100.5     101.6
> Gateway_2000_P5-133-66     1      91.4     114.3     126.0     114.0
> Intel_Alder_Pentium_Pr     1     140.0     140.0     163.9     167.6
> Intel_Pentium-133          1      84.4      77.1      85.7      85.9
> Intel_Pentium-100          1      85.1      74.4      77.0      75.2
> Intel_Pentium-90           1      46.4      69.9      69.9      69.9
> Intel_Pentium-60           1      37.2      62.1      61.3      58.5
> PC-clone-AMD-486DX-50      1      38.1      26.2      28.6      23.3
> PC-clone-AMD-486DX-80      1      83.9      41.9      39.3      39.3
> Viglen_Pentium_60          1      47.1      61.5      63.1      60.0
> Micron_P6-200              1      98.4      97.4     106.5     105.0
> Micron_P5-120              1      79.3     100.4     109.9     107.7
> Asus_Pentium_180           1      76.2     110.3     109.1     100.0
> Asus_Pentium_200           1      84.2     123.1     123.1     111.6
> Triton_II_Pentium_133      1      93.5     113.3     116.6     110.3
> Triton_II_Pentium_133      1      75.9      85.3      87.8      85.3
> Gigabyte_586HX             1      88.9     118.5     126.3     117.1
>
>
>
>
> Note: These numbers don't tell the entire bandwidth story -
> the cache hierarchy, latency, read and write bandwidth at
> each level, not to mention MP performance, cache-coherency,
> prefetch, multiple outstanding transactions, etc. etc. etc.
> are enough to write a (large) book about.
>
> However, my experience is that many applications are sensitive
> to bandwidth and it is worth a little effort to get the most
> out of the CPU.
>
>
> --
>  Hugh LaMaster, M/S 233-21,    ASCII Email: hlamaster@mail.arc.nasa.gov
>  NASA Ames Research Center     Or:          lamaster@george.arc.nasa.gov
>  Moffett Field, CA 94035-1000  No Junkmail: USC 18 section 2701
>  Phone: 650/604-1056           Disclaimer:  Unofficial, personal *opinion*.
>
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-current" in the body of the message