Date: Mon, 2 Dec 2019 15:07:00 -0800
From: Mark Millard <marklmi@yahoo.com>
To: freebsd-arm@freebsd.org
Subject: Re: Comparing the OverDrive 1000 (A57) vs. MACCHIATObin Double Shot (A72) for buildworld and via a CPU/cache/RAM tradeoff-exploring benchmark (links corrected, again)
Message-ID: <8E3A0E01-F22D-4635-A8CF-CDB98CFF9794@yahoo.com>
In-Reply-To: <63787F5A-A3B7-434A-B594-999D95559BEE@yahoo.com>
References: <92E7B63A-E790-4815-9D91-2161A4F66B71.ref@yahoo.com> <92E7B63A-E790-4815-9D91-2161A4F66B71@yahoo.com> <5F7E7618-A503-4D16-B83C-0379F4B6327F@yahoo.com> <63787F5A-A3B7-434A-B594-999D95559BEE@yahoo.com>

[Maybe this time I'll get working links in place . . .]

On 2019-Dec-2, at 14:56, Mark Millard <marklmi@yahoo.com> wrote:

> [Just correcting the links to be to .png files
> and correcting some PowerMac11,2 related wording.]
>
> On 2019-Dec-2, at 14:15, Mark Millard <marklmi at yahoo.com> wrote:
>
>> It looks like the OverDrive 1000 vs. MACCHIATObin Double
>> Shot comparison ends up being an example of memory
>> access making the difference for the specific workload:
>> -j4 buildworld for head -r355027 (building itself
>> from scratch).
>>
>> buildworld times (not needing an llvm bootstrap build):
>>
>> OverDrive 1000:           13895 sec (about 3.86 hrs)
>> MACCHIATObin Double Shot: 16561 sec (about 4.60 hrs)
>>
>> So a little under 45 min difference when the mean
>> and geometric mean are both a little over 4.2 hrs.
>>
>> SSD ufs file systems: One with Samsung 860 Pro, the
>> other with Samsung 850 Pro. I do not expect that I/O
>> made much of a difference, but I did nothing to measure
>> such for the buildworld activity.
>>
>> OverDrive RAM:    8GiByte, half in each of the 2 slots
>> MACCHIATObin RAM: 16GiByte, all in its 1 slot.
>>
>> MACCHIATObin: jumpers set for the fastest CPU/RAM
>> speed for the Double Shot.
>>
>> A comparison graph from exploring single threaded
>> and multi-threaded CPU/cache and RAM limited
>> performance (a variation on the old HINT serial
>> and pthread benchmarks) is shown at:

Corrected link (2nd try):

https://github.com/markmi/acpphint/blob/master/acpphint_example_data/acpphint-OverDrive_1000_MacchDblShot-threads_4-LP64-g%2B%2B_9_O3-libc%2B%2B-DSIZE_large_fast_types-RAM.png

>> There are curves for various involved types:
>> double (d), unsigned long long (ull), unsigned
>> long (ul), unsigned int (ui). The match for
>> ull and ul for the context provides some
>> evidence of the variability observed.
>>
>> (The OverDrive and MACCHIATObin were not benchmarked
>> for the graph at the same version of head: -r352341
>> based vs. -r355027 based.)
>>
>> (I did not set things such that the benchmark run
>> would explore paging getting involved. Thus there
>> is basically no I/O considered in the comparison
>> graph.)
>>
>> The MACCHIATObin clearly wins single threaded and
>> its memory subsystem was well matched to the single
>> threaded use when the same involved types are
>> compared. (Single threaded are the blueish curves,
>> MACCHIATObin having the lighter colors.)
>>
>> For multi-threaded in the range where RAM access
>> limits things, the two systems are a close match.
>> (Greenish colors, right side of plot, upper
>> curves.)
>>
>> The range where the OverDrive 1000 is clearly faster
>> is part of the middle of the multi-threaded curves.
>> (This might be tied to whatever is done with the
>> dual RAM slot structure or to the amount of caching,
>> or some such, I do not know the details.)
>>
>> I would expect "-j1 buildworld" would take less time
>> on the MACCHIATObin than on the OverDrive, but I'm
>> not planning on measuring that.
>>
>>
>>
>> A more historical comparison, old PowerMac11,2
>> (2 sockets, 2 cores each) vs. the MACCHIATObin,
>> both having 16 GiBytes of RAM:
>>
>> For analogous benchmark graphs (matching types),
>> the MACCHIATObin single threaded is faster than
>> the old PowerMac11,2 single threaded and also is
>> usually faster than that 11,2's multi-threaded
>> benchmark data as well.
>
> I should have pointed out that the MACCHIATObin
> single threaded and PowerMac11,2 multi-threaded
> results are similar where memory access limits
> things, with use of double (d) being a little
> slower on the MACCHIATObin in this region.
>
>> Multi-threaded, the
>> MACCHIATObin is faster for the exploration by
>> the benchmark.
>

Corrected link (2nd try):

https://github.com/markmi/acpphint/blob/master/acpphint_example_data/acpphint-MacchDblShot_PowerMac11%2C2-threads_4-LP64-g%2B%2B_9_O3-libc%2B%2B-DSIZE_large_fast_types-RAM.png

>> I expect that this is interesting for the likely
>> difference in power usage during the benchmarking.
>> (Not that I've measured the power usage.)
>>
>> (The FreeBSD head vintages are not the same in
>> the graph: -r355027 based vs. -r352341 based.)
>>

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
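
The buildworld figures quoted in the message (hours, the difference in minutes, and the two means) can be rechecked with a few lines of arithmetic. This is only a sketch in Python: the two times in seconds come from the message, while the dictionary and the `summarize` helper are names made up here for illustration and are not part of acpphint or any FreeBSD tooling.

```python
import math

# Wall-clock -j4 buildworld times quoted in the message, in seconds.
times_sec = {
    "OverDrive 1000": 13895,
    "MACCHIATObin Double Shot": 16561,
}


def summarize(times):
    """Derive hours, difference, and both means (hypothetical helper)."""
    a, b = times.values()
    return {
        # Per-machine time converted from seconds to hours.
        "hours": {name: sec / 3600 for name, sec in times.items()},
        # Absolute difference between the two runs, in minutes.
        "diff_min": abs(a - b) / 60,
        # Arithmetic mean of the two times, in hours.
        "mean_hr": (a + b) / 2 / 3600,
        # Geometric mean of the two times, in hours.
        "geomean_hr": math.sqrt(a * b) / 3600,
    }


summary = summarize(times_sec)

for name, hrs in summary["hours"].items():
    print(f"{name}: {hrs:.2f} hrs")
print(f"difference: {summary['diff_min']:.1f} min")
print(f"mean: {summary['mean_hr']:.2f} hrs")
print(f"geometric mean: {summary['geomean_hr']:.2f} hrs")
```

With only two data points the arithmetic and geometric means are necessarily close, which is why the message can describe both as a little over 4.2 hrs.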