Date: Mon, 2 Dec 2019 14:31:36 -0800
From: Mark Millard <marklmi@yahoo.com>
To: freebsd-arm@freebsd.org
Subject: Re: Comparing the OverDrive 1000 (A57) vs. MACCHIATObin Double Shot (A72) for buildworld and via a CPU/cache/RAM tradeoff-exploring benchmark
Message-ID: <5F7E7618-A503-4D16-B83C-0379F4B6327F@yahoo.com>
In-Reply-To: <92E7B63A-E790-4815-9D91-2161A4F66B71@yahoo.com>
References: <92E7B63A-E790-4815-9D91-2161A4F66B71.ref@yahoo.com> <92E7B63A-E790-4815-9D91-2161A4F66B71@yahoo.com>

[The links I sent are to the gnuplot .gp files, not to copies
of the plots. I've not even uploaded the plots yet. (Too
distracted today, I guess.) I will later upload .png files
and resend with corrected links. Nothing new in this note
below.]

On 2019-Dec-2, at 14:15, Mark Millard <marklmi at yahoo.com> wrote:

> It looks like the OverDrive 1000 vs. MACCHIATObin Double
> Shot comparison ends up being an example of memory
> access making the difference for the specific workload:
> -j4 buildworld for head -r355027 (building itself
> from scratch).
>
> buildworld times (not needing a llvm bootstrap build):
>
> OverDrive 1000:           13895 sec (about 3.86 hrs)
> MACCHIATObin Double Shot: 16561 sec (about 4.60 hrs)
>
> So a little under 45 min difference when the mean
> and geometric mean are both a little over 4.2 hrs.
>
> SSD ufs file systems: One with Samsung 860 Pro, the
> other with Samsung 850 Pro. I do not expect that I/O
> made much of a difference, but I did nothing to measure
> it for the buildworld activity.
>
> OverDrive RAM:    8 GiBytes, half in each of the 2 slots.
> MACCHIATObin RAM: 16 GiBytes, all in its 1 slot.
>
> MACCHIATObin: jumpers set for the fastest CPU/RAM
> speed for the Double Shot.
>
> A comparison graph from exploring single threaded
> and multi-threaded CPU/cache and RAM limited
> performance (a variation on the old HINT serial
> and pthread benchmarks) is shown at:
>
> https://github.com/markmi/acpphint/blob/master/acpphint_example_data/acpphint-OverDrive_1000_MacchDblShot-threads_4-LP64-g%2B%2B_9_8.3_O3-libc%2B%2B_libstdc%2B%2B-DSIZE_large_fast_types-RAM.gp
>
> There are curves for the various involved types:
> double (d), unsigned long long (ull), unsigned
> long (ul), unsigned int (ui). The ull and ul types
> are the same size in this (LP64) context, so how
> well their curves match provides some evidence of
> the variability observed.
>
> (The OverDrive and MACCHIATObin were not benchmarked
> for the graph at the same version of head: -r352341
> based vs. -r355027 based.)
>
> (I did not set things such that the benchmark run
> would explore paging getting involved. Thus there
> is basically no I/O considered in the comparison
> graph.)
>
> The MACCHIATObin clearly wins single threaded and
> its memory subsystem was well matched to the single
> threaded use when the same involved types are
> compared. (Single threaded are the blueish curves,
> MACCHIATObin having the lighter colors.)
>
> For multi-threaded in the range where RAM access
> limits things, the two systems are a close match.
> (Greenish colors, right side of plot, upper
> curves.)
>
> The range where the OverDrive 1000 is clearly faster
> is part of the middle of the multi-threaded curves.
> (This might be tied to whatever is done with the
> dual RAM slot structure or to the amount of caching,
> or some such. I do not know the details.)
>
> I would expect "-j1 buildworld" would take less time
> on the MACCHIATObin than on the OverDrive, but I'm
> not planning on measuring that.
>
>
>
> A more historical comparison, old PowerMac11,2
> (2 sockets, 2 cores each) vs. the MACCHIATObin,
> both having 16 GiBytes of RAM:
>
> For analogous benchmark graphs (matching types),
> the MACCHIATObin single threaded is faster than
> the old PowerMac11,2 single threaded and also is
> usually faster than that 11,2's multi-threaded
> benchmark data as well. Multi-threaded, the
> MACCHIATObin is faster across the range the
> benchmark explored.
>
> https://github.com/markmi/acpphint/blob/master/acpphint_example_data/acpphint-MacchDblShot_PowerMac11%2C2-threads_4-LP64-g%2B%2B_9_O3-libc%2B%2B-DSIZE_large_fast_types-RAM.gp
>
> I expect that this is interesting for the likely
> difference in power usage during the benchmarking.
> (Not that I've measured the power usage.)
>
> (The FreeBSD head vintages are not the same in
> the graph: -r355027 based vs. -r352341 based.)

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
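
The arithmetic behind the quoted buildworld figures (hours, the difference in minutes, mean and geometric mean) can be rechecked with a minimal Python sketch like the one below. It is not part of the original message; the two times in seconds are taken from the message, while the names `times_sec`, `hours`, and `summarize` are just illustrative.

```python
import math

# buildworld wall-clock times in seconds, as reported in the quoted message
times_sec = {
    "OverDrive 1000": 13895,
    "MACCHIATObin Double Shot": 16561,
}

def hours(sec):
    """Convert seconds to hours."""
    return sec / 3600.0

def summarize(times):
    """Return (difference in minutes, mean in hours, geometric mean in hours)."""
    values = list(times.values())
    diff_min = (max(values) - min(values)) / 60.0
    mean_hrs = hours(sum(values) / len(values))
    # geometric mean computed via logs: exp of the average of the logs
    geo_mean_sec = math.exp(sum(math.log(v) for v in values) / len(values))
    return diff_min, mean_hrs, hours(geo_mean_sec)

diff_min, mean_hrs, geo_mean_hrs = summarize(times_sec)

for name, sec in times_sec.items():
    print(f"{name}: {sec} sec (about {hours(sec):.2f} hrs)")
print(f"difference: {diff_min:.1f} min")
print(f"mean: {mean_hrs:.2f} hrs, geometric mean: {geo_mean_hrs:.2f} hrs")
```

With only two data points the mean and geometric mean land close together, which is consistent with the message quoting both as a little over 4.2 hrs.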