Date:      Sat, 29 Apr 2023 14:09:17 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        FreeBSD Hackers <freebsd-hackers@freebsd.org>
Cc:        freebsd-arm <freebsd-arm@freebsd.org>
Subject:   Re: armv8.2-A+ tuned FreeBSD kernels and buildworld buildkernel times: an example
Message-ID:  <4E8BB159-38D1-4EF0-B486-DF6C8B49D7AB@yahoo.com>
In-Reply-To: <177A2369-1751-4DB5-B316-E140ED156B6E@yahoo.com>
References:  <177A2369-1751-4DB5-B316-E140ED156B6E@yahoo.com>

On Apr 29, 2023, at 12:16, Mark Millard <marklmi@yahoo.com> wrote:

> Context: all worlds and kernels involved/built are non-debug style.
>
> Note: clang15 through LLVM main (so far) has errors in both directions
> for the features for cortex-a78c. So I also used +flagm+nofp16fml .
> (The cortex-x1c also has such problems, but the details are
> different.)
>
> Notation in table below:
> CA72:  matching world or kernel had been built using -mcpu=cortex-a72
> CA78C: matching world or kernel had been built using -mcpu=cortex-a78C+flagm+nofp16fml
>
> System: Windows Dev Kit 2023 (4 cortex-a78c's and 4 cortex-x1c's):
> (both: armv8.2-A with a few more modern features)
>
> Times to build system from scratch (buildworld buildkernel from same
> sources) . . .
>
> System running:                   World built in: kernel built in:
> CA72  kernel, CA72  world                6601 sec          597 sec
> CA78C kernel, CA78C world                4680 sec          413 sec
> CA78C kernel, CA72  world (chroot)       4715 sec          422 sec
>
> The CA72/CA72 is from before I'd built the CA78C world and kernel.
> All builds used -j8 . None had competing activity on the machine.
>
> What this suggests is having an explicitly armv8.2+ tuned kernel
> makes a notable difference for -j8 buildworld buildkernel times
> on aarch64.
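
As a rough check on the table above, here is a minimal Python sketch that computes the relative elapsed-time reductions against the CA72/CA72 baseline. It is not part of the original measurements: the names BUILD_TIMES and percent_reduction are made up for illustration, and the only inputs are the six times copied from the table.

```python
# Elapsed times in seconds, copied from the table above:
# label -> (buildworld seconds, buildkernel seconds)
BUILD_TIMES = {
    "CA72 kernel, CA72 world": (6601, 597),
    "CA78C kernel, CA78C world": (4680, 413),
    "CA78C kernel, CA72 world (chroot)": (4715, 422),
}

BASELINE = "CA72 kernel, CA72 world"


def percent_reduction(baseline, tuned):
    """Return the elapsed-time decrease as a percentage of the baseline."""
    return 100.0 * (baseline - tuned) / baseline


if __name__ == "__main__":
    base_world, base_kernel = BUILD_TIMES[BASELINE]
    for label, (world, kernel) in BUILD_TIMES.items():
        if label == BASELINE:
            continue
        print(
            f"{label}: world {percent_reduction(base_world, world):.1f}% less, "
            f"kernel {percent_reduction(base_kernel, kernel):.1f}% less"
        )
```

With these figures both CA78C-kernel rows come out at roughly 29% to 31% less elapsed time than the baseline, and the two rows differ from each other by only about 1% to 2%. That is consistent with the suggestion that the kernel tuning, not the world tuning, accounts for most of the difference.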

"Tuned" here includes newer-feature use, so incompatible with the
likes of armv8.0-A hardware, for example. The FEAT_LSE atomics
use would be an example. But I've done nothing to investigate
subsetting the new-feature use to isolate what makes the biggest
contributions to the elapsed-time decrease.

> The Windows Dev Kit 2023 is the first (and only) armv8.1+ based
> system that I've had access to. So testing such properties is
> limited to the one context.
>
> Also, I've not had access to the Windows Dev Kit 2023 for long:
> first experiments.
>
>
> Notes on my historically-usual aarch64 builds:
>
> On cortex-a72 hardware, my context is -mcpu=cortex-a72 based. This
> once exposed a lack of sufficient synchronization in a place in
> the USB subsystem. (Running the same system on cortex-a53 hardware
> did not fail. Running -mcpu=cortex-a53 based world+kernel on a
> cortex-a72 did not fail. A cortex-a53 hardware running the
> -mcpu=cortex-a53 based world+kernel did not fail.)
>
> Until the hardware failed, there was a time when I also had
> access to a cortex-a57 FreeBSD system.
>
> I do not do such -mcpu= tailoring on the only FreeBSD amd64 that
> I've access to, a ThreadRipper 1950X. I do such only for the lower
> end systems that I have access to. My aarch64 access is all to
> lower end, not upper end.


I should have reported that my recent activity for this is based
on: main-n262658-b347c2284603-dirty, b347c2284603 being from late
Apr 28, 2023 UTC. (The "-dirty" is from some historical patches
that I use.) Some of my activity has been from somewhat earlier,
but I wanted to pick up another openzfs fix or 2 that had
happened since then.


===
Mark Millard
marklmi at yahoo.com



