Date: Tue, 22 Sep 2020 20:16:55 -0700
From: Mark Millard <marklmi@yahoo.com>
To: freebsd-arm <freebsd-arm@freebsd.org>
Subject: Re: RPi4B buildworld buildkernel times for already installed system being -mcpu=cortex-a72 vs. -mcpu=cortex-a53 based
Message-ID: <DC890C6C-67BE-4E93-8BEC-81102028C142@yahoo.com>
In-Reply-To: <A59A0896-5A24-4C05-AD15-DCC27F25B927@yahoo.com>
References: <4E155E94-3AA0-464D-A1E9-45A7827537ED@yahoo.com> <A59A0896-5A24-4C05-AD15-DCC27F25B927@yahoo.com>
On 2020-Sep-20, at 18:40, Mark Millard <marklmi at yahoo.com> wrote:

> On 2020-Sep-20, at 18:32, Mark Millard <marklmi at yahoo.com> wrote:
>
>> The following are from scratch buildworld buildkernel rebuilds
>> on a RPi4B (head -r363590 context).
>>
>> ENVIRONMENT: -mcpu=cortex-a72 based world and kernel running already, RPi4B @ 2G Hz,
>> Restricted to 3 GiByte RAM, -j3:
>>
>> World built in 37469 seconds, ncpu: 4, make -j3
>> Kernel(s) GENERIC-NODBG built in 2474 seconds, ncpu: 4, make -j3
>>
>> ENVIRONMENT: -mcpu=cortex-a53 based kernel running, RPi4B @ 2G Hz,
>> Restricted to 3 GiByte RAM, -j3:
>>
>> World built in 44034 seconds, ncpu: 4, make -j3
>> Kernel(s) GENERIC-NODBG built in 2895 seconds, ncpu: 4, make -j3
>>
>> So a little under 11.1 hr total vs. a little over 13.0 hr total,
>> a somewhat over 50 min improvement.
>
> "a somewhat over 1hr 50 min improvement" is what I should have
> managed to type.

Some experiments indicate that the faster result may depend
heavily on the optimization level in use (clang 10 -O vs.
clang 11 -O vs. -O2) as well as on the use of
-mcpu=cortex-a72 . Jumping from clang 10 -O to clang 11 -O2
for the FreeBSD kernel build in use looks like it might
revert to something more like the older times for
buildworld buildkernel. (clang 11 -O is -O1 instead of the
historical -O2 .) But I've not rerun any build tests to
know for sure.

clang 11's use of -O meaning -O1 is causing FreeBSD kernel
build problems when DEBUG is defined, from lack of inlining
in some environments. FreeBSD may switch to using -O2
explicitly for all platforms. (I build non-debug [no witness
and such] with DEBUG=-g forced. My context currently forces
-O2 because powerpc64 has the inlining problem and I'm
checking whether the uniform setting works across
everything I have access to.)
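As a cross-check of the totals and of the corrected "1hr 50 min"
figure, the arithmetic can be redone from the four quoted build
times. This is only a minimal sketch: the four numbers come from
the quoted report, and all variable and function names are
illustrative rather than taken from any FreeBSD tool.

```python
# Build times in seconds, copied from the quoted buildworld/buildkernel report.
builds = {
    "cortex-a72": {"world": 37469, "kernel": 2474},
    "cortex-a53": {"world": 44034, "kernel": 2895},
}


def total_seconds(times):
    """Sum the world and kernel build times for one environment."""
    return times["world"] + times["kernel"]


a72_total = total_seconds(builds["cortex-a72"])
a53_total = total_seconds(builds["cortex-a53"])

# Time saved by the -mcpu=cortex-a72 based environment.
saved = a53_total - a72_total
saved_h, remainder = divmod(saved, 3600)
saved_min, saved_s = divmod(remainder, 60)

print(f"cortex-a72 total: {a72_total} s = {a72_total / 3600:.2f} hr")
print(f"cortex-a53 total: {a53_total} s = {a53_total / 3600:.2f} hr")
print(f"saved: {saved} s = {saved_h} hr {saved_min} min {saved_s} s")
```

The difference comes out at 6986 seconds, a bit under 2 hours,
which matches the corrected wording rather than the original
"50 min".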
>> (A xhci patch finally allowed me to boot -mcpu=cortex-a72
>> based kernel builds on the RPi4B: The xhci event ring
>> initialization code was missing a usb_bus_mem_flush_all
>> call previously.)
>>
>>
>> Supporting details:
>>
>> (e-mail based spacing changes expected below)
>>
>> # diff -u ~/src.configs/src.conf.cortexA72-clang-bootstrap.aarch64-host ~/src.configs/src.conf.cortexA53-clang-bootstrap.aarch64-host
>> --- /root/src.configs/src.conf.cortexA72-clang-bootstrap.aarch64-host 2020-03-13 22:29:25.470155000 -0700
>> +++ /root/src.configs/src.conf.cortexA53-clang-bootstrap.aarch64-host 2020-03-13 22:29:25.469455000 -0700
>> @@ -49,9 +49,9 @@
>> # Use of the .clang 's here avoids
>> # interfering with other C<?>FLAGS
>> # usage, such as ?= usage.
>> -CFLAGS.clang+= -mcpu=cortex-a72
>> -CXXFLAGS.clang+= -mcpu=cortex-a72
>> -CPPFLAGS.clang+= -mcpu=cortex-a72
>> -ACFLAGS.arm64cpuid.S+= -mcpu=cortex-a72+crypto
>> -ACFLAGS.aesv8-armx.S+= -mcpu=cortex-a72+crypto
>> -ACFLAGS.ghashv8-armx.S+= -mcpu=cortex-a72+crypto
>> +CFLAGS.clang+= -mcpu=cortex-a53
>> +CXXFLAGS.clang+= -mcpu=cortex-a53
>> +CPPFLAGS.clang+= -mcpu=cortex-a53
>> +ACFLAGS.arm64cpuid.S+= -mcpu=cortex-a53+crypto
>> +ACFLAGS.aesv8-armx.S+= -mcpu=cortex-a53+crypto
>> +ACFLAGS.ghashv8-armx.S+= -mcpu=cortex-a53+crypto
>>
>>
>> The .amd64-host files are similar for doing cross builds.
>>
>> I also use += in secure/lib/libcrypto/Makefile :
>>
>> # svnlite diff /usr/src/secure/lib/libcrypto/Makefile
>> Index: /usr/src/secure/lib/libcrypto/Makefile
>> ===================================================================
>> --- /usr/src/secure/lib/libcrypto/Makefile (revision 365919)
>> +++ /usr/src/secure/lib/libcrypto/Makefile (working copy)
>> @@ -20,7 +20,7 @@
>> SRCS+= o_str.c o_time.c threads_pthread.c uid.c
>> .if defined(ASM_aarch64)
>> SRCS+= arm64cpuid.S armcap.c
>> -ACFLAGS.arm64cpuid.S= -march=armv8-a+crypto
>> +ACFLAGS.arm64cpuid.S+= -march=armv8-a+crypto
>> .elif defined(ASM_amd64)
>> SRCS+= x86_64cpuid.S
>> .elif defined(ASM_arm)
>> @@ -35,7 +35,7 @@
>> SRCS+= aes_cbc.c aes_cfb.c aes_ecb.c aes_ige.c aes_misc.c aes_ofb.c aes_wrap.c
>> .if defined(ASM_aarch64)
>> SRCS+= aes_core.c aesv8-armx.S vpaes-armv8.S
>> -ACFLAGS.aesv8-armx.S= -march=armv8-a+crypto
>> +ACFLAGS.aesv8-armx.S+= -march=armv8-a+crypto
>> .elif defined(ASM_amd64)
>> SRCS+= aes_core.c aesni-mb-x86_64.S aesni-sha1-x86_64.S aesni-sha256-x86_64.S
>> SRCS+= aesni-x86_64.S vpaes-x86_64.S
>> @@ -242,7 +242,7 @@
>> SRCS+= ofb128.c wrap128.c xts128.c
>> .if defined(ASM_aarch64)
>> SRCS+= ghashv8-armx.S
>> -ACFLAGS.ghashv8-armx.S= -march=armv8-a+crypto
>> +ACFLAGS.ghashv8-armx.S+= -march=armv8-a+crypto
>> .elif defined(ASM_amd64)
>> SRCS+= aesni-gcm-x86_64.S ghash-x86_64.S
>> .elif defined(ASM_arm)
>>
>> The RPi4B is using:
>>
>> over_voltage=6
>> arm_freq=2000
>>
>> and was booted via uefi/ACPI.
>>
>> I have not repeated the -j4 or other -jN comparisons that
>> I reported in the past. The -mcpu=cortex-a53 figures are
>> from the past.
>

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)