From: Mark Millard
To: freebsd-arm
Subject: Re: RPi4B buildworld buildkernel times for already installed system being -mcpu=cortex-a72 vs. -mcpu=cortex-a53 based
Date: Mon, 28 Sep 2020 08:36:40 -0700
Message-Id: <9CF3675E-072B-4845-A510-691508DCEF3C@yahoo.com>
References: <4E155E94-3AA0-464D-A1E9-45A7827537ED@yahoo.com>

[Turns out, when sdram_freq_min=3200 is effective, -j4 builds are
faster than -j3 builds by about an hour (holding other configuration
conditions constant).]

On 2020-Sep-27, at 11:07, Mark Millard wrote:

> On 2020-Sep-20, at 18:40, Mark Millard wrote:
>
>> On 2020-Sep-20, at 18:32, Mark Millard wrote:
>>
>>> The following are from scratch buildworld buildkernel rebuilds
>>> on a RPi4B (head -r363590 context).
>>>
>>> ENVIRONMENT: -mcpu=cortex-a72 based world and kernel running already, RPi4B @ 2G Hz,
>>> Restricted to 3 GiByte RAM, -j3:
>>>
>>> World built in 37469 seconds, ncpu: 4, make -j3
>>> Kernel(s) GENERIC-NODBG built in 2474 seconds, ncpu: 4, make -j3
>>>
>>> ENVIRONMENT: -mcpu=cortex-a53 based kernel running, RPi4B @ 2G Hz,
>>> Restricted to 3 GiByte RAM, -j3:
>>>
>>> World built in 44034 seconds, ncpu: 4, make -j3
>>> Kernel(s) GENERIC-NODBG built in 2895 seconds, ncpu: 4, make -j3
>>>
>>> So a little under 11.1 hr total vs. a little over 13.0 hr total,
>>> a somewhat over 50 min improvement.
>>
>> "a somewhat over 1hr 50 min improvement" is what I should have
>> managed to type.
>>
>>> (A xhci patch finally allowed me to boot -mcpu=cortex-a72
>>> based kernel builds on the RPi4B: The xhci event ring
>>> initialization code was missing a usb_bus_mem_flush_all
>>> call previously.)
>>>
>>>
>>> Supporting details:
>>>
>>> (e-mail based spacing changes expected below)
>>>
>>> # diff -u ~/src.configs/src.conf.cortexA72-clang-bootstrap.aarch64-host ~/src.configs/src.conf.cortexA53-clang-bootstrap.aarch64-host
>>> --- /root/src.configs/src.conf.cortexA72-clang-bootstrap.aarch64-host 2020-03-13 22:29:25.470155000 -0700
>>> +++ /root/src.configs/src.conf.cortexA53-clang-bootstrap.aarch64-host 2020-03-13 22:29:25.469455000 -0700
>>> @@ -49,9 +49,9 @@
>>> # Use of the .clang 's here avoids
>>> # interfering with other CFLAGS
>>> # usage, such as ?= usage.
>>> -CFLAGS.clang+= -mcpu=cortex-a72
>>> -CXXFLAGS.clang+= -mcpu=cortex-a72
>>> -CPPFLAGS.clang+= -mcpu=cortex-a72
>>> -ACFLAGS.arm64cpuid.S+= -mcpu=cortex-a72+crypto
>>> -ACFLAGS.aesv8-armx.S+= -mcpu=cortex-a72+crypto
>>> -ACFLAGS.ghashv8-armx.S+= -mcpu=cortex-a72+crypto
>>> +CFLAGS.clang+= -mcpu=cortex-a53
>>> +CXXFLAGS.clang+= -mcpu=cortex-a53
>>> +CPPFLAGS.clang+= -mcpu=cortex-a53
>>> +ACFLAGS.arm64cpuid.S+= -mcpu=cortex-a53+crypto
>>> +ACFLAGS.aesv8-armx.S+= -mcpu=cortex-a53+crypto
>>> +ACFLAGS.ghashv8-armx.S+= -mcpu=cortex-a53+crypto
>>>
>>>
>>> The .amd64-host files are similar for doing cross builds.
>>>
>>> I also use += in secure/lib/libcrypto/Makefile :
>>>
>>> # svnlite diff /usr/src/secure/lib/libcrypto/Makefile
>>> Index: /usr/src/secure/lib/libcrypto/Makefile
>>> ===================================================================
>>> --- /usr/src/secure/lib/libcrypto/Makefile (revision 365919)
>>> +++ /usr/src/secure/lib/libcrypto/Makefile (working copy)
>>> @@ -20,7 +20,7 @@
>>> SRCS+= o_str.c o_time.c threads_pthread.c uid.c
>>> .if defined(ASM_aarch64)
>>> SRCS+= arm64cpuid.S armcap.c
>>> -ACFLAGS.arm64cpuid.S= -march=armv8-a+crypto
>>> +ACFLAGS.arm64cpuid.S+= -march=armv8-a+crypto
>>> .elif defined(ASM_amd64)
>>> SRCS+= x86_64cpuid.S
>>> .elif defined(ASM_arm)
>>> @@ -35,7 +35,7 @@
>>> SRCS+= aes_cbc.c aes_cfb.c aes_ecb.c aes_ige.c aes_misc.c aes_ofb.c aes_wrap.c
>>> .if defined(ASM_aarch64)
>>> SRCS+= aes_core.c aesv8-armx.S vpaes-armv8.S
>>> -ACFLAGS.aesv8-armx.S= -march=armv8-a+crypto
>>> +ACFLAGS.aesv8-armx.S+= -march=armv8-a+crypto
>>> .elif defined(ASM_amd64)
>>> SRCS+= aes_core.c aesni-mb-x86_64.S aesni-sha1-x86_64.S aesni-sha256-x86_64.S
>>> SRCS+= aesni-x86_64.S vpaes-x86_64.S
>>> @@ -242,7 +242,7 @@
>>> SRCS+= ofb128.c wrap128.c xts128.c
>>> .if defined(ASM_aarch64)
>>> SRCS+= ghashv8-armx.S
>>> -ACFLAGS.ghashv8-armx.S= -march=armv8-a+crypto
>>> +ACFLAGS.ghashv8-armx.S+= -march=armv8-a+crypto
>>> .elif defined(ASM_amd64)
>>> SRCS+= aesni-gcm-x86_64.S ghash-x86_64.S
>>> .elif defined(ASM_arm)
>>>
>>> The RPi4B is using:
>>>
>>> over_voltage=6
>>> arm_freq=2000
>>>
>>> and was booted via uefi/ACPI.
>>>
>>> I have not repeated the -j4 or other -jN comparisons that
>>> I reported in the past. The -mcpu=cortex-a53 figures are
>>> from the past.
>
> The following new timing is based on head -r365932 rebuilding
> itself where the 8 GiByte RPi4B config.txt ended with:
>
> over_voltage=6
> arm_freq=2000
> sdram_freq_min=3200
>
> and the boot was via u-boot, no RAM restriction. (The
> sdram_freq_min assignment does not seem to do anything
> for rpi4-uefi-devel v1.20 uefi/ACPI based booting.)
> /etc/sysctl.conf has: dev.cpu.0.freq=2000 . No use of
> powerd or other such.
>
>
> ENVIRONMENT: -mcpu=cortex-a72 based world and kernel running already,
> 8 GiByte RPi4B @ 2G Hz with sdram_freq_min=3200, u-boot style boot, -j3:
>
> World built in 31852 seconds, ncpu: 4, make -j3
> Kernel(s) GENERIC-NODBG built in 2059 seconds, ncpu: 4, make -j3
>
> So somewhat under 9.5 hr overall.
>
>
> That means somewhat over 3.5 hours faster than a -mcpu=cortex-a53
> based system without sdram_freq_min=3200 using 3 GiByte RAM
> but still RPi4B @ 2G Hz (uefi/ACPI boot):
>
> World built in 44034 seconds, ncpu: 4, make -j3
> Kernel(s) GENERIC-NODBG built in 2895 seconds, ncpu: 4, make -j3
>
> (Same as reported in prior messages.)
>
> But the prior -r363590 vs. the now -r365932 means there is more varying
> than in my previous comparisons. For example, clang 10 vs. clang 11.
>
> I'm probably going to run a -j4 build to see how it compares in
> this context.

ENVIRONMENT: -mcpu=cortex-a72 based world and kernel running already,
8 GiByte RPi4B dev.cpu.0.freq=2000 with sdram_freq_min=3200, u-boot style boot, -j4:

World built in 28526 seconds, ncpu: 4, make -j4
Kernel(s) GENERIC-NODBG built in 1841 seconds, ncpu: 4, make -j4

So somewhat under 8.5 hr overall.

That means somewhat over 4.5 hours faster than a -mcpu=cortex-a53
based system without sdram_freq_min=3200 using 3 GiByte RAM
but still RPi4B @ 2G Hz (uefi/ACPI boot).
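
For anyone wanting to re-check the hour figures, the following is a
minimal sketch of the arithmetic, assuming only the "World built in" and
"Kernel(s) ... built in" second counts quoted in this thread. The names
BUILDS, BASELINE, total_seconds, as_hours and report are hypothetical
and exist only for this illustration; nothing here is part of the
FreeBSD build system.

```python
# Illustrative only: the second counts are copied from the build logs
# quoted in this thread; every name below is hypothetical.

# label -> (world seconds, kernel seconds)
BUILDS = {
    "a53 -j3, 3 GiByte, uefi/ACPI": (44034, 2895),
    "a72 -j3, 3 GiByte, uefi/ACPI": (37469, 2474),
    "a72 -j3, 8 GiByte, u-boot, sdram_freq_min=3200": (31852, 2059),
    "a72 -j4, 8 GiByte, u-boot, sdram_freq_min=3200": (28526, 1841),
}

# The cortex-a53 run is the reference that the other runs are compared to.
BASELINE = "a53 -j3, 3 GiByte, uefi/ACPI"


def total_seconds(world, kernel):
    """Combined buildworld + buildkernel time in seconds."""
    return world + kernel


def as_hours(seconds):
    """Convert seconds to fractional hours."""
    return seconds / 3600.0


def report(builds, baseline):
    """Return (label, total seconds, seconds saved vs. baseline) per build."""
    base = total_seconds(*builds[baseline])
    rows = []
    for label, (world, kernel) in builds.items():
        total = total_seconds(world, kernel)
        rows.append((label, total, base - total))
    return rows


if __name__ == "__main__":
    for label, total, saved in report(BUILDS, BASELINE):
        print(f"{label}: {as_hours(total):.2f} hr total, "
              f"{as_hours(saved):.2f} hr faster than the a53 run")
```

The -j3 vs. -j4 difference mentioned at the top of this message is just
the gap between the last two totals.
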
> I've not run a default arm_freq/sdram_freq_min/dev.cpu.0.freq buildworld
> buildkernel in a long time and so do not have reasonable comparison
> figures relative to that type of context. I do not plan on such an
> experiment.
>
>
> I'll note that I run these tests with a monitor connected that sits
> with a static login prompt display after booting. I do not test
> with X11 or other use that might significantly compete for more power.
> The serial port console is usually used. I have used ssh sometimes in
> the past.
>
> ~/src.configs/src.conf.cortexA72-clang-bootstrap.aarch64-host is still
> unchanged:
>
> # more ~/src.configs/src.conf.cortexA72-clang-bootstrap.aarch64-host
> TO_TYPE=aarch64
> #
> KERNCONF=GENERIC-NODBG
> TARGET=arm64
> .if ${.MAKE.LEVEL} == 0
> TARGET_ARCH=${TO_TYPE}
> .export TARGET_ARCH
> .endif
> #
> #WITH_CROSS_COMPILER=
> WITH_SYSTEM_COMPILER=
> WITH_SYSTEM_LINKER=
> #
> WITH_LIBCPLUSPLUS=
> #WITH_LLD_BOOTSTRAP=
> WITHOUT_BINUTILS_BOOTSTRAP=
> WITH_ELFTOOLCHAIN_BOOTSTRAP=
> #Disables avoiding bootstrap: WITHOUT_LLVM_TARGET_ALL=
> WITH_LLVM_TARGET_AARCH64=
> WITH_LLVM_TARGET_ARM=
> WITHOUT_LLVM_TARGET_MIPS=
> WITHOUT_LLVM_TARGET_POWERPC=
> WITHOUT_LLVM_TARGET_RISCV=
> WITHOUT_LLVM_TARGET_X86=
> #WITH_CLANG_BOOTSTRAP=
> WITH_CLANG=
> WITH_CLANG_IS_CC=
> WITH_CLANG_FULL=
> WITH_CLANG_EXTRAS=
> WITH_LLD=
> WITH_LLD_IS_LD=
> WITHOUT_BINUTILS=
> WITH_LLDB=
> #
> WITH_BOOT=
> WITHOUT_LIB32=
> #
> #
> NO_WERROR=
> #WERROR=
> MALLOC_PRODUCTION=
> #
> # Avoid stripping but do not control host -g status as well:
> DEBUG_FLAGS+=
> #
> WITH_REPRODUCIBLE_BUILD=
> WITH_DEBUG_FILES=
> #
> # Use of the .clang 's here avoids
> # interfering with other CFLAGS
> # usage, such as ?= usage.
> CFLAGS.clang+= -mcpu=cortex-a72
> CXXFLAGS.clang+= -mcpu=cortex-a72
> CPPFLAGS.clang+= -mcpu=cortex-a72
> ACFLAGS.arm64cpuid.S+= -mcpu=cortex-a72+crypto
> ACFLAGS.aesv8-armx.S+= -mcpu=cortex-a72+crypto
> ACFLAGS.ghashv8-armx.S+= -mcpu=cortex-a72+crypto
>

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)