Date: Thu, 18 Mar 2021 13:01:06 -0700
From: Mark Millard <marklmi@yahoo.com>
To: bob prohaska <fbsd@www.zefox.net>
Cc: freebsd-arm@freebsd.org
Subject: Re: RPI4 clock speeds and serial port
Message-ID: <E4CF6642-CB70-4495-A865-05469953561C@yahoo.com>
In-Reply-To: <9FFA0A51-C0B7-4121-95CA-B98669809007@yahoo.com>
References: <20210318170053.GA26688@www.zefox.net> <9FFA0A51-C0B7-4121-95CA-B98669809007@yahoo.com>

On 2021-Mar-18, at 12:05, Mark Millard <marklmi@yahoo.com> wrote:

> On 2021-Mar-18, at 10:00, bob prohaska <fbsd at www.zefox.net> wrote:
>
>>> Having now got an 8GB Pi4 to use for FreeBSD the machine seems to
>>> be relatively slow at buildworld, taking some 18 hours for a clean
>>> start buildworld. Sysctl -a | grep freq reports in part:
>>>
>>> ....
>>>
>>> hw.cpufreq.turbo: 0
>>> hw.cpufreq.sdram_freq: 400000000
>>> hw.cpufreq.core_freq: 200000000
>>> hw.cpufreq.arm_freq: 600000000
>>> hw.clock.108MHz-clock.frequency: 0
>>> hw.clock.27MHz-clock.frequency: 0
>>> hw.clock.otg.frequency: 0
>>> hw.clock.osc.frequency: 0
>>> dev.iicbus.0.frequency: 100000
>>> dev.cpufreq.0.freq_driver: bcm2835_cpufreq0
>>> dev.cpufreq.0.%parent: cpu0
>>> dev.cpufreq.0.%pnpinfo:
>>> dev.cpufreq.0.%location:
>>> dev.cpufreq.0.%driver: cpufreq
>>> dev.cpufreq.0.%desc:
>>> dev.cpufreq.%parent:
>>> dev.bcm2835_cpufreq.0.freq_settings: 1500/-1 600/-1
>>> dev.bcm2835_cpufreq.0.%parent: cpu0
>>> dev.bcm2835_cpufreq.0.%pnpinfo:
>>> dev.bcm2835_cpufreq.0.%location:
>>> dev.bcm2835_cpufreq.0.%driver: bcm2835_cpufreq
>>> dev.bcm2835_cpufreq.0.%desc: CPU Frequency Control
>>> dev.bcm2835_cpufreq.%parent:
>>> dev.cpu.0.freq_levels: 1500/-1 600/-1
>>> dev.cpu.0.freq: 600
>>> dev.iichb.0.frequency: 100000
>>>
>>> /boot/msdos/config.txt contains
>>>
>>> root@nemesis:~ # more /boot/msdos/config.txt
>>> [all]
>>> arm_64bit=1
>>> dtparam=audio=on,i2c_arm=on,spi=on
>>> dtoverlay=mmc
>>> dtoverlay=disable-bt
>>> device_tree_address=0x4000
>>> kernel=u-boot.bin
>>>
>>> [pi4]
>>> #hdmi_safe=1
>>> armstub=armstub8-gic.bin
>>>
>>> There is no /boot/msdos/cmdline.txt file.
>>>
>>> Can one change the cpu speed without disturbing the serial console
>>> by using something like
>>>
>>> arm_freq=1750
>>>
>>> in config.txt, provided adequate cooling provisions are made?
>>>
>>> I'd rather not complicate use of the serial console at this point.
>>
>
> I've never had the CPU clock rate or memory clock rate
> mess with the serial console behavior.

I forgot to mention that the claim is for the context:

dtoverlay=disable-bt

or:

dtoverlay=miniuart-bt

These avoid the miniUART being used for the serial
console. The miniUART is cpu/gpu speed dependent.

> You are not clear if your build was one that
> reported messages like:
>
> make[1]: "/usr/fbsd/mm-src/Makefile.inc1" line 339: SYSTEM_COMPILER: Determined that CC=cc matches the source tree. Not bootstrapping a cross-compiler.
> make[1]: "/usr/fbsd/mm-src/Makefile.inc1" line 344: SYSTEM_LINKER: Determined that LD=ld matches the source tree. Not bootstrapping a cross-linker.
>
> The builds will take much longer when the bootstrap
> compiler and linker are also built.
>
>
>
> I presume u-boot style booting below, not UEFI/ACPI.
>
> Overall to cut buildworld buildkernel
> times I've controlled:
>
> voltage and current (power)
> cooling
> cpu clock rate
> ram clock rate
> the code generation's tuning
> system built targeting non-debug buildworld buildkernel
> (but I still cause -g to be in use)
>
> The details follow.
>
> I use:
>
> # more /boot/efi/config.txt
> arm_64bit=1
> enable_uart=1
> uart_2ndstage=1
> dtdebug=1
> disable_commandline_tags=1
> disable_overscan=1
> device_tree_address=0x4000
> dtoverlay=disable-bt
> dtoverlay=mmc
> armstub=armstub8-gic.bin
> kernel=u-boot.bin
> gpu_mem_1024=32
> # For speeding things up:
> over_voltage=6
> arm_freq=2000
> arm_freq_min=2000
> sdram_freq_min=3200
>
> The last 4 lines are tied to the use of
> faster clocking. In my context, an
> over_voltage use is required. I've
> not tried to figure out the minimal
> (low margin) change, just a sufficient
> one.
>
> Note that the above also controls the
> RAM speed, not just the CPU speed.
> The difference was rather significant for
> my buildworld buildkernel timing
> experiments, even with the cpu frequency
> already controlled to be 2000 MHz.
>
> Your:
>
>>> dev.cpu.0.freq_levels: 1500/-1 600/-1
>
> It is not a list of the possible clock speeds.
> It is just a list of which ones are compatible
> with the arm_freq that you assign (or the
> default if unassigned). In other words, for
> example, using arm_freq=2000 will lead to
> seeing a different list of levels.
>
> I'll note that every one of the 6 RPi4B that
> I've had access to has operated at
> arm_freq=2000 just fine when properly configured.
> I run them all that way.
>
> I also use in /etc/sysctl.conf :
>
> # The u-boot'ed RPi4B does not seem to automatically
> # adjust from 600MHz, so do so manually. Presumes
> # config.txt does over_voltage=6 and arm_freq=2000 .
> # NOTE: without an appropriate over_voltage a
> # dev.cpu.0.freq=2000 will crash the RPi4B on the
> # spot.
> dev.cpu.0.freq=2000
>
> (So I do not use powerd .) This goes along with
> the config.txt having "arm_freq_min=2000". So
> in my context, the RPi4B's run at a basically
> constant speed (cpu and ram).
>
> I use /etc/sysctl.conf above because:
>
> # sysctl -aW | grep freq | more
> kern.acct_chkfreq: 15
> debug.cpufreq.verbose: 0
> debug.cpufreq.lowest: 0
> hw.cpufreq.voltage_sdram_p: 1100000
> hw.cpufreq.voltage_sdram_i: 1100000
> hw.cpufreq.voltage_sdram_c: 1100000
> hw.cpufreq.voltage_core: 1000000
> hw.cpufreq.turbo: 0
> hw.cpufreq.sdram_freq: -1094967296
> hw.cpufreq.core_freq: 200000000
> hw.cpufreq.arm_freq: 2000000000
> dev.cpu.0.freq: 2000
>
> but:
>
> # sysctl -aT | grep freq | more
> debug.cpufreq.verbose: 0
> debug.cpufreq.lowest: 0
> debug.uart_poll_freq: 50
>
> so the setting would not work in
> /boot/loader.conf . (-aT shows what
> loader.conf can set and -aW shows
> what /etc/sysctl.conf can set. Some
> things are listed for both overall,
> others are not.)
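
The negative hw.cpufreq.sdram_freq figure in the sysctl -aW listing
above can be sanity-checked with a few lines of Python. This is only
a sketch: it assumes the value is a 32-bit quantity that is merely
printed as signed (the 32-bit width is my assumption, the listing
does not state it), and the helper name as_unsigned32 is made up for
illustration.

```python
def as_unsigned32(value: int) -> int:
    """Reinterpret a number printed as signed 32-bit as unsigned 32-bit."""
    # Masking with 2**32 - 1 maps negative values back into 0..2**32-1
    # and leaves values that were already non-negative unchanged.
    return value & 0xFFFFFFFF


# hw.cpufreq.sdram_freq exactly as printed by sysctl in the listing.
reported = -1094967296

hz = as_unsigned32(reported)
print(hz, "Hz =", hz // 1_000_000, "MHz")  # 3200000000 Hz = 3200 MHz
```

Under that assumption the reading is 3200 MHz, which agrees with the
sdram_freq_min=3200 line in config.txt.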
>
> (Too bad hw.cpufreq.sdram_freq is displayed
> as a signed value: it really is not.)
>
> The RPi4B's that I have access to all have
> heatsinks and are actively cooled (fans).
> (The detailed case styles vary significantly
> but all are effectively cooled.)
>
> I use CanaKit USB-C 5.1V 3.5A power supplies
> and use a USB3 SSD plugged in directly with
> no external power. The 3.5A is somewhat
> more than the standard RPi* power supply
> has for the RPi4B (3.1A) and I wanted the
> margin for the USB3 SSD use.
>
> I do one more thing to speed up operation
> of the RPI4B, but it is in how code is
> generated by buildworld buildkernel. My
> equivalent of src.conf uses:
>
> # more ~/src.configs/src.conf.cortexA72-clang-bootstrap.aarch64-host
> . . .
> #
> # Use of the .clang 's here avoids
> # interfering with other C<?>FLAGS
> # usage, such as ?= usage.
> CFLAGS.clang+= -mcpu=cortex-a72
> CXXFLAGS.clang+= -mcpu=cortex-a72
> CPPFLAGS.clang+= -mcpu=cortex-a72
> ACFLAGS.arm64cpuid.S+= -mcpu=cortex-a72+crypto
> ACFLAGS.aesv8-armx.S+= -mcpu=cortex-a72+crypto
> ACFLAGS.ghashv8-armx.S+= -mcpu=cortex-a72+crypto
>
> This controls the tuning used in the code
> generation for buildworld and buildkernel.
> Once the resulting system was in use, this
> also had a significant effect on the later
> build times for buildworld buildkernel .
>
> There can be a large difference between a
> cortex-a53's strictly in-order execution
> and the cortex-a72's out of order allowed
> execution. (The same architecture vintage
> allows for both styles.) I expect that
> this difference is why the tuning mattered
> so much: arranging for out of order to be
> an advantage.
>
> I use -mcpu because it sets both the -march
> and the -mtune to match the cpu type and
> is shorter than listing both explicitly.
>
> The kernel configuration is shown below.
>
> # more sys/arm64/conf/GENERIC-NODBG
> #
> # GENERIC -- Custom configuration for the arm64/aarch64
> #
>
> include "GENERIC"
>
> ident GENERIC-NODBG
>
> makeoptions DEBUG=-g                  # Build kernel with gdb(1) debug symbols
>
> options ALT_BREAK_TO_DEBUGGER
>
> options KDB                           # Enable kernel debugger support
>
> # For minimum debugger support (stable branch) use:
> #options KDB_TRACE                    # Print a stack trace for a panic
> options DDB                           # Enable the kernel debugger
>
> # Extra stuff:
> #options VERBOSE_SYSINIT=0            # Enable verbose sysinit messages
> #options BOOTVERBOSE=1
> #options BOOTHOWTO=RB_VERBOSE
> #options KTR
> #options KTR_MASK=KTR_TRAP
> ##options KTR_CPUMASK=0xF
> #options KTR_VERBOSE
>
> # Disable any extra checking for. . .
> nooptions DEADLKRES                   # Enable the deadlock resolver
> nooptions INVARIANTS                  # Enable calls of extra sanity checking
> nooptions INVARIANT_SUPPORT           # Extra sanity checks of internal structures, required by INVARIANTS
> nooptions WITNESS                     # Enable checks to detect deadlocks and cycles
> nooptions WITNESS_SKIPSPIN            # Don't run witness on spinlocks for speed
> nooptions DIAGNOSTIC
> nooptions MALLOC_DEBUG_MAXZONES       # Separate malloc(9) zones
> nooptions BUF_TRACKING
> nooptions FULL_BUF_TRACKING
>
>
>
> Just FYI:
>
> # sysctl -a | grep freq | more
> kern.timecounter.tc.ARM MPCore Timecounter.frequency: 54000000
> kern.eventtimer.et.ARM MPCore Eventtimer.frequency: 54000000
> kern.acct_chkfreq: 15
> debug.cpufreq.verbose: 0
> debug.cpufreq.lowest: 0
> debug.uart_poll_freq: 50
> hw.cpufreq.temperature: 34525
> hw.cpufreq.voltage_sdram_p: 1100000
> hw.cpufreq.voltage_sdram_i: 1100000
> hw.cpufreq.voltage_sdram_c: 1100000
> hw.cpufreq.voltage_core: 1000000
> hw.cpufreq.turbo: 0
> hw.cpufreq.sdram_freq: -1094967296
> hw.cpufreq.core_freq: 200000000
> hw.cpufreq.arm_freq: 2000000000
> hw.clock.108MHz-clock.frequency: 0
> hw.clock.27MHz-clock.frequency: 0
> hw.clock.otg.frequency: 0
> hw.clock.osc.frequency: 0
> dev.cpufreq.0.freq_driver: bcm2835_cpufreq0
> dev.cpufreq.0.%parent: cpu0
> dev.cpufreq.0.%pnpinfo:
> dev.cpufreq.0.%location:
> dev.cpufreq.0.%driver: cpufreq
> dev.cpufreq.0.%desc:
> dev.cpufreq.%parent:
> dev.bcm2835_cpufreq.0.freq_settings: 2000/-1 600/-1
> dev.bcm2835_cpufreq.0.%parent: cpu0
> dev.bcm2835_cpufreq.0.%pnpinfo:
> dev.bcm2835_cpufreq.0.%location:
> dev.bcm2835_cpufreq.0.%driver: bcm2835_cpufreq
> dev.bcm2835_cpufreq.0.%desc: CPU Frequency Control
> dev.bcm2835_cpufreq.%parent:
> dev.cpu.0.freq_levels: 2000/-1 600/-1
> dev.cpu.0.freq: 2000
>
> (Again: Too bad hw.cpufreq.sdram_freq is
> displayed as a signed value: it really
> is not signed.)
>
> FYI: The system is based on main 7381bbee29df
> (2021-03-12), the build that fixed the
>
> # ~/fbsd-based-on-what-freebsd-main.sh
> merge-base: 7381bbee29df959e88ec59866cf2878263e7f3b2
> merge-base: CommitDate: 2021-03-12 20:29:42 +0000
> def0058cc690 (HEAD -> mm-src) mm-src snapshot for mm's patched build in git context.
> 7381bbee29df (freebsd/main, freebsd/HEAD, pure-src, main) cam: Run all XPT_ASYNC ccbs in a dedicated thread
> FreeBSD RPi4B 14.0-CURRENT FreeBSD 14.0-CURRENT mm-src-n245445-def0058cc690 GENERIC-NODBG arm64 aarch64 1400005 1400005

I've not done temperature testing in a while. One quick
report I can make is that for my "always cpu_freq 2000
always sdram_freq 3200" contexts with the RPI4B only
doing whatever background tasks are involved in being
basically idle but having the heatsinks and active
cooling (fans):

# sysctl -a | grep temper
hw.cpufreq.temperature: 35012
dev.cpu.0.temperature: 35.0C

in a currently 15.5C or so ambient context is rather
typical.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)