Date: Sat, 8 Feb 2020 14:28:43 -0800 From: Mark Millard <marklmi@yahoo.com> To: "jeff@freebsd.org" <jeff@FreeBSD.org> Cc: freebsd-arm@freebsd.org Subject: Re: RPi3 not using SMP? Message-ID: <E5DFA1BD-3BB4-4ED2-9E8C-E9CEA1CB0927@yahoo.com> In-Reply-To: <142E83E3-CD79-43B3-A86A-2AD45A778FB5@yahoo.com> References: <20200208011940.GA8570@www.zefox.net> <6B6CCB8F-B56A-4758-BEEC-6418718C95CB@yahoo.com> <9F1B762C-D1DA-40F6-A2D6-451B40A39E4A@yahoo.com> <CF5FC50C-FDE5-434C-BADC-2AE7D3348663@yahoo.com> <142E83E3-CD79-43B3-A86A-2AD45A778FB5@yahoo.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On 2020-Feb-7, at 21:58, Mark Millard <marklmi at yahoo.com> wrote: > [A note on avoiding a bad interpretation of > my evidence.] >=20 > On 2020-Feb-7, at 20:12, Mark Millard <marklmi at yahoo.com> wrote: >=20 >> On 2020-Feb-7, at 19:10, Mark Millard <marklmi at yahoo.com> wrote: >>=20 >>> [I now have some bounds on when PSCI_FN_VERSION >>> gets a 0 result from smc #0 instead of the correct >>> result: 2. Hopefully I can narrow the range more.] >>>=20 >>> On 2020-Feb-7, at 18:46, Mark Millard <marklmi at yahoo.com> wrote: >>>=20 >>>=20 >>>=20 >>>> On 2020-Feb-7, at 17:19, bob prohaska <fbsd at www.zefox.net> = wrote: >>>>=20 >>>>> For some weeks now an RPi3 running -current has seemed rather = slow.... >>>>>=20 >>>>> On looking at the early part of the boot message the processor >>>>> attributes seem rather scant: >>>>>=20 >>>>> ...... >>>>> elease APs...APs not started >>>>> CPU 0: ARM Cortex-A53 r0p4 affinity: 0 >>>>> Instruction Set Attributes 0 =3D <CRC32> >>>>> Instruction Set Attributes 1 =3D <> >>>>> Processor Features 0 =3D <AdvSIMD,FP,EL3 32,EL2 32,EL1 32,EL0 = 32> >>>>> Processor Features 1 =3D <> >>>>> Memory Model Features 0 =3D <TGran4,TGran64,SNSMem,BigEnd,16bit = ASID,1TB PA> >>>>> Memory Model Features 1 =3D <8bit VMID> >>>>> Memory Model Features 2 =3D <32bit CCIDX,48bit VA> >>>>> Debug Features 0 =3D <2 CTX BKPTs,4 Watchpoints,6 = Breakpoints,PMUv3,Debugv8> >>>>> Debug Features 1 =3D <> >>>>> Auxiliary Features 0 =3D <> >>>>> Auxiliary Features 1 =3D <> >>>>> CPU 1: (null) (null) r0p0 affinity: 0 >>>>> CPU 2: (null) (null) r0p0 affinity: 0 >>>>> CPU 3: (null) (null) r0p0 affinity: 0 >>>>> ............ >>>>> In a top window, STATE is reported as RUN, rather than the >>>>> former CPUn. >>>>>=20 >>>>> Is a software switch now required to enable multiprocessing? >>>>>=20 >>>>> Or, could it be related to the lines: >>>>> psci0: PSCI version number mismatched with DT=20 >>>>> as pointed out by Mark M in reference to the cpu_reset failed >>>>> problem, which is still manifest?=20 >>>>>=20 >>>>> The kernel is at r357644. >>>>=20 >>>> Head -r356767's kernel does not have this problem for RPi3/4 used = as >>>> aarch64 FreeBSD. >>>>=20 >>>> Head -r356776 and later all have this problem for both RPi3 and = RPi4. >>>>=20 >>>> Note: There are no head versions between those. >>>>=20 >>>> The console log shows evidence of the problem much earlier. >>>> Instead of saying: >>>>=20 >>>> psci0: <ARM Power State Co-ordination Interface Driver> on ofwbus0 >>>> psci0: PSCI version 0.2 compatible >>>>=20 >>>> (once) it says: >>>>=20 >>>> psci0: <ARM Power State Co-ordination Interface Driver> on ofwbus0 >>>> psci0: PSCI version number 0 mismatched with DT, default 2 >>>> device_attach: psci0 attach returned 6 >>>>=20 >>>> (and those 3 lines repeat in various places) for which none of >>>> them show up for -r356767 . >>>>=20 >>>> Without identifying and using PSCI, the extra cores will not >>>> start and the cpu(s) will not reset. (PSCI is the ARM interface >>>> for doing such things.) >>>>=20 >>>> I've no clue why, but the version number it gets in my RPi4 >>>> experiments is 0. My only guess is that at some point >>>> memory important to ARM's PSCI operation has been touched >>>> and is no longer valid for the PSCI operation. >>>=20 >>> I've been doing some crude printf-based information >>> gathering for RPi4B boots (based on head -r357529 just >>> because it is convenient in my context) and I have >>> learned some about the staging of when PSCI_FN_VERSION >>> works vs. when it is no longer working. >>>=20 >>> For the below text extraction from a boot . . . >>> Lines with: "PSCI_FN_VERSION 2" are working cases. >>> Lines with: "PSCI_FN_VERSION 0" are no-longer-working cases. >>>=20 >>> . . . >>> FreeBSD clang version 9.0.1 (git@github.com:llvm/llvm-project.git = c1a0a213378a458fbea1a5c77b315c7dce08fd05) (based on LLVM 9.0.1) >>> uma_startup1 start: PSCI_FN_VERSION 2 >>> uma_startup1 end: PSCI_FN_VERSION 2 >>> uma_startup2 start: PSCI_FN_VERSION 2 >>> uma_startup2 end: PSCI_FN_VERSION 2 >>> VT(efifb): resolution 1824x984 >>> module firmware already present! >>> psci_fdt_get_callfn: method 'smc' >>> psci_fdt_get_callfn: arm_smccc_hvc '0xffff0000008663b0' >>> psci_fdt_get_callfn: arm_smccc_smc '0xffff0000008663c8' >>> psci_init: PSCI_FN_VERSION 0 >>> Starting CPU 1 (1) >>> Starting CPU 2 (2) >>> Starting CPU 3 (3) >>> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs >>> random: unblocking device. >>> uma_startup3 start: PSCI_FN_VERSION 0 >>> uma_startup3 start: PSCI_FN_VERSION 0 >>> . . . >>>=20 >>> So sometime after uma_startup2 ends but before psci_init >>> sets up the normal use of arm_smccc_smc things are >>> messed up. >>>=20 >>> My guess is that one or more of the kernel memory >>> allocations ended up getting memory used by ARM's >>> PSCI implementation and the content was invalidated >>> for ARM's PSCI purposes. >>>=20 >>=20 >> The transition from working to not is before the >> debug.verbose_sysinit=3D1 output turns in to >> symbolic notation: >>=20 >> . . . >> subsystem 1800000 >> mi_startup: PSCI_FN_VERSION 2 >> 0xffff00000047e170(0)... done. >> mi_startup: PSCI_FN_VERSION 2 >> 0xffff0000007bda18(0)... done. >> mi_startup: PSCI_FN_VERSION 2 >> 0xffff0000004407e4(0)... done. >> mi_startup: PSCI_FN_VERSION 2 >> 0xffff0000004400c0(0xffff000000a75890)... done. >> mi_startup: PSCI_FN_VERSION 2 >> 0xffff0000004400c0(0xffff000000a76340)... done. >> . . . >> mi_startup: PSCI_FN_VERSION 2 >> 0xffff0000004400c0(0xffff000000aca328)... done. >> mi_startup: PSCI_FN_VERSION 2 >> 0xffff0000004400c0(0xffff000000aca378)... done. >> mi_startup: PSCI_FN_VERSION 0 >> 0xffff0000004400c0(0xffff000000aca5f8)... done. >> . . . >> mi_startup: PSCI_FN_VERSION 0 >> 0xffff0000004c62b0(0)... done. >> mi_startup: PSCI_FN_VERSION 0 >> 0xffff00000049e93c(0)... done. >> mi_startup: PSCI_FN_VERSION 0 >> linker_preload(0)... Preloaded elf kernel "/boot/kernel/kernel" at = 0xffff000001568000. >> Preloaded elf module "/boot/kernel/umodem.ko" at 0xffff000001571020. >> Preloaded elf module "/boot/kernel/ucom.ko" at 0xffff000001571838. >> Preloaded boot_entropy_cache "/boot/entropy" at 0xffff000001572010. >> module firmware already present! >> done. >> subsystem 1ffffff >> mi_startup: PSCI_FN_VERSION 0 >> ucom_init(0)... done. >> subsystem 2000000 >> mi_startup: PSCI_FN_VERSION 0 >> procs_show_all_add(0)... done. >> . . . >>=20 >>=20 >> It turns out that 0xffff0000004400c0 is: malloc_init . >>=20 >> It turns out that 0xffff000000aca378 is for: M_IFMADDR : >>=20 >> . . . >>=20 >=20 > Do not take the above as an indication of a stable stage for the > change to happen. As my activities change the memory allocation > patterns and the memory layout some (and possibly other sources > of variability are involved), where in the boot sequence moves > around. >=20 > For example, while the above was before the output > turned to symbolic notation, that need not be the > case in general. >=20 > So it is just an example of where is possible and limits a > kind of activity that is sufficient for the change in status > to happen, since it is a fairly specific example context. >=20 > The way things move around, I'm not likely to come up with > a narrower type of activity spanning the status change. >=20 > I have no evidence on if the Excluded Memory Regions > are sufficient or respected. >=20 > For reference, I'd used >=20 > set debug.verbose_sysinit=3D1 > boot -v >=20 > for the above example and my own printf's were > involved. And I based this testing on head > -r357529 instead of directly on -r356776 . The > kernel was a non-debug build (with symbols). >=20 FYI: I ran into a note on the web reporting that for the RPi4: The SoC does not seem to feature a secure memory controller of any kind, = so portions of DRAM can=E2=80=99t be protected properly from the = Non-secure world. ( = https://trustedfirmware-a.readthedocs.io/en/latest/plat/rpi4.html#tf-a-por= t-design ) =3D=3D=3D Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar)
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?E5DFA1BD-3BB4-4ED2-9E8C-E9CEA1CB0927>