Date: Sat, 30 Apr 2022 01:46:07 -0700
From: Mark Millard <marklmi@yahoo.com>
To: "freebsd-arm@freebsd.org" <arm@freebsd.org>
Subject: Re: FYI: RPi* firmware tagged 1.20210805 *is* the last to be bootable by FreeBSD via fdt use; sequence of 2 failure modes after that
Message-ID: <4231C088-0156-4BFF-8B7E-BEBE76CB15B5@yahoo.com>
In-Reply-To: <D6A51253-2F17-4EC4-9EEB-3AE9C899EE53@yahoo.com>
References: <F43C3E9B-D391-4278-B038-185DA7BF71B0@yahoo.com> <836019DE-531E-4B49-8A82-0D8F84885C21@yahoo.com> <8291972B-A953-4FD9-AA67-9F1F15AA3940@yahoo.com> <D6A51253-2F17-4EC4-9EEB-3AE9C899EE53@yahoo.com>

[1.20210805 works if I avoid deleting a file that I should not have deleted.]

On 2022-Apr-27, at 23:47, Mark Millard <marklmi@yahoo.com> wrote:

> [Just an FYI: I got ahold of the RPi3B and discovered that
> it was not bootable via RPi* firmware tagged 1.20210805 .

Turns out I had deleted a required firmware file from 1.20210805
without noticing. I finally noticed that I'd done so after finding
that 1.20210727 also did not work in the RPi3B --for the same
stupid-operator-error reason.

> . . . (junk text removed) . . .
>
> [I've not added to the below and have removed the long text
> block of RPi4B boot failure output.]
>
> On 2022-Apr-24, at 05:36, Mark Millard <marklmi@yahoo.com> wrote:
>
>> [I may have also found what leads to the extra messages for
>> the 2nd failure mode, an independent issue it turns out.]
>>
>> On 2022-Apr-24, at 04:37, Mark Millard <marklmi@yahoo.com> wrote:
>>
>>> [I think I found the reason for the boot crash that is
>>> a common failure to both failure modes. The 2nd mode
>>> has other issues I've not analyzed.]
>>>
>>> On 2022-Apr-23, at 23:45, Mark Millard <marklmi@yahoo.com> wrote:
>>>
>>>> The following is based on a microsd card with 13.1-RC4 on
>>>> it where I'd previously substituted my U-Boot 2022.04 build
>>>> and tested with the RPi* firmware that is in the 13.1-RC4
>>>> image. Here I've tried replacing the RPi* firmware and
>>>> holding the rest constant.
>>>>
>>>> The boot tests are on an 8 GiByte RPi4B Rev 1.14 with the
>>>> B0T stepping. I've not been copying over the linux kernels,
>>>> which they also bundle with the firmware.
>>>>
>>>> [13.1-RC4 is just what I happened to use. I doubt anything
>>>> here is special to 13.* or stable/13 or main [so: 14].
>>>> (I do not use 12.* or stable/12.)]
>>>>
>>>> The observed status went like . . .
>>>>
>>>>
>>>> firmware-1.20210805/boot/
>>>>
>>>> The RPi* release tagged 1.20210805 is the last version that
>>>> FreeBSD booted with.
>>>> (Other than booting, logging in, and
>>>> shutting down, I've not been testing other aspects of
>>>> operation.)
>>>>
>>>> From what I've read, firmware-1.20210805/boot/ should be
>>>> recent enough to handle the Rev 1.15 related PMIC variation.
>>>>
>>>> [I'll note that firmware build dates need not be the same day
>>>> as the date encoded into the tag --in fact it is usually some
>>>> earlier day. On rare occasion it can be a lot earlier, and
>>>> there is an example of that below.]
>>>>
>>>>
>>>> After firmware-1.20210805 there are 2 major failure modes.
>>>> Both stop at the same sort of point in the messaging --but
>>>> there is a huge difference in the count of earlier error
>>>> messages. It looks to me like all the issues require
>>>> FreeBSD changes if modern RPi* firmware/dtb's are to be
>>>> usable via fdt.
>>>
>>> I've noticed a difference between the working context and
>>> the failing ones (both failure modes).
>>>
>>> Failing:
>>>
>>> spi0: <BCM2708/2835 SPI controller> mem 0x7e204000-0x7e2041ff irq 18 on simplebus0
>>> spibus0: <OFW SPI bus> on spi0
>>> spibus0: <unknown card> at cs 0 mode 0
>>> spibus0: <unknown card> at cs 1 mode 0
>>> NOTE BELOW LINES MISSING HERE.
>>> sdhci_bcm0: <Broadcom 2708 SDHCI controller> mem 0x7e300000-0x7e3000ff irq 24 on simplebus0
>>>
>>> Working:
>>>
>>> spi0: <BCM2708/2835 SPI controller> mem 0x7e204000-0x7e2041ff irq 18 on simplebus0
>>> spibus0: <OFW SPI bus> on spi0
>>> spibus0: <unknown card> at cs 0 mode 0
>>> spibus0: <unknown card> at cs 1 mode 0
>>> START LINES MISSING ABOVE
>>> iichb0: <BCM2708/2835 BSC controller> mem 0x7e804000-0x7e804fff irq 26 on simplebus0
>>> bcm_dma0: <BCM2835 DMA Controller> mem 0x7e007000-0x7e007aff irq 30,31,32,33,34,35,36,37,38,39,40 on simplebus0
>>> bcmwd0: <BCM2708/2835 Watchdog> mem 0x7e100000-0x7e100113,0x7e00a000-0x7e00a023,0x7ec11000-0x7ec1101f on simplebus0
>>> bcmrng0: <Broadcom BCM2835/BCM2838 RNG> mem 0x7e104000-0x7e104027 on simplebus0
>>> gpioc1: <GPIO controller> on gpio1
>>> END LINES MISSING ABOVE
>>> sdhci_bcm0: <Broadcom 2708 SDHCI controller> mem 0x7e300000-0x7e3000ff irq 73 on simplebus0
>>>
>>> In particular:
>>>
>>> bcm_dma0: <BCM2835 DMA Controller> mem 0x7e007000-0x7e007aff irq 30,31,32,33,34,35,36,37,38,39,40 on simplebus0
>>>
>>> being missing means no bcm_dma_attach and that in turn means
>>> that the static bcm_dma_sc == NULL still.
>>>
>>> The panic was: panic: vm_fault failed: ffff000000862134
>>>
>>> where:
>>>
>>> ffff000000862134 <bcm_dma_allocate+0x88> ldaxr x1, [x9]
>>>
>>> which is part of:
>>>
>>> int
>>> bcm_dma_allocate(int req_ch)
>>> {
>>>         struct bcm_dma_softc *sc = bcm_dma_sc;
>>>         int ch = BCM_DMA_CH_INVALID;
>>>         int i;
>>>
>>>         if (req_ch >= BCM_DMA_CH_MAX)
>>>                 return (BCM_DMA_CH_INVALID);
>>>
>>>         /* Auto(req_ch < 0) or CH specified */
>>>         mtx_lock(&sc->sc_mtx);
>>> . . .
>>>
>>> So the likes of &sc->sc_mtx end up being a small offset
>>> from address zero:
>>>
>>> x9: 20
>>>
>>> Thus the panic.
>>>
>>> As to how bcm_dma_allocate happened without bcm_dma_attach
>>> happening first . . .
>>>
>>> The working context's dtb has the ordering:
>>> (I also show mmcnr@ and the brcm,bcm2711-dma
>>> just for reference.)
>>>
>>> dma@7e007000 {
>>>     compatible = "brcm,bcm2835-dma";
>>> . . .
>>> mmc@7e300000 {
>>>     compatible = "brcm,bcm2835-mmc", "brcm,bcm2835-sdhci";
>>> . . .
>>> mmcnr@7e300000 {
>>>     compatible = "brcm,bcm2835-mmc", "brcm,bcm2835-sdhci";
>>> . . .
>>> dma@7e007b00 {
>>>     compatible = "brcm,bcm2711-dma";
>>>
>>> But the failing context's dtb has the ordering:
>>> (I also show mmcnr@ and the brcm,bcm2711-dma
>>> just for reference.)
>>>
>>> mmc@7e300000 {
>>>     compatible = "brcm,bcm2835-mmc", "brcm,bcm2835-sdhci";
>>> . . .
>>> dma@7e007000 {
>>>     compatible = "brcm,bcm2835-dma";
>>> . . .
>>> mmcnr@7e300000 {
>>>     compatible = "brcm,bcm2835-mmc", "brcm,bcm2835-sdhci";
>>> . . .
>>> dma@7e007b00 {
>>>     compatible = "brcm,bcm2711-dma";
>>>
>>> So, for sequential handling in the failing case, the mmc@7e300000
>>> would use bcm_dma_allocate before the bcm_dma_probe/bcm_dma_attach
>>> sequence for dma@7e007000 had happened, leading to the crash.
>>>
>>> Note: I used "fdt print /" from U-Boot to get the dtb and its
>>> ordering. This was based on the address that the RPi* firmware
>>> reports when debugging output is enabled (0x4000 here).
>>>
>>>
>>>> The 1st mode happens for (I've added the -fails notation):
>>>>
>>>> firmware-1.20210831-fails/boot/
>>>> firmware-1.20210928-fails/boot/
>>>> firmware-1.20211007-fails/boot/
>>>> firmware-1.20211029-fails/boot/
>>>> firmware-1.20211118-fails/boot/
>>>> firmware-1.20220308_buster-fails/boot/
>>>> (The _buster one has firmware from 2021-Dec-01, which
>>>> is before all the tagged releases listed below.
>>>> It looks like the switch to the new major kernel
>>>> version after buster came with other changes that
>>>> FreeBSD has not tracked.)
>>>>
>>>>
>>>> The 2nd mode happens for the following. (Again with extra
>>>> notation.)
>>>> There are a lot more error messages before the
>>>> panic happens for these. The firmware builds for these
>>>> are more recent than for the above list.
>>>>
>>>>
>>>> firmware-1.20220118-fails/boot/
>>>>
>>>> firmware-1.20220120-fails/boot/
>>>> firmware-1.20220308-fails-non-kernels-same-as-1.20220120/boot/
>>>> (I did not repeat the testing of the unchanged firmware.
>>>> I just did the "diff -r" to discover the lack of change.)
>>>>
>>>> firmware-1.20220328-fails/boot/
>>>> firmware-1.20220331-fails-non-kernels-same-as-firmware-1.20220328-but-for-bcm2711-dtb-files/boot/
>>>> (Since the .dtb for the RPi4B was different, I did test this.)
>>
>> It looks like the extra messages, blocks of:
>>
>> clk_fixed4: <Fixed clock> disabled on ofwbus0
>> clk_fixed4: Cannot FDT parameters.
>> device_attach: clk_fixed4 attach returned 6
>>
>> Are tied to new dtb content in 2022's dtb updates:
>>
>> cam1_clk {
>>     compatible = "fixed-clock";
>>     #clock-cells = <0x00000000>;
>>     status = "disabled";
>>     phandle = <0x000000e2>;
>> };
>> . . .
>> cam0_clk {
>>     compatible = "fixed-clock";
>>     #clock-cells = <0x00000000>;
>>     status = "disabled";
>>     phandle = <0x000000e4>;
>> };
>>
>> These 2 did not exist back when the 1st failure mode
>> started. They appear to be repeatedly processed from
>> not really being handled --leading to lots of
>> messages.
>>
>> The messages may just be noise for activity that is
>> not contributing to boot failures at all. So fixing
>> what I called the 1st failure mode might actually fix
>> booting for all the firmware versions after the
>> version tagged 1.20210805 .
>>
>>>> The failures look like (each test shown) . . .
>>>>
>>>>
>>>> . . .
>>>
>

===
Mark Millard
marklmi at yahoo.com
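
For illustration only, here is a minimal standalone sketch of the failure pattern described above: an allocate routine that depends on a static softc pointer which only an attach routine sets. This is not FreeBSD's bcm2835_dma code. Every demo_* and DEMO_* name is hypothetical, the 32-byte padding is arbitrary, and the NULL check is just one conceivable defensive measure; it does not address the dtb ordering itself.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CH_INVALID (-1)
#define DEMO_CH_MAX     16

/* Hypothetical stand-in for a driver softc; layout is made up. */
struct demo_softc {
    char     pad[32];    /* arbitrary padding before the lock */
    int      lock;       /* stand-in for sc_mtx (unused in this demo) */
    unsigned used_mask;  /* one bit per allocated channel */
};

/* Stays NULL until demo_attach() runs. */
static struct demo_softc *demo_sc = NULL;
static struct demo_softc demo_storage;

/* With a NULL softc, &sc->lock would be this small offset from zero. */
size_t demo_lock_offset(void)
{
    return offsetof(struct demo_softc, lock);
}

void demo_attach(void)
{
    assert(demo_sc == NULL);  /* attach is expected to run once */
    demo_sc = &demo_storage;
}

int demo_allocate(int req_ch)
{
    struct demo_softc *sc = demo_sc;
    int ch;

    if (req_ch >= DEMO_CH_MAX)
        return DEMO_CH_INVALID;

    /* Guard: attach has not happened yet, so fail instead of faulting. */
    if (sc == NULL)
        return DEMO_CH_INVALID;

    /* Locking is omitted; this demo is single-threaded. */
    if (req_ch < 0) {
        /* Auto: take the first free channel. */
        for (ch = 0; ch < DEMO_CH_MAX; ch++) {
            if ((sc->used_mask & (1u << ch)) == 0) {
                sc->used_mask |= 1u << ch;
                return ch;
            }
        }
        return DEMO_CH_INVALID;
    }

    /* Specific channel requested: succeed only if it is free. */
    if (sc->used_mask & (1u << req_ch))
        return DEMO_CH_INVALID;
    sc->used_mask |= 1u << req_ch;
    return req_ch;
}
```

Calling demo_allocate(-1) before demo_attach() returns DEMO_CH_INVALID rather than dereferencing NULL. A caller that attaches first would still need to handle that return value, so a guard like this only turns the panic into an error path.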