Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 29 Mar 2020 00:52:26 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Marek Zarychta <zarychtam@plan-b.pwste.edu.pl>
Cc:        freebsd-arm <freebsd-arm@freebsd.org>
Subject:   Re: FYI: head -r358966 -> -r359376 and RPi4: -r359376 fails to boot (unless I use boot -v) where -r358966 booted fine before update
Message-ID:  <512454AB-47CC-49EC-9E67-1111C7AAC471@yahoo.com>
In-Reply-To: <eff57f3d-097f-ce15-f919-61f1971e8eb0@plan-b.pwste.edu.pl>
References:  <B501E3CD-A76E-4D9F-A7AA-70F2D2087BBC@yahoo.com> <147DDCEF-C081-4237-A81E-AEBCD71AB016@yahoo.com> <AB6A78AD-DB89-446E-B150-CA3AC8BE0B67@yahoo.com> <eff57f3d-097f-ce15-f919-61f1971e8eb0@plan-b.pwste.edu.pl>

next in thread | previous in thread | raw e-mail | index | archive | help


On 2020-Mar-29, at 00:12, Marek Zarychta <zarychtam at =
plan-b.pwste.edu.pl> wrote:

> W dniu 29.03.2020 o 04:48, Mark Millard via freebsd-arm pisze:
>> [Just correcting a persistent version number typo:
>> head -r359376 is correct, not -r359736 . Subject
>> corrected too.]
>>=20
>> On 2020-Mar-28, at 18:18, Mark Millard <marklmi atyahoo.com> wrote:
>>=20
>>> On 2020-Mar-28, at 17:14, Mark Millard <marklmi at yahoo.com> wrote:
>>>=20
>>>> I use a microsd card that is set up for booting both
>>>> a Rock64 and a RPi4: the dd'd u-boot vs. the RPi4
>>>> specific materials are in independent places and
>>>> the rest is shared and rather generic.
>>>>=20
>>>> So at head -r358966 I'd been able to both the
>>>> Rock64 and the RPi4 from the same media.
>>>>=20
>>>> Now with head -r359736 in place instead:
>> Make that: -r358376 .
>>=20
>>>> A) The Rock64 boots via that media just fine.
>>>>=20
>>>> B) The RPi4 fails to boot (nothing special
>>>> like "boot -v").
>>>>=20
>>>> C) The RPi4 with "boot -v" boots just fine.
>>>> (This makes identifying the issue non-obvious.)
>>>>=20
>>> Booting the old kernel seems to consistently
>>> work (unload, load, boot sequence).
>>>=20
>>> boot -v of the new kernel can fail.
>>>=20
>>> Plain boot of the new kernel can on occasion
>>> boot.
>>>=20
>>> This makes for more comparable output
>>> difference checking . . .
>>>=20
>>>=20
>>> Dealing with pain boot 1st (then I'll
>>> show the boot -v comparison), I show
>>> just differences in the captured output
>>> . . .
>>>=20
>>>=20
>>> EFI boot manager: Cannot load any image
>>> 679248 bytes read in 91 ms (7.1 MiB/s)
>>> libfdt fdt_check_header(): FDT_ERR_BADMAGIC
>>>=20
>>> vs.
>>>=20
>>> EFI boot manager: Cannot load any image
>>> 679248 bytes read in 90 ms (7.2 MiB/s)
>>> libfdt fdt_check_header(): FDT_ERR_BADMAGIC
>>>=20
>>>=20
>>> Booting [/boot/kernel/kernel]...              =20
>>>=20
>>> vs.
>>>=20
>>> Booting [/boot/kernel/kernel] in 9 seconds...=20
>>>=20
>>> Type '?' for a list of commands, 'help' for more detailed help.
>>> OK boot
>>>=20
>>>=20
>>> Extra lines on "it boots" case, after the first 2
>>> "REGSITER DUMP"s, starting inside the 3rd
>>>=20
>>> sdhci_bcm1-slot0: Argument: 0x000001aa | Trn mode: 0x00000000
>>> sdhci_bcm1-slot0: Present:  0x000f0000 | Host ctl: 0x00000001
>>> sdhci_bcm1-slot0: Power:    0x0000000f | Blk gap:  0x00000000
>>> sdhci_bcm1-slot0: Wake-up:  0x00000000 | Clock:    0x0000fa07
>>> sdhci_bcm1-slot0: Timeout:  0x00000000 | Int stat: 0x00000000
>>> sdhci_bcm1-slot0: Int enab: 0x01ff00bb | Sig enab: 0x01ff00bb
>>> sdhci_bcm1-slot0: AC12 err: 0x00000000 | Host ctl2:0x00000000
>>> sdhci_bcm1-slot0: Caps:     0x00000000 | Caps2:    0x00000000
>>> sdhci_bcm1-slot0: Max curr: 0x00000001 | ADMA err: 0x00000000
>>> sdhci_bcm1-slot0: ADMA addr:0x00000000 | Slot int: 0x00000000
>>> sdhci_bcm1-slot0: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=

>>> sdhci_bcm1-slot0: Got command interrupt 0x00030000, but there is no =
active command.
>>> sdhci_bcm1-slot0: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =
REGISTER DUMP =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
>>> sdhci_bcm1-slot0: Sys addr: 0x00000000 | Version:  0x00009902
>>> sdhci_bcm1-slot0: Blk size: 0x00000000 | Blk cnt:  0x00000000
>>> sdhci_bcm1-slot0: Argument: 0x000001aa | Trn mode: 0x00000000
>>> sdhci_bcm1-slot0: Present:  0x000f0000 | Host ctl: 0x00000001
>>> sdhci_bcm1-slot0: Power:    0x0000000f | Blk gap:  0x00000000
>>> sdhci_bcm1-slot0: Wake-up:  0x00000000 | Clock:    0x0000fa07
>>> sdhci_bcm1-slot0: Timeout:  0x00000000 | Int stat: 0x00000000
>>> sdhci_bcm1-slot0: Int enab: 0x01ff00bb | Sig enab: 0x01ff00bb
>>> sdhci_bcm1-slot0: AC12 err: 0x00000000 | Host ctl2:0x00000000
>>> sdhci_bcm1-slot0: Caps:     0x00000000 | Caps2:    0x00000000
>>> sdhci_bcm1-slot0: Max curr: 0x00000001 | ADMA err: 0x00000000
>>> sdhci_bcm1-slot0: ADMA addr:0x00000000 | Slot int: 0x00000000
>>> sdhci_bcm1-slot0: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=

>>> sdhci_bcm1-slot0: Got command interrupt 0x00030000, but there is no =
active command.
>>> sdhci_bcm1-slot0: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =
REGISTER DUMP =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
>>> sdhci_bcm1-slot0: Sys addr: 0x00000000 | Version:  0x00009902
>>> sdhci_bcm1-slot0: Blk size: 0x00000000 | Blk cnt:  0x00000000
>>>=20
>>>=20
>>> Extra lines on the "it fails to boot" side:
>>>=20
>>> sdhci_bcm0-slot0: Controller timeout
>>> sdhci_bcm0-slot0: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =
REGISTER DUMP =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
>>> sdhci_bcm0-slot0: Sys addr: 0x000006c8 | Version:  0x00001002
>>> sdhci_bcm0-slot0: Blk size: 0x00000200 | Blk cnt:  0x00000001
>>> sdhci_bcm0-slot0: Argument: 0x0ee2afc1 | Trn mode: 0x00000036
>>> sdhci_bcm0-slot0: Present:  0x1fff0a06 | Host ctl: 0x00000007
>>> sdhci_bcm0-slot0: Power:    0x0000000f | Blk gap:  0x00000080
>>> sdhci_bcm0-slot0: Wake-up:  0x00000000 | Clock:    0x00000107
>>> sdhci_bcm0-slot0: Timeout:  0x00000003 | Int stat: 0x00000000
>>> sdhci_bcm0-slot0: Int enab: 0x01ff003b | Sig enab: 0x01ff003b
>>> sdhci_bcm0-slot0: AC12 err: 0x00000000 | Host ctl2:0x00000000
>>> sdhci_bcm0-slot0: Caps:     0x45ee6432 | Caps2:    0x0000a525
>>> sdhci_bcm0-slot0: Max curr: 0x00080008 | ADMA err: 0x00000000
>>> sdhci_bcm0-slot0: ADMA addr:0x00000000 | Slot int: 0x00000000
>>> sdhci_bcm0-slot0: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=

>>> mmcsd0: Error indicated: 1 Timeout
>>> mmcsd0: Error indicated: 1 Timeout
>>> mmcsd0: Error indicated: 1 Timeout
>>> mmcsd0: Error indicated: 1 Timeout
>>> mmcsd0: Error indicated: 1 Timeout
>>>=20
>>>=20
>>> After one "Root mount waiting for: CAM"
>>> that both have, the failing side has:
>>>=20
>>> Mounting from ufs:/dev/label/RPi4root failed with error 2; retrying =
for 10 more seconds
>>> Mounting from ufs:/dev/label/RPi4root failed with error 2.
>>>=20
>>> Loader variables:
>>> vfs.root.mountfrom=3Dufs:/dev/label/RPi4root
>>> vfs.root.mountfrom.options=3Drw,noatime
>>>=20
>>> Manual root filesystem specification:
>>> <fstype>:<device> [options]
>>>     Mount <device> using filesystem <fstype>
>>>     and with the specified (optional) option list.
>>>=20
>>>   eg. ufs:/dev/da0s1a
>>>       zfs:zroot/ROOT/default
>>>       cd9660:/dev/cd0 ro
>>>         (which is equivalent to: mount -t cd9660 -o ro /dev/cd0 /)
>>>=20
>>> ?               List valid disk boot devices
>>> .               Yield 1 second (for background tasks)
>>> <empty line>    Abort manual input
>>>=20
>>> mountroot> ?
>>>=20
>>> List of GEOM managed disk devices:
>>> mmcsd0
>>>=20
>>> mountroot>=20
>>>=20
>>>=20
>>> As for the "it boots" side of the comparison:
>>>=20
>>> Root mount waiting for: CAM
>>> Root mount waiting for: CAM
>>> Root mount waiting for: CAM
>>> Root mount waiting for: CAM
>>> Root mount waiting for: CAM
>>> Root mount waiting for: CAM
>>> Root mount waiting for: CAM
>>> Root mount waiting for: CAM
>>> Root mount waiting for: CAM
>>> Warning: no time-of-day clock registered, system time will not be =
set accurately
>>> Setting hostuuid: a4f7fbeb-f668-11de-b280-ebb65474e619.
>>> Setting hostid: 0xcd8e9e25.
>>> Starting file system checks:
>>> /dev/label/RPi4root: FILE SYSTEM CLEAN; SKIPPING CHECKS
>>> /dev/label/RPi4root: clean, 19046293 free (498933 frags, 2318420 =
blocks, 1.8% fragmentation)
>>>=20
>>> (And so on.)
>>>=20
>>>=20
>>>=20
>>> By contrast, the failing boot -v
>>> comparison goes like (not much is
>>> different between the two boot -v
>>> instances) . . .
>>>=20
>>>=20
>>> The working one had a 3rd REGISTER DUMP
>>> before the mmc0 bus width notice that the
>>> failing one did not have:
>>>=20
>>> sdhci_bcm1-slot0: Got command interrupt 0x00030000, but there is no =
active command.
>>> sdhci_bcm1-slot0: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =
REGISTER DUMP =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
>>> sdhci_bcm1-slot0: Sys addr: 0x00000000 | Version:  0x00009902
>>> sdhci_bcm1-slot0: Blk size: 0x00000000 | Blk cnt:  0x00000000
>>> sdhci_bcm1-slot0: Argument: 0x000001aa | Trn mode: 0x00000000
>>> sdhci_bcm1-slot0: Present:  0x000f0000 | Host ctl: 0x00000001
>>> sdhci_bcm1-slot0: Power:    0x0000000f | Blk gap:  0x00000000
>>> sdhci_bcm1-slot0: Wake-up:  0x00000000 | Clock:    0x0000fa07
>>> sdhci_bcm1-slot0: Timeout:  0x00000000 | Int stat: 0x00000000
>>> sdhci_bcm1-slot0: Int enab: 0x01ff00bb | Sig enab: 0x01ff00bb
>>> sdhci_bcm1-slot0: AC12 err: 0x00000000 | Host ctl2:0x00000000
>>> sdhci_bcm1-slot0: Caps:     0x00000000 | Caps2:    0x00000000
>>> sdhci_bcm1-slot0: Max curr: 0x00000001 | ADMA err: 0x00000000
>>> sdhci_bcm1-slot0: ADMA addr:0x00000000 | Slot int: 0x00000000
>>> sdhci_bcm1-slot0: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=

>>>=20
>>> where both then had:
>>>=20
>>> mmc0: setting bus width to 4 bits high speed timing
>>>=20
>>>=20
>>> The failing boot -v ended with:
>>>=20
>>> CPU  1: ARM Cortex-A72 r0p3 affinity:  1
>>> CPU  2: ARM Cortex-A72 r0p3 affinity:  2
>>> CPU  3: ARM Cortex-A72 r0p3 affinity:  3
>>> regulator: shutting down unused regulators
>>> sdhci_bcm0-slot0: Controller timeout
>>> sdhci_bcm0-slot0: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =
REGISTER DUMP =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
>>> sdhci_bcm0-slot0: Sys addr: 0x000006c8 | Version:  0x00001002
>>> sdhci_bcm0-slot0: Blk size: 0x00000200 | Blk cnt:  0x00000001
>>> sdhci_bcm0-slot0: Argument: 0x0ee2afc1 | Trn mode: 0x00000036
>>> sdhci_bcm0-slot0: Present:  0x1fff0a06 | Host ctl: 0x00000007
>>> sdhci_bcm0-slot0: Power:    0x0000000f | Blk gap:  0x00000080
>>> sdhci_bcm0-slot0: Wake-up:  0x00000000 | Clock:    0x00000107
>>> sdhci_bcm0-slot0: Timeout:  0x00000003 | Int stat: 0x00000000
>>> sdhci_bcm0-slot0: Int enab: 0x01ff003b | Sig enab: 0x01ff003b
>>> sdhci_bcm0-slot0: AC12 err: 0x00000000 | Host ctl2:0x00000000
>>> sdhci_bcm0-slot0: Caps:     0x45ee6432 | Caps2:    0x0000a525
>>> sdhci_bcm0-slot0: Max curr: 0x00080008 | ADMA err: 0x00000000
>>> sdhci_bcm0-slot0: ADMA addr:0x00000000 | Slot int: 0x00000000
>>> sdhci_bcm0-slot0: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=

>>> mmcsd0: Error indicated: 1 Timeout
>>> mmcsd0: Error indicated: 1 Timeout
>>> mmcsd0: Error indicated: 1 Timeout
>>> mmcsd0: Error indicated: 1 Timeout
>>> mmcsd0: Error indicated: 1 Timeout
>>> Mounting from ufs:/dev/label/RPi4root failed with error 2; retrying =
for 10 more seconds
>>> Mounting from ufs:/dev/label/RPi4root failed with error 2; retrying =
for 9 more seconds
>>> Mounting from ufs:/dev/label/RPi4root failed with error 2; retrying =
for 8 more seconds
>>> Mounting from ufs:/dev/label/RPi4root failed with error 2; retrying =
for 7 more seconds
>>> Mounting from ufs:/dev/label/RPi4root failed with error 2; retrying =
for 6 more seconds
>>> Mounting from ufs:/dev/label/RPi4root failed with error 2; retrying =
for 5 more seconds
>>> Mounting from ufs:/dev/label/RPi4root failed with error 2; retrying =
for 4 more seconds
>>> Mounting from ufs:/dev/label/RPi4root failed with error 2; retrying =
for 3 more seconds
>>> Mounting from ufs:/dev/label/RPi4root failed with error 2; retrying =
for 2 more seconds
>>> Mounting from ufs:/dev/label/RPi4root failed with error 2; retrying =
for 1 more second
>>> Mounting from ufs:/dev/label/RPi4root failed with error 2.
>>>=20
>>> Loader variables:
>>> vfs.root.mountfrom=3Dufs:/dev/label/RPi4root
>>> vfs.root.mountfrom.options=3Drw,noatime
>>>=20
>>> Manual root filesystem specification:
>>> <fstype>:<device> [options]
>>>     Mount <device> using filesystem <fstype>
>>>     and with the specified (optional) option list.
>>>=20
>>>   eg. ufs:/dev/da0s1a
>>>       zfs:zroot/ROOT/default
>>>       cd9660:/dev/cd0 ro
>>>         (which is equivalent to: mount -t cd9660 -o ro /dev/cd0 /)
>>>=20
>>> ?               List valid disk boot devices
>>> .               Yield 1 second (for background tasks)
>>> <empty line>    Abort manual input
>>>=20
>>> mountroot>=20
>>>=20
>>>=20
>>> The working boot -v instead had for that last area of the
>>> above output:
>>>=20
>>> CPU  1: ARM Cortex-A72 r0p3 affinity:  1
>>> CPU  2: ARM Cortex-A72 r0p3 affinity:  2
>>> CPU  3: ARM Cortex-A72 r0p3 affinity:  3
>>> Root mount waiting for:regulator: shutting down unused regulators
>>> CAM
>>> Root mount waiting for: CAM
>>> Root mount waiting for: CAM
>>> Root mount waiting for: CAM
>>> Root mount waiting for: CAM
>>> Root mount waiting for: CAM
>>> Root mount waiting for: CAM
>>> Root mount waiting for: CAM
>>> Root mount waiting for: CAM
>>> Root mount waiting for: CAM
>>> Warning: no time-of-day clock registered, system time will not be =
set accurately
>>> start_init: trying /sbin/init
>>> Setting hostuuid: a4f7fbeb-f668-11de-b280-ebb65474e619.
>>> Setting hostid: 0xcd8e9e25.
>>> Starting file system checks:
>>> /dev/label/RPi4root: FILE SYSTEM CLEAN; SKIPPING CHECKS
>>> /dev/label/RPi4root: clean, 19057570 free (498234 frags, 2319917 =
blocks, 1.8% fragmentation)
>>>=20
>>> (I omit the rest.)
>>>=20
>>>=20
>>> That is it for output differences for
>>> boot -v.
>>>=20
>>>=20
>>> I'll note that "shutdown -r now" does not reboot
>>> but just stops after the "Uptime:" message line.
>>> I do not expect that this is new.
>>=20
>>=20
>>=20
>> =3D=3D=3D
>> Mark Millard
>> marklmi at yahoo.com
>> ( dsl-only.net went
>> away in early 2018-Mar)
>>=20
>> _______________________________________________
>> freebsd-arm@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-arm
>> To unsubscribe, send any mail to =
"freebsd-arm-unsubscribe@freebsd.org"
>=20
> This is probably RPi4 specific only. I have just upgraded Pine64LTS to
> recent CURRENT and rebooted without issues.
>=20

Actually, I've now also reported boot problems on
an OrgangePi+ 2e (armv7) and did a kernel bisect that
showed the OPi+2e problem starting at head -r359311
(and continuing on after that). ( -r359309 and before
boots the OPi+2e just fine. )

Head -r359311 was an attempt to move various atomics
into the "MI-namesace". It suggests that at least one
atomic (or its use) is now broken on armv7. There
might be more or some non-armv7 examples as well.

With any broken atomics, I'd not necessarily expect
a detailed behavioral match across the likes of
OPi+2e vs. RPi4. The RPi4 might have hit such an
issue. But I've not substituted artifact.freebsd.org
kernels to test such (yet?) on the RPi4. The RPi4
is far from complete and not expected to be
generally operational at this time. It does not
make for a great debugging context as stands.


I've not had such problems for the Rock64 or RPi3
that had the intended version upgrade. But it could
also be that whatever breakage for either of them
is just not as obvious.

So, now I've got to figure out what I'm going to do
for the other machines that were to have matching
version upgrades.

=3D=3D=3D
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?512454AB-47CC-49EC-9E67-1111C7AAC471>