Date:      Wed, 29 Sep 2021 19:27:53 +0200
From:      Emmanuel Vadot <manu@bidouilliste.com>
To:        Andriy Gapon <avg@FreeBSD.org>
Cc:        "freebsd-arm@freebsd.org" <arm@FreeBSD.org>
Subject:   Re: rock64 verbose boot hangs
Message-ID:  <20210929192753.449ad9a061366ea5e19d735e@bidouilliste.com>
In-Reply-To: <4d24bb8a-0ffe-9073-7863-e83025ffc4fa@FreeBSD.org>
References:  <fffc32e5-9f0d-3ead-12da-4a6d108a6154@FreeBSD.org> <20210920190213.5839f18816daf1f6e4289b94@bidouilliste.com> <d854481f-a322-a37b-a80c-aaf291f9efdc@FreeBSD.org> <4d24bb8a-0ffe-9073-7863-e83025ffc4fa@FreeBSD.org>

On Wed, 29 Sep 2021 20:07:25 +0300
Andriy Gapon <avg@FreeBSD.org> wrote:

> On 23/09/2021 20:46, Andriy Gapon wrote:
> > On 20/09/2021 20:02, Emmanuel Vadot wrote:
> >>
> >>  Hi Andriy,
> >>
> >> On Sat, 18 Sep 2021 15:58:00 +0300
> >> Andriy Gapon <avg@FreeBSD.org> wrote:
> >>
> >>>
> >>> Normal boot works every time, but with boot_verbose="YES" it hung on all
> >>> attempts so far.
> >>>
> >>> Last messages on the console:
> >>> cpulist0: <Open Firmware CPU Group> on ofwbus0
> >>> cpu0: <Open Firmware CPU> on cpulist0
> >>> cpu0: Nominal frequency 600Mhz
> >>> cpufreq_dt0: <Generic cpufreq driver> on cpu0
> >>> cpufreq_dt0: 408.000 Mhz (950000 uV)
> >>> cpufreq_dt0: 600.000 Mhz (950000 uV)
> >>> cpufreq_dt0: 816.000 Mhz (1000000 uV)
> >>> cpufreq_dt0: 1008.000 Mhz (1100000 uV)
> >>> cpufreq_dt0: 1200.000 Mhz (1225000 uV)
> >>> cpufreq_dt0: 1296.000 Mhz (1300000 uV)
> >>> cpu1: <Open Firmware CPU> on cpulist0
> >>> cpu1: Nominal frequency 600Mhz
> >>> cpufreq_dt1: <Generic cpufreq driver> on cpu1
> >>>
> >>> The kernel is totally unresponsive after that.
> >>
> >>  Can't reproduce here, I'm running 548a706608d with latest DTB and
> >> latest u-boot/atf
> >>
> >>> Any suggestions on how to debug this?
> >>
> >>  Not really sure how to start, it seems weird that the kernel would
> >> hang at the cpufreq attach but maybe try modifying the DTB to remove
> >> this node?
> >>  Also did that happen with my recent commit on clock or was this the
> >> same before?
>
> An update relevant to the question above.
> Actually, after upgrading to a version that includes your clock changes the
> problem went away!
> I don't know what to make out of this fact, but it looks like the problem was a
> clock plus timing issue.

 I'm not that surprised. Before my clock changes netboot always failed
in a really weird way where AP couldn't be started and the serial
output was switching chars around (Like "cuolt'd rsart AP").
 So I'm glad that it fixed your problems because I had really no idea
how to debug that :P

> > Thank you and everyone else who responded with information and suggestions.
> >
> > Some extra details.
> > I've been having this problem since I got this board 9 months ago.
> > It's been through several upgrades of FreeBSD, U-Boot, and the files in the ESP
> > partition, and the problem was always present.
> >
> > Now I've done more extensive testing with a couple of dozen reboots in a row and
> > some additional debug prints (like, for example, DEBUG in subr_bus.c).
> >
> > I actually see several variations of the problem.
> > Sometimes it's a hang, but sometimes it's a crash.
> > A hang can happen in different places and a crash can happen in different places
> > too.
> > Some crashes happen during AP startup and the information I am getting is not
> > very usable.
> > Some crashes happen during driver probing when the bus code searches the hints
> > memory space.  Those crashes look like memory corruption happening there at random.
> >
> > Given those variations plus some other differences that I have compared to
> > other Rock64 users (like needing special setup for eMMC and for the watchdog), I
> > am inclined to think that the board I have has something special either in the
> > hardware (like a different configuration via some fuses) or in the BootROM.
> > Even though the PCB has the standard markings.
> >
> > And I would not be surprised about that (that it could be a customized
> > production) as I got my Rock64-s via a special / unusual deal on Amazon.
> > Iconikal and Recon Sentinal are keywords to search for, for those interested.
> > Some news articles from the time:
> > https://liliputing.com/2020/09/this-10-single-board-computer-is-faster-than-a-raspberry-pi-3.html
> >
> > https://www.tomshardware.com/news/raspberry-pi-sized-iconikal-rockchip-sbc-only-dollar8-on-amazon
> >
> >
> > So, in the end, I still do not know what causes the verbose boot to hang / crash.
> > Maybe there is some (not fully working) watchdog that gets armed and disarmed by
> > some hardware accesses and the verbose boot is too slow to complete in time.
> >
> > Here is a small subset of panics and hangs that I saw:
> > https://people.freebsd.org/~avg/rock64-verbose-boot-panic.txt
> >
>
>
> --
> Andriy Gapon
>


--
Emmanuel Vadot <manu@bidouilliste.com> <manu@FreeBSD.org>


