Date:      Fri, 9 Sep 2016 13:21:31 -0700
From:      Mark Millard <markmi@dsl-only.net>
To:        Krzysztof Parzyszek <kristof@swissmail.org>
Cc:        Jukka Ukkonen <jau789@gmail.com>, freebsd-ppc@freebsd.org
Subject:   Re: PowerMac G5 hangs/crashes on boot: 10.2, 11.0-RCx
Message-ID:  <0A9EB3C7-F430-4F82-9B09-632754BB82C8@dsl-only.net>
In-Reply-To: <db0aa91b-aa79-689a-e901-437e18b49b81@swissmail.org>
References:  <6ad00a2d-4213-18b8-7974-534aa3758837@swissmail.org> <E90BB066-47C9-4626-BE6C-5D15ECA0E4EE@gmail.com> <db0aa91b-aa79-689a-e901-437e18b49b81@swissmail.org>


On 2016-Sep-9, at 11:36 AM, Krzysztof Parzyszek <kristof T swissmail.org> wrote:
>
> On 9/9/2016 6:35 AM, Jukka Ukkonen wrote:
>>
>> The story apparently goes such that the interrupt code shown can be
>> pretty much anything. The interrupts might simply be enabled way before
>> the system is ready to handle them.
>
> I've had similar issues for quite some time.  Previous releases would
> boot only sometimes, otherwise I'd be getting a hang or a crash.  The
> frequency of the boot problems seems to increase dramatically when I
> boot from the hard-drive, but with 11 it has never booted correctly.
>
> I wasn't the only one seeing this type of a problem and I remember
> seeing a thread about it a while back.  Mark Millard reported it, and
> someone has tracked it down to some register getting (unexpectedly)
> clobbered by the open firmware.  I was hoping this had been fixed, but
> it seems that things have only gotten worse...  :(
>
> CCing Mark---maybe he will know more about this.
>
> -Krzysztof

Unfortunately, regarding powerpc and powerpc64: I have had no powerpc or
powerpc64 access since very early 2016-June and will not have any for a
few more weeks. (And, yes, the context is PowerMacs specifically.)

So I have not tested whether my personal kernel hack (which made
PowerMac G5s boot reliably in my use) helps on any more recent FreeBSD
versions. I am unlikely to get to that before sometime in October. Until
then I will not be much direct help.

I'm the one who isolated examples of memory and register corruption on
PowerMac G5s before identifying the specific hack that I use to avoid
them.

Beyond reporting the hack on the lists, I submitted a bugzilla report
documenting which change made the observed difference in boot
reliability (in the older context, anyway):

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=205458 (from 2015-Dec-20)

It describes the technique as:

> The change is in ofw_sprg_prepare of sys/powerpc/ofw/ofw_machdep.c and
> could look something like (presented in a form to show
> new/PowerMacG5-Specific code and old general code):
>
> #ifdef POWERMAC_G5_SPECIFIC_BUILD
> 	__asm __volatile("mfsprg0 %0\n\t"
> 			 "mtsprg1 %1\n\t"
> 			 "mtsprg2 %2\n\t"
> 			 "mtsprg3 %3\n\t"
> 			 : "=&r"(ofw_sprg0_save)
> 			 : "r"(ofmsr[2]),
> 			 "r"(ofmsr[3]),
> 			 "r"(ofmsr[4]));
> #else
> // The historical code:
> 	__asm __volatile("mfsprg0 %0\n\t"
> 			 "mtsprg0 %1\n\t"
> 			 "mtsprg1 %2\n\t"
> 			 "mtsprg2 %3\n\t"
> 			 "mtsprg3 %4\n\t"
> 			 : "=&r"(ofw_sprg0_save)
> 			 : "r"(ofmsr[1]),
> 			 "r"(ofmsr[2]),
> 			 "r"(ofmsr[3]),
> 			 "r"(ofmsr[4]));
> #endif
>
> In other words: for PowerMac G5's omit the mtsprg0 from ofmsr[1]:
> leave the register as it already is instead of resetting it. The value
> in ofmsr[1] is inappropriate to the context. I deliberately kept the
> change minimal and left in all other code related to the register.

All the evidence for this hack is observational. I have never found a
reasonable way to determine what Apple's Open Firmware does with the
register involved, or in what contexts. I wish I had better evidence for
what goes wrong without the hack. With only this type of evidence it
remains purely a hack for now, even if there turns out to be a
theory-of-operation justification (none is known yet).

As for the extent of the observations: while isolating this I went
through well over 10,000 failing boots (spread over months, although not
as continuous activity). Frequently I had to try booting over a dozen
times in a row before one would make it through, which is part of why
the total is so large. Since applying the hack I have not had any such
failing boots, but I also boot far less often because I no longer need
to force reboots. (I always buildworld buildkernel from source, and my
source has the hack.)

I have no evidence about the hack from after early 2016-June.

The lists have more information from while I was investigating the
issue, such as the memory and register corruptions that I observed
before isolating the small change. But going through those notes in any
detail is a mess, and I am not likely to do it without a strong
motivation.

I have no evidence that the change would be appropriate outside a
PowerMac G5 at all. This alone would keep FreeBSD from adopting it in a
generic build (even if a PowerMac G5 theory-of-operation justification
were known). The submission only suggested having a pre-made hook for
manually building from source for a PowerMac G5.

Part of the issue is that I do not know a way to identify the context as
a PowerMac G5 context without use of openfirmware. Any use of
openfirmware to figure that out would re-create the problem as far as I
can tell. It appears that the build needs to be PowerMac G5 specific to
avoid the problem.

I will note that I have never needed or used the hack on PowerMac G4s or
a PowerMac G3. But, again, my evidence ends in early 2016-June.

===
Mark Millard
markmi at dsl-only.net



