Date:      Sun, 12 Oct 2014 17:53:49 -0700
From:      Mark Millard <markmi@dsl-only.net>
To:        FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, Nathan Whitehorn <nwhitehorn@freebsd.org>
Cc:        Justin Hibbits <chmeeedalf@gmail.com>
Subject:   My PowerMac G5's no longer crash at boot: PowerMac G5 specific ofwcall changes with justifying evidence
Message-ID:  <76F704FD-BB74-4439-8318-DB4C167B420F@dsl-only.net>

NOTE: I make no claim that any of the below hacks for ofwcall are
appropriate code for FreeBSD's general context. I only claim that they
seem to make the specific PowerMac G5 problem go away, give solid
evidence for at least some of what is going on (justifying the
investigative and testing hacks) and so give evidence for an
appropriate, more general FreeBSD solution.


The big issue is: The PowerMac G5 openfirmware does not always preserve
the %r1 value (the stack pointer contents) that it is initially given,
at least when the early "before copyright" crash problem is happening
but possibly other times as well.

I had the following investigative code in ofwcall, snapshotting the
value of %r1 before and after openfirmware's code is used:

 	lis	%r4,openfirmware_entry@ha
 	ld	%r4,openfirmware_entry@l(%r4)
...
 	mr   %r17,%r1 /* ADDED HACK TO RECORD %r1 before... */
 	/* Finally, branch to OF */
 	mtctr	%r4
 	bctrl
 	mr   %r18,%r1 /* ADDED HACK TO RECORD %r1 after... */

With that in place, the DDB "show registers" output that I'd hacked in
for the crash shows these values for r17 and r18, instead of the zeros
that are otherwise always displayed, in addition to what show registers
has always shown for r1.

The results were like the following example for every such crash:

r17 = 0xC31400 ofwstk+0xfe0
r18 = 0xd24450
r1  = 0xd24450
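
The mismatch can be quantified with a little arithmetic, sketched here
in C. The 0xfe0 offset comes from the ofwstk+0xfe0 symbol shown for r17;
the 4096-byte stack size (ASSUMED_OFWSTKSZ) is an assumption that is not
stated in this message, and the helper names are hypothetical.

```c
#include <stdint.h>

#define R1_BEFORE		0xC31400ULL	/* r17: ofwstk+0xfe0 */
#define R1_AFTER		0xd24450ULL	/* r18, same as the final r1 */
#define R1_OFFSET		0xfe0ULL	/* from the ofwstk+0xfe0 symbol */
#define ASSUMED_OFWSTKSZ	0x1000ULL	/* ASSUMPTION: 4096-byte stack */

/* Hypothetical helper: start address of ofwstk, derived from r17. */
static uint64_t
ofwstk_base(void)
{
	return (R1_BEFORE - R1_OFFSET);
}

/* Hypothetical helper: nonzero if sp lies within the assumed stack. */
static int
sp_in_ofwstk(uint64_t sp)
{
	uint64_t base = ofwstk_base();

	return (sp >= base && sp <= base + ASSUMED_OFWSTKSZ);
}
```

With these values ofwstk_base() is 0xC30420, and the returned %r1 is
0xF3050 bytes above the value passed in, so sp_in_ofwstk(R1_AFTER) is
false under the assumed size while sp_in_ofwstk(R1_BEFORE) is true.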

Because of that %r1 value the later code such as:

 	/* Reload stack pointer and MSR from the OFW stack */
 	ld	%r6,24(%r1)
 	ld	%r2,16(%r1)
 	ld	%r1,8(%r1)

gets garbage-in/garbage-out results, including %r6 holding values like
0xbc0568 instead of the saved MSR value that is to be restored later:
0x9000000000001032.
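
To show how different those two %r6 values are, here is a small C sketch
that tests a few MSR bits. The bit positions are the standard 64-bit
PowerPC ones; the MSR_* macro names and msr_looks_sane() are hypothetical
and only for illustration, not something ofwcall does.

```c
#include <stdint.h>

/* Selected MSR bits (64-bit PowerPC); macro names are illustrative. */
#define MSR_SF	0x8000000000000000ULL	/* 64-bit mode */
#define MSR_HV	0x1000000000000000ULL	/* hypervisor state */
#define MSR_ME	0x0000000000001000ULL	/* machine check enable */
#define MSR_IR	0x0000000000000020ULL	/* instruction relocation */
#define MSR_DR	0x0000000000000010ULL	/* data relocation */
#define MSR_RI	0x0000000000000002ULL	/* recoverable interrupt */

/*
 * Hypothetical check: does v have 64-bit mode and both address
 * translations enabled, as the saved kernel MSR above does?
 */
static int
msr_looks_sane(uint64_t v)
{
	const uint64_t required = MSR_SF | MSR_IR | MSR_DR;

	return ((v & required) == required);
}
```

For the two values above, msr_looks_sane(0x9000000000001032) is true
(that value is SF|HV|ME|IR|DR|RI), while msr_looks_sane(0xbc0568) is
false: SF and DR are both clear.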

So one PowerMac G5 specific hack involved in my working-boots context is
to force the original %r1 value to be used (based on %r17 being a
before-call copy, similar to the above):

 	ld	%r6,24(%r17)
 	ld	%r2,16(%r17)
 	ld	%r1,8(%r17)
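
The effect of reloading through the before-call copy can be modelled in
plain C. This is only a user-space sketch: struct ofw_frame,
fake_ofw_call() and reload_msr() are hypothetical names, and the model
assumes the firmware leaves %r17 alone, which the results above suggest
but do not prove.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the save area that ofwcall reloads from. */
struct ofw_frame {
	uint64_t slot0;		/* offset 0 */
	uint64_t saved_sp;	/* offset 8:  reloaded into %r1 */
	uint64_t saved_toc;	/* offset 16: reloaded into %r2 */
	uint64_t saved_msr;	/* offset 24: reloaded into %r6 */
};

/* Model of "ld rX,off(base)". */
static uint64_t
load64(const void *base, size_t off)
{
	uint64_t v;

	memcpy(&v, (const unsigned char *)base + off, sizeof(v));
	return (v);
}

/*
 * Stand-in for the firmware call: ignores the stack pointer it was
 * given and hands back an unrelated one, as seen in the crashes.
 */
static const void *
fake_ofw_call(const void *sp_in, const void *unrelated)
{
	(void)sp_in;
	return (unrelated);
}

/* Returns the MSR value that the reload would pick up. */
static uint64_t
reload_msr(const struct ofw_frame *frame, const void *unrelated,
    int use_saved_copy)
{
	const void *r17 = frame;	/* models "mr %r17,%r1" */
	const void *r1 = fake_ofw_call(frame, unrelated);

	return (load64(use_saved_copy ? r17 : r1, 24));
}
```

With a frame whose saved_msr is 0x9000000000001032, reload_msr() returns
that value when use_saved_copy is nonzero, and whatever happens to sit
24 bytes into the unrelated buffer otherwise.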

But the exception report from DDB has had problems, in part because
sprg0 still has the openfirmware value at the time even though the
exception is after openfirmware returned (the wrong value ends up in the
register filled by GET_CPUINFO(<register>)). So I hacked in a
before-exception restore of FreeBSD's sprg0 inside ofwcall to make the
exception handler code have that much FreeBSD context available at the
exception (if it occurs, anyway). This was really just to help with
information gathering, although I've not tested only having the %r17
changes.

So overall, the PowerMac G5 specific hacking changes the ofwcall code as
follows (based on what was reported above):

root@FBSDG5M1:~ # svnlite diff /usr/src/sys/powerpc/ofw/ofwcall64.S
Index: /usr/src/sys/powerpc/ofw/ofwcall64.S
===================================================================
--- /usr/src/sys/powerpc/ofw/ofwcall64.S	(revision 272558)
+++ /usr/src/sys/powerpc/ofw/ofwcall64.S	(working copy)
@@ -52,6 +52,12 @@
 GLOBAL(rtas_entry)
 	.llong	0			/* RTAS entry point */
 
+ /* HACK: part of having sprg0 in place for trap */
+ofwsprg0save:
+	.space	8 /* sizeof(register_t) */
+GLOBAL(ofw_sprg0_save)
+	.llong	0
+
 /*
  * Open Firmware Real-mode Entry Point. This is a huge pain.
  */
@@ -97,6 +103,10 @@
 	lis	%r4,openfirmware_entry@ha
 	ld	%r4,openfirmware_entry@l(%r4)
 
+	/* HACK: part of having FreeBSD's sprg0 in place for the exception problem */
+	lis	%r14,ofw_sprg0_save@ha
+	ld	%r14,ofw_sprg0_save@l(%r14)
+
 	/*
 	 * Set the MSR to the OF value. This has the side effect of disabling
 	 * exceptions, which is important for the next few steps.
@@ -123,14 +133,27 @@
 	stw	%r5,4(%r1)
 	stw	%r5,0(%r1)
 
+	/* HACK: part of having FreeBSD's sprg0 in place for the exception problem */
+	lis	%r6,ofwsprg0save@ha
+	std	%r14,ofwsprg0save@l(%r6)
+
+	/* HACK: part of IGNORING the later %r1 value from openfirmware */
+	mr	%r17,%r1
+
 	/* Finally, branch to OF */
 	mtctr	%r4
 	bctrl
 
+	/* HACK: part of having FreeBSD's sprg0 in place for the exception problem */
+	lis	%r6,ofwsprg0save@ha
+	ld	%r6,ofwsprg0save@l(%r6)
+	mtsprg0	%r6
+
 	/* Reload stack pointer and MSR from the OFW stack */
-	ld	%r6,24(%r1)
-	ld	%r2,16(%r1)
-	ld	%r1,8(%r1)
+	/* HACKED to ignore the %r1 value that results from openfirmware's call */
+	ld	%r6,24(%r17)
+	ld	%r2,16(%r17)
+	ld	%r1,8(%r17)
 
 	/* Now set the real MSR */
 	mtmsrd	%r6

This results in no crashes happening so far in my testing, not even on
the 16 GByte RAM machine that crashed so much.

NOTE: ofw_machdep.c was changed to use "extern register_t
ofw_sprg0_save;" to match the above.

I still have ps3 disabled in GENERIC64 so that I can also have the sc
options in GENERIC64. And the DDB and GDB options are still present as
well.

And I still have my hack to force a DDB script that does show registers
and shows the ofwcall history information that I hacked in, even for the
very early crashes before input is possible. Not that I'm now getting
such executions of the script. (A before possible-crash backtrace is
also shown by the added code. That still shows up.)

I'll probably next switch to reverting the DDB related code changes and
to removing the DDB/GDB options and see how that goes.


===
Mark Millard
markmi at dsl-only.net



