Date: Sun, 12 Oct 2014 17:53:49 -0700
From: Mark Millard <markmi@dsl-only.net>
To: FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, Nathan Whitehorn <nwhitehorn@freebsd.org>
Cc: Justin Hibbits <chmeeedalf@gmail.com>
Subject: My PowerMac G5's no longer crash at boot: PowerMac G5 specific ofwcall changes with justifying evidence
Message-ID: <76F704FD-BB74-4439-8318-DB4C167B420F@dsl-only.net>

NOTE: I make no claim that any of the below hacks for ofwcall are appropriate code for FreeBSD's general context. I only claim that it seems to make the specific PowerMac G5 problem go away, gives solid evidence for at least some of what is going on (justifying the investigative and testing hacks) and so gives evidence for an appropriate, more general FreeBSD solution.

The big issue is: The PowerMac G5 openfirmware does not always preserve the %r1 value (the stack pointer contents) that it is initially given, at least when the early "before copyright" crash problem is happening but possibly other times as well.

I had the following investigative code in ofwcall, snapshotting the value of %r1 before and after openfirmware's code is used:

	lis %r4,openfirmware_entry@ha
	ld %r4,openfirmware_entry@l(%r4)
	...
	mr %r17,%r1 /* ADDED HACK TO RECORD %r1 before...

	/* Finally, branch to OF */
	mtctr %r4
	bctrl

	mr %r18,%r1 /* ADDED HACK TO RECORD %r1 after...

then the DDB show registers from the crash that I'd hacked in would show these values instead of the zeros they otherwise always display, in addition to what the show registers has always shown for r1. The results were like the following example for every such crash:

	r17 = 0xC31400 ofwstk+0xfe0
	r18 = 0xd24450
	r1 = 0xd24450

Because of that %r1 value the later code such as:

	/* Reload stack pointer and MSR from the OFW stack */
	ld %r6,24(%r1)
	ld %r2,16(%r1)
	ld %r1,8(%r1)

gets garbage-in/garbage-out results, including %r6 being values like 0xbc0568 instead of the saved MSR value that is to be restored later: 0x9000000000001032.

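As an illustration only (this is not kernel code), the effect of reloading from a clobbered %r1 can be modeled with a toy memory map in Python. The addresses and the two %r6 values come from the report above; KERNEL_SP and KERNEL_TOC are hypothetical placeholders, and treating the slot at offset 16 as the TOC pointer is an assumption based on the usual role of %r2.

```python
# Toy model of the post-call reload sequence in ofwcall (illustration only).
# Memory is a dict from address to 64-bit value; absent addresses read as 0.

SAVED_MSR = 0x9000000000001032   # MSR value quoted in the report
GOOD_R1 = 0xC31400               # %r1 before the call (the %r17 snapshot)
BAD_R1 = 0xD24450                # %r1 after the call (the %r18 snapshot)

# Hypothetical placeholders: the report does not give these values.
KERNEL_SP = 0x1000
KERNEL_TOC = 0x2000


def reload_from(memory, base):
    """Mimic: ld %r6,24(base); ld %r2,16(base); ld %r1,8(base)."""
    r6 = memory.get(base + 24, 0)
    r2 = memory.get(base + 16, 0)
    r1 = memory.get(base + 8, 0)
    return r6, r2, r1


memory = {
    # Frame saved on the OFW stack before the call.
    GOOD_R1 + 8: KERNEL_SP,
    GOOD_R1 + 16: KERNEL_TOC,
    GOOD_R1 + 24: SAVED_MSR,
    # Whatever happens to sit near the address openfirmware left in %r1;
    # 0xbc0568 is the kind of value the report saw land in %r6.
    BAD_R1 + 24: 0xBC0568,
}

good = reload_from(memory, GOOD_R1)
bad = reload_from(memory, BAD_R1)
print("reload via preserved base:", [hex(v) for v in good])
print("reload via clobbered base:", [hex(v) for v in bad])
```

Only the load that uses the preserved base recovers the saved MSR; the load through the clobbered base hands back unrelated memory contents, which is the garbage-in/garbage-out behavior described above.
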
So one PowerMac G5 specific hack involved in my working-boots context is to force the original %r1 value to be used (based on %r17 being a before-call copy, similar to the above):

	ld %r6,24(%r17)
	ld %r2,16(%r17)
	ld %r1,8(%r17)

But the exception report from DDB has had problems in part because sprg0 still has the openfirmware value at the time even though the exception is after openfirmware returned (the wrong value results in the register for GET_CPUINFO(<register>)). So I hacked in a before-exception restore of FreeBSD's sprg0 inside ofwcall to make the exception handler code have that much FreeBSD context available at the exception (if it occurs, anyway). This was really just to help with information gathering, although I've not tested only having the %r17 changes.

So overall, the PowerMac G5 specific hacking changes the ofwcall code as follows (based on what was reported above):

root@FBSDG5M1:~ # svnlite diff /usr/src/sys/powerpc/ofw/ofwcall64.S
Index: /usr/src/sys/powerpc/ofw/ofwcall64.S
===================================================================
--- /usr/src/sys/powerpc/ofw/ofwcall64.S	(revision 272558)
+++ /usr/src/sys/powerpc/ofw/ofwcall64.S	(working copy)
@@ -52,6 +52,12 @@
 GLOBAL(rtas_entry)
 	.llong 0 /* RTAS entry point */
 
+	/* HACK: part of having sprg0 in place for trap */
+ofwsprg0save:
+	.space 8 /* sizeof(register_t) */
+GLOBAL(ofw_sprg0_save)
+	.llong 0
+
 /*
  * Open Firmware Real-mode Entry Point. This is a huge pain.
  */
@@ -97,6 +103,10 @@
 	lis %r4,openfirmware_entry@ha
 	ld %r4,openfirmware_entry@l(%r4)
 
+	/* HACK: part of having FreeBSD's sprg0 in place for the exception problem */
+	lis %r14,ofw_sprg0_save@ha
+	ld %r14,ofw_sprg0_save@l(%r14)
+
 	/*
 	 * Set the MSR to the OF value. This has the side effect of disabling
 	 * exceptions, which is important for the next few steps.
@@ -123,14 +133,27 @@
 	stw %r5,4(%r1)
 	stw %r5,0(%r1)
 
+	/* HACK: part of having FreeBSD's sprg0 in place for the exception problem */
+	lis %r6,ofwsprg0save@ha
+	std %r14,ofwsprg0save@l(%r6)
+
+	/* HACK: part of IGNORING the later %r1 value from openfirmware */
+	mr %r17,%r1
+
 	/* Finally, branch to OF */
 	mtctr %r4
 	bctrl
 
+	/* HACK: part of having FreeBSD's sprg0 in place for the exception problem */
+	lis %r6,ofwsprg0save@ha
+	ld %r6,ofwsprg0save@l(%r6)
+	mtsprg0 %r6
+
 	/* Reload stack pointer and MSR from the OFW stack */
-	ld %r6,24(%r1)
-	ld %r2,16(%r1)
-	ld %r1,8(%r1)
+	/* HACKED to ignore the %r1 value that results from openfirmware's call */
+	ld %r6,24(%r17)
+	ld %r2,16(%r17)
+	ld %r1,8(%r17)
 
 	/* Now set the real MSR */
 	mtmsrd %r6

This results in no crashes happening so far in my testing, not even on the 16 GByte RAM machine that crashed so much.

NOTE: ofw_machdep.c was changed to use "extern register_t ofw_sprg0_save;" to match the above.

I still have ps3 disabled in GENERIC64 so that I can also have the sc options in GENERIC64. And the DDB and GDB options are still present as well.

And I still have my hack to force a DDB script that does show registers and shows the ofwcall history information that I hacked in, even for the very early crashes before input is possible. Not that I'm now getting such executions of the script. (A before possible-crash backtrace is also shown by the added code. That still shows up.)

I'll probably next switch to reverting the DDB related code changes and to removing the DDB/GDB options and see how that goes.

===
Mark Millard
markmi at dsl-only.net