Date:      Tue, 14 Oct 2014 15:18:55 -0700
From:      Mark Millard <markmi@dsl-only.net>
To:        Nathan Whitehorn <nwhitehorn@freebsd.org>
Cc:        Justin Hibbits <chmeeedalf@gmail.com>, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   Re: My PowerMac G5's no longer crash at boot: PowerMac G5 specific ofwcall changes with justifying evidence [important typos fixed]
Message-ID:  <3D4A76B3-431A-4C94-8747-70369A8A1764@dsl-only.net>
In-Reply-To: <543D5ACD.20901@freebsd.org>
References:  <76F704FD-BB74-4439-8318-DB4C167B420F@dsl-only.net>	<543B3828.8070806@freebsd.org>	<9D9B0372-8D8F-4153-85B5-40066206EF67@dsl-only.net>	<379AA7FC-98C9-48B9-92BB-60E134817AF1@dsl-only.net>	<C614025F-6455-4929-8468-462E76079274@dsl-only.net>	<A2AB9066-259B-4B7D-BDDC-D03AE5827E13@dsl-only.net> <CAHSQbTCKi_MBhERh6d=kX2y-=%2B2OzqpGM%2BN=ZEShi-kX2r8NPQ@mail.gmail.com> <543D5ACD.20901@freebsd.org>

For openfirmware: is %r3 on return anything more than a failed vs.
not-failed flag with a particular failed-value? Is there any way to
validate that %r3 values for non-failure look reasonable vs. not
looking reasonable? (For all I know %r3 could also be corrupt.)

I do not have any documentation for the PowerMac G5 openfirmware API
that is in use or the associated ABI as far as I remember. I do not
know if it strictly followed Darwin's/Mac OS X's ABI on PowerMac G5's
vs. if there was some conversion going back and forth (as there is for
FreeBSD, at least for powerpc64). For openfirmware I derive properties
from what I see in FreeBSD's code (which has to be more explicit than
when a compiler's code generation happens to match at least large parts
of an ABI directly).

As I vaguely remember, Apple did not use the TOC for Darwin's/Mac OS X's
ABI but FreeBSD does. If true I do not know what other differences
there might be (even ignoring the 32 bit vs. 64 bit issues for the
kernels). But the point would be an existence proof of at least one
difference. My understanding is that %r1 was as in FreeBSD.

I vaguely seem to remember that for Darwin/Mac OS X some register was
volatile in leaf functions but non-volatile otherwise, or at least when
nested functions were involved. And that brings to mind that the
condition code sets in cr might have had a mix of volatile and
non-volatile status despite being in one register? Did Darwin/Mac OS X
have something special for register usage for Thread-Specific Storage?
Position Independent Code? Indirect Calls? Frame Pointers?

I may have some Darwin/Mac OS X information around but I doubt that it
is complete, especially for the 64-bit ABI or for privileged contexts.
For the 32-bit ABI (non-privileged) I likely have the information about
the above possible ABI properties.

I assume that openfirmware avoids the FPU and other such --but I do not =
know. But it is privileged code.
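
One data point on this comes from the two MSR values in the report
quoted further down (0x9000000000001032 for FreeBSD and
0x1000000000003030 for ofmsr[0]). A minimal Python sketch for decoding
them follows. The bit masks are the 64-bit PowerPC MSR bits as I
understand them, so treat them as an assumption to be checked against
the architecture documentation; the names MSR_BITS and decode_msr are
made up for illustration.

```python
# Assumed MSR bit masks for 64-bit PowerPC (verify before relying on them).
MSR_BITS = [
    (0x8000000000000000, "SF"),  # 64-bit mode
    (0x1000000000000000, "HV"),  # hypervisor state
    (0x8000, "EE"),              # external interrupts enabled
    (0x4000, "PR"),              # problem (user) state
    (0x2000, "FP"),              # floating-point instructions available
    (0x1000, "ME"),              # machine check enabled
    (0x0020, "IR"),              # instruction address translation
    (0x0010, "DR"),              # data address translation
    (0x0002, "RI"),              # recoverable interrupt
]


def decode_msr(msr):
    """Return the names of the known MSR bits that are set in msr."""
    return [name for mask, name in MSR_BITS if msr & mask]


# Values copied from the quoted failure report.
freebsd_msr = 0x9000000000001032
of_msr = 0x1000000000003030

print("FreeBSD msr:", decode_msr(freebsd_msr))
print("ofmsr[0]:   ", decode_msr(of_msr))
```

If those masks are right, ofmsr[0] has FP set and SF clear, and neither
value has PR set. FP being set only says that floating-point
instructions are permitted while openfirmware runs; it does not show
that openfirmware actually uses the FPU.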

Are there any known sources of at least some of the information for the
PowerMac G5 openfirmware ABI(s)? What are good references for the
FreeBSD PowerPC ABI(s) (32 bit and 64 bit, privileged vs. not)?

[I cut off some of the older history.]

===
Mark Millard
markmi at dsl-only.net

On Oct 14, 2014, at 10:18 AM, Nathan Whitehorn <nwhitehorn at freebsd.org> wrote:

r1 *must be* preserved by the standard and for anything to work. It's
being corrupted somehow (Mark's comment about r3 is illuminating), and
if r1 is being corrupted, you can't rely on anything. I suspect it
might be an exception handling issue since it's non-deterministic, but
it's hard to tell. It could also be triggered by the way we've set up
the OF stack frame. It would be good to check if that makes sense.
-Nathan

On 10/14/14 09:53, Justin Hibbits wrote:
> Interesting.  Perhaps, instead of using %r1, and relying purely on the
> stack we use yet another (non-volatile) register to hold the MSR.
> Once we reload the MSR we can get back the saved registers, because
> the stack will be valid again.
>
> Nathan, thoughts?
>
> - Justin
>
> On Tue, Oct 14, 2014 at 9:14 AM, Mark Millard <markmi at dsl-only.net> wrote:
>> Additional notes from additional experiments... (So far from one G5.)
>>
>> I got back trace, show registers, and my openfirmware-history list
>> going for failure reporting based on explicit before vs. after tests
>> of %r1 values. (Explicit breakpoint call for unequal, being careful
>> to save/restore %r3 around the call.) I filled several registers with
>> potentially interesting values that would otherwise have had zero as
>> a value (%r15-%r19, although %r15 is redundant with %r6 currently).
>>
>> An interesting property resulted: every time %r1 had changed from
>> having the before-value (stack pointer value) %r1 instead ended up
>> with a value equal to what openfirmware put in %r3.
>>
>> And more than that: For builds with the same ofwstk position the %r3
>> value involved was fixed for the failures, for example when
>> 0x30400=ofwstk+0xfe0 (%r1 before) was reported %r3 and %r1 ended up
>> as 0xd23450 for the failures. When 0x31400=ofwstk+0xfe0: %r3 and %r1
>> ended up for failure as 0xd24450 instead. Yep: offset by the same
>> amount as ofwstk.
>>
>> And I got one example where the openfirmware %r1-value-change failure
>> was instead much later in the boot, well after pmap_bootstrapped went
>> true: It was just after the message lines...
>>
>> vgapci0: Boot video device ...
>> pcib1: <IBM CPC9X5 Hypertransport tunnel> ...
>>
>> with back trace (from OF_peer down):
>>
>> .OF_peer+0x8c
>> .cpcht_attach+0x884
>> .device_attach+0x3ac
>> .device_probe_and_attach+0x3c
>> .bus_generic_new_pass+0x12c
>> .bus_generic_new_pass+0x114
>> .bus_generic_new_pass+0x114 (yep: listed twice)
>> .bus_set_pass+0xc0
>> .root_bus_configure+0x14
>> .mi_startup+0x10c
>> btext+0xbc
>>
>> %r1 before: 0xc30400 ofwstk+0xfe0
>> %r1 after:  0xd23450
>> %r3 after:  0xd23450
>> FreeBSD msr to restore: 0x9000000000001032
>> ofmsr[0]  to restore:   0x1000000000003030
>>
>> The same after-openfirmware %r1 and %r3 values that had been showing
>> up for the before-copyright examples of ofwcall failures.
>>
>> And note that it again was a peer request. All the ofwcall-tied
>> boot-failures have been for peer requests as far as I remember.
>>
>> I later did some experiments where I had it report but not stop when
>> the after-value was different from the before-value for %r1. When
>> this happened for these types of tests it seemed to be an isolated
>> example: later calls normally have the stack pointer value still in
>> %r1 after openfirmware returns. In more detail: At most one report
>> was made for such a boot; the rest of the boot went fine. (Of course
>> to get that far my hacked ofwcall code avoids using the
>> after-openfirmware %r1 value to extract the 3 saved values to be
>> restored from the bottom of ofwstk.)
>>
>> I was not successful at using "capture on" in DDB for this early-boot
>> context. (It hangs things after the first report.) So I've been
>> limited to one screen's report and only when I have it stop at the
>> end of the report (so it does not scroll away). (No input to DDB
>> available that early.) Otherwise the information just scrolls by too
>> quickly to read any detail. Still it was useful to see that other
>> reports were not produced after the first (when there was a first).
>> (I cannot claim multiple are impossible. It just appears at least
>> infrequent.)
>>
>> I have not yet investigated making analogous powerpc/GENERIC code and
>> builds.
>>
>> Nor have I dealt with having it report more detail about the peer
>> requests that fail.
>>
>> Nor have I seen examples of what "not failing/%r1-unchanged" looks
>> like overall.
>>
>> I still have no examples of unstable/incomplete initialization(s) or
>> race condition(s) to explain why both ways can and do occur from one
>> attempt to the next --or that different peer requests in the sequence
>> can be where the problem happens.
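
For anyone wanting to check the "offset by the same amount as ofwstk"
arithmetic in the quoted report, here is a small Python sketch. The
numbers are copied from that report (the two builds with different
ofwstk positions); the names builds and shifts are only illustrative.

```python
# One entry per build: (%r1 before the openfirmware call, i.e.
# ofwstk+0xfe0, and the value found in both %r1 and %r3 after a
# failing call). Values copied from the quoted report.
builds = [
    (0x30400, 0xd23450),
    (0x31400, 0xd24450),
]


def shifts(first, second):
    """Return (change in %r1-before, change in failure value) between builds."""
    return second[0] - first[0], second[1] - first[1]


before_shift, failure_shift = shifts(builds[0], builds[1])
print("ofwstk+0xfe0 moved by:", hex(before_shift))
print("failure value moved by:", hex(failure_shift))
print("failure value tracks ofwstk:", before_shift == failure_shift)
```

Both differences come out as 0x1000, which matches the observation that
the bad %r1/%r3 value moves along with ofwstk from one build to the
next.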



