Date: Fri, 10 Oct 2014 14:20:26 -0700
From: Mark Millard <markmi@dsl-only.net>
To: FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Cc: Justin Hibbits <chmeeedalf@gmail.com>
Subject: A little new before-Copyright-notice/ofwcall crash information... [Still no solution, just more information]
Message-ID: <477A81CF-3222-4462-B25D-F46F0AA09D3B@dsl-only.net>

I was experimenting with trying to get more information on the "before Copyright notice"/ofwcall PowerMac G5 hangs and accidentally got better information than I expected. (At least if the "show registers" output is to be believed for SRR0.)

First I'll give the results and what they refer to. Then how I got them.

As part of the experiments I added isync instructions from just after the ofwcall's bctrl through just after the mtmsrd, just to prove that the same (relative) instruction position would be reported with or without them:

> Index: /usr/src/sys/powerpc/ofw/ofwcall64.S
> ===================================================================
> --- /usr/src/sys/powerpc/ofw/ofwcall64.S      (revision 272558)
> +++ /usr/src/sys/powerpc/ofw/ofwcall64.S      (working copy)
> @@ -128,13 +128,22 @@
>          bctrl
>  
>          /* Reload stack pointer and MSR from the OFW stack */
> +        isync
> +        isync
>          ld      %r6,24(%r1)
> +        isync
> +        isync
>          ld      %r2,16(%r1)
> +        isync
> +        isync
>          ld      %r1,8(%r1)
> +        isync
> +        isync
>  
>          /* Now set the real MSR */
>          mtmsrd  %r6
>          isync
> +        isync
>  
>          /* Sign-extend the return value from OF */
>          extsw   %r3,%r3

The result that I got was that the last isync above is where SRR0 is reported as pointing when the trap happens. (No multiple-fault problem showed up, so it did not point into the exception handling code.)

With all the extra isyncs removed (the normal code having only one isync in that area, the one just after the mtmsrd), the extsw instruction is in that position and it is what SRR0 pointed to. So that aspect ended up confirmed.

The version of the code with the extra isyncs should have forced any exceptions from the ld instructions (and before) to happen before the mtmsrd was executed. As near as I can tell the implication would be that the mtmsrd itself is what is having an exception happen.

SRR1:  0x1000000040101120
lr:    0x8a64e8  .ofwcall+0xa8 (i.e., just after the bctrl in both types of code)

From all this I expect that ofwcall returned before the exception happened.

ctr:   0xff846d78
cr:    0x22000022
xer:   0

I expect that the reported dar and dsisr are garbage (probably the wrong kind of trap to have them initialized). But they were listed as:

dar:   0x810248fbc10250fb
dsisr: 0xe102587f8802a648

I've no clue if openfirmware was well behaved about register values as of when it returned to ofwcall. r6 in the list below does not look good to me: it is a little more than r1's value, suggesting a stack address is being displayed instead of an MSR value. But by the time of the mtmsrd %r6 execution r1 should no longer have the OFW stack address but one for the kernel at the time. (Presumes openfirmware was well behaved.)

r0:      0
r1:      0xbc0558
r2:      0xe18dd0  MP_ncpus
r3:      0xd24450
r4:      0x8a64e8  .ofwcall+0xa8 (specific address could depend on other variations in builds)
r5:      0
r6:      0xbc0568
r7:      0xe5f63d  ofw_real_mode
r8:      0x1
r9:      0xe5f63d  ofw_name_history_+0x15 (part of my crash information dumping hacks)
r10:     0x1c35ec0
r11:     0
r12:     0x22000022
r13:     0xddaf29  thread0
r14-r19: 0
r20:     0x10f6000
r21:     0x4
r22:     0x1801bd4
r23:     0x1803a28
r24:     0xc000000000008760
r25:     0xcd4a98
r26:     0xcf6758
r27:     0xcd4a98
r28:     0xe62690  emergency_buffer.7721+0x8
r29:     0x1874d0  ofw_name_history_pos (part of my information dumping hacks)
r30:     0x9000000000001032
r31:     0xc0000000000084a0

[ofw_name_history is how I earlier found the specific ofwcall that did not return all the way without getting an associated exception. The ofw_name_history content is dumped by my DDB script, which I forced to exist and which runs when the exception happens.]

Now for the odd part of how I got to the above happening.

Given the multiple-fault problem that was involved, I decided to try to get some information on which type(s) of exception(s) were occurring by making the PC values distinct: duplicating the code that contained the address being reported so each use had its own copy.

So I ended up with not just realtrap but realtrap1, realtrap2, and realtrap3, for example, that look like:

> +realtrap1:
> +/* Test whether we already had PR set */
> +        mfsrr1  %r1
> +        mtcr    %r1
> +        mfsprg1 %r1                     /* restore SP (might have been
> +                                           overwritten) */
> +        bf      17,rt1_k_trap           /* branch if PSL_PR is false */
> +        GET_CPUINFO(%r1)
> +        ld      %r1,PC_CURPCB(%r1)
> +        mr      %r27,%r28               /* Save LR, r29 */
> +        mtsprg2 %r29
> +        bl      restore_kernsrs         /* enable kernel mapping */
> +        mfsprg2 %r29
> +        mr      %r28,%r27
> +        FRAME_SETUP(PC_TEMPSAVE)
> +        ba      trapagain
> +rt1_k_trap:
> +        FRAME_SETUP(PC_TEMPSAVE)
> +        ba      trapagain

Since the original reports were for an address inside the FRAME_SETUP code, I needed distinct copies of FRAME_SETUP to have unique PCs for the different uses.

(I could have used realtrap instead of having realtrap3 but ended up with realtrap unused.)

The trapagain code was after the reported fault place and so was not duplicated.

generictrap also got its own copy of such code (no label).

That left alitrap as the only use of the original s_trap code. (It is the only bla-style use of s_trap in the original code and so I left that alone.)

After these changes I got the Show Registers results that I reported above instead of SRR0 values from one of the exception handler paths. (That is not what I expected.)

The detailed changes to trap_subr64.S were:

> Index: /usr/src/sys/powerpc/aim/trap_subr64.S
> ===================================================================
> --- /usr/src/sys/powerpc/aim/trap_subr64.S    (revision 272558)
> +++ /usr/src/sys/powerpc/aim/trap_subr64.S    (working copy)
> @@ -583,7 +583,7 @@
>          /* Try to detect a kernel stack overflow */
>          mfsrr1  %r31
>          mtcr    %r31
> -        bt      17,realtrap             /* branch is user mode */
> +        bt      17,realtrap1            /* branch is user mode */
>          mfsprg1 %r31                    /* get old SP */
>          clrrdi  %r31,%r31,12            /* Round SP down to nearest page */
>          sub.    %r30,%r31,%r30          /* SP - DAR */
> @@ -590,7 +590,7 @@
>          bge     1f
>          neg     %r30,%r30               /* modulo value */
>  1:      cmpldi  %cr0,%r30,4096          /* is DAR within a page of SP? */
> -        bge     %cr0,realtrap           /* no, too far away. */
> +        bge     %cr0,realtrap2          /* no, too far away. */
>  
>          /* Now convert this DSI into a DDB trap.  */
>          GET_CPUINFO(%r1)
> @@ -628,6 +628,68 @@
>          mr      %r28,%r27
>          ba      s_trap
>  
> +realtrap1:
> +/* Test whether we already had PR set */
> +        mfsrr1  %r1
> +        mtcr    %r1
> +        mfsprg1 %r1                     /* restore SP (might have been
> +                                           overwritten) */
> +        bf      17,rt1_k_trap           /* branch if PSL_PR is false */
> +        GET_CPUINFO(%r1)
> +        ld      %r1,PC_CURPCB(%r1)
> +        mr      %r27,%r28               /* Save LR, r29 */
> +        mtsprg2 %r29
> +        bl      restore_kernsrs         /* enable kernel mapping */
> +        mfsprg2 %r29
> +        mr      %r28,%r27
> +        FRAME_SETUP(PC_TEMPSAVE)
> +        ba      trapagain
> +rt1_k_trap:
> +        FRAME_SETUP(PC_TEMPSAVE)
> +        ba      trapagain
> +
> +
> +realtrap2:
> +/* Test whether we already had PR set */
> +        mfsrr1  %r1
> +        mtcr    %r1
> +        mfsprg1 %r1                     /* restore SP (might have been
> +                                           overwritten) */
> +        bf      17,rt2_k_trap           /* branch if PSL_PR is false */
> +        GET_CPUINFO(%r1)
> +        ld      %r1,PC_CURPCB(%r1)
> +        mr      %r27,%r28               /* Save LR, r29 */
> +        mtsprg2 %r29
> +        bl      restore_kernsrs         /* enable kernel mapping */
> +        mfsprg2 %r29
> +        mr      %r28,%r27
> +        FRAME_SETUP(PC_TEMPSAVE)
> +        ba      trapagain
> +rt2_k_trap:
> +        FRAME_SETUP(PC_TEMPSAVE)
> +        ba      trapagain
> +
> +realtrap3:
> +/* Test whether we already had PR set */
> +        mfsrr1  %r1
> +        mtcr    %r1
> +        mfsprg1 %r1                     /* restore SP (might have been
> +                                           overwritten) */
> +        bf      17,rt3_k_trap           /* branch if PSL_PR is false */
> +        GET_CPUINFO(%r1)
> +        ld      %r1,PC_CURPCB(%r1)
> +        mr      %r27,%r28               /* Save LR, r29 */
> +        mtsprg2 %r29
> +        bl      restore_kernsrs         /* enable kernel mapping */
> +        mfsprg2 %r29
> +        mr      %r28,%r27
> +        FRAME_SETUP(PC_TEMPSAVE)
> +        ba      trapagain
> +rt3_k_trap:
> +        FRAME_SETUP(PC_TEMPSAVE)
> +        ba      trapagain
> +
> +
>  /*
>   * generictrap does some standard setup for trap handling to minimize
>   * the code that need be installed in the actual vectors. It expects
> @@ -666,6 +728,20 @@
>          mfsrr1  %r31
>          mtcr    %r31
>  
> +        bf      17,gt_k_trap            /* branch if PSL_PR is false */
> +        GET_CPUINFO(%r1)
> +        ld      %r1,PC_CURPCB(%r1)
> +        mr      %r27,%r28               /* Save LR, r29 */
> +        mtsprg2 %r29
> +        bl      restore_kernsrs         /* enable kernel mapping */
> +        mfsprg2 %r29
> +        mr      %r28,%r27
> +        FRAME_SETUP(PC_TEMPSAVE)
> +        ba      trapagain
> +gt_k_trap:
> +        FRAME_SETUP(PC_TEMPSAVE)
> +        ba      trapagain
> +
>  s_trap:
>          bf      17,k_trap               /* branch if PSL_PR is false */
>          GET_CPUINFO(%r1)
> @@ -785,7 +861,7 @@
>          ld      %r31,(PC_DBSAVE+CPUSAVE_R31)(%r1)
>          mtsprg3 %r31                    /* SPRG3 was clobbered by FRAME_LEAVE */
>          mfsprg1 %r1
> -        b       realtrap
> +        b       realtrap3
>  dbleave:
>          FRAME_LEAVE(PC_DBSAVE)
>          rfid
>

Reverting this one file to the original code goes back to the historical exception-in-exception-handler report by DDB's Show Registers.

===
Mark Millard
markmi at dsl-only.net
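
A side note on reading the register values reported above: the SRR1 value and the suspicious r6 value can be checked mechanically. The following is only a rough sketch, not anything taken from the kernel or DDB. The helper name decode_msr and the MSR_BITS table are hypothetical, and the bit masks are the 64-bit MSR bit assignments as I recall them from FreeBSD's sys/powerpc/include/psl.h, so they should be checked against that header before anything is concluded from the output.

```python
# Hypothetical helper for splitting an MSR-style value into named bits.
# ASSUMPTION: these masks match the 64-bit PSL_* definitions in FreeBSD's
# sys/powerpc/include/psl.h; verify against the header before trusting them.
MSR_BITS = [
    ("SF",  0x8000000000000000),  # 64-bit mode
    ("HV",  0x1000000000000000),  # hypervisor state
    ("EE",  0x8000),              # external interrupts enabled
    ("PR",  0x4000),              # problem (user) state
    ("FP",  0x2000),              # floating point available
    ("ME",  0x1000),              # machine check enabled
    ("FE0", 0x0800),
    ("SE",  0x0400),
    ("BE",  0x0200),
    ("FE1", 0x0100),
    ("IR",  0x0020),              # instruction relocation (translation) on
    ("DR",  0x0010),              # data relocation (translation) on
    ("RI",  0x0002),              # recoverable interrupt
    ("LE",  0x0001),              # little-endian mode
]


def decode_msr(value):
    """Return (names of the table bits that are set, bits not in the table)."""
    names = [name for name, mask in MSR_BITS if value & mask]
    known = 0
    for _, mask in MSR_BITS:
        known |= mask
    return names, value & ~known


if __name__ == "__main__":
    # Values copied from the register report above.
    for label, value in [("SRR1", 0x1000000040101120),
                         ("r30", 0x9000000000001032),
                         ("r6", 0xbc0568)]:
        names, rest = decode_msr(value)
        print(label, hex(value), names, "other bits:", hex(rest))

    # r6 compared with the reported r1, to show how close it is to the stack.
    print("r6 - r1 =", hex(0xbc0568 - 0xbc0558))
```

If the masks are right, the reported SRR1 splits into HV, ME, FE1 and IR with 0x40100000 left over. In SRR1 such leftover bits are normally interrupt-specific status rather than saved MSR state, but I have not verified which interrupt type would set exactly these. r30 decodes entirely into named bits, while r6 is exactly 0x10 above the reported r1, which fits the suspicion that it holds a stack address rather than an MSR value.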