Date: Sat, 27 May 2017 00:42:19 -0700 From: Mark Millard <markmi@dsl-only.net> To: Justin Hibbits <jhibbits@FreeBSD.org>, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, freebsd-hackers@freebsd.org Subject: Re: A good backtrace from a head -r317820 powerpc random/periodic panic: execution of garbage at 0x0090a030 (in .hash section) [better bt; more] Message-ID: <62BF8E69-E7E6-4C4F-AB33-38B03E903CDA@dsl-only.net> In-Reply-To: <8C88BB6F-E747-42A1-9DDC-35EC6D865141@dsl-only.net> References: <1CE8346B-04F3-48AB-A3E9-6DF3B86B8D1A@dsl-only.net> <8C88BB6F-E747-42A1-9DDC-35EC6D865141@dsl-only.net>
next in thread | previous in thread | raw e-mail | index | archive | help
[Top post of the answer to what is wrong. I have submitted 219589 for this.] TARGET_ARCH=3Dpowerpc64 got a fix to bugzilla 205458, avoiding inappropriate restoration of openfirmware's sprg0 value. It turns out that TARGET_ARCH=3Dpowerpc needs to detect when it is running on powerpc64 and do the same thing if it is to avoid trashing memory and running unreliably on PowerMac G5's. The issue is that for powerpc64 it is inappropriate to restore the sprg0 value to its openfirmware value. This is because the FreeBSD real-mode handling is involved instead of the openfirmware's original virtual mode, making openfirware's value simply inappropriate. Quoting Nathan W. from Comment #4 of 205458: > Where this explodes is if OF uses an unmapped SLB entry. > The SLB fault handler runs in real mode and refers to the > PCPU pointer in SPRG0, which blows up the kernel. Having > a value of SPRG0 that works for the kernel is less fatal > than preserving OF's value in this case. I know that part of the code does detect the powerpc64 context vs. not and does things differently to emulate being powerpc like on powerpc64 (such as limiting RAM use as a consequence). The powerpc64 vs. not status needs to be recorded and used to control a sprg0 restoration choice: avoid restoring openfirmware's value on powerpc64; otherwise restore it. =3D=3D=3D Mark Millard markmi at dsl-only.net On 2017-May-26, at 10:29 PM, Mark Millard <markmi@dsl-only.net> wrote: [Additional information that does not need to interlace with the prior material, so see after. A somewhat better backtrace reported by ddb. And so on.] On 2017-May-26, at 7:14 PM, Mark Millard <markmi at dsl-only.net> wrote: > I lucked out and got a vmcore.9 for a random > panic that I could manage to backtrace for > one of my test builds of -r317820. It appears > that not all that much happened before it got > the panic so much context was better preserved > this time. >=20 > (I do not explore from ddb as I've had that > panic and mess up the dump just made by > replacing it. So this is a manual backtrace > from the debug.minidump=3D0 style vmcore.9 > file. objdump was used on the > /boot/kernel/kernel to find code.) >=20 > Being able to see the problem is very > sensitive to kernel memory layout. This > is why I'm sticking with -r317820 built > production style: the kind of context > the problem was first observed in. > Attempting a debug kernel build simply > did not repeat the problem for days > (vs. the usual hours for builds like > this). >=20 >=20 > So below is the backtrace: >=20 > (I do not show what trap_fatal calls: > starting with trap_fatal and going toward > larger memory addresses. . .) >=20 > [vmcore.9's > offset > in file > when no > 0x prefix] >=20 > 013e83b0 df 5e 55 50 00 8f 34 5c fa 50 05 af fa 50 05 af = |.^UP..4\.P...P..| > 0x008f3454 <trap+0x1228> mr r3,r26 > 0x008f3458 <trap+0x122c> bl 008f2030 <trap_fatal> > 0x008f345c <trap+0x1230> b 008f34c8 <trap+0x129c> >=20 > 013e83c0 fa 50 05 af fa 50 05 af fa 50 05 af fa 50 05 af = |.P...P...P...P..| > * > 013e83e0 df 5e 54 00 fa 50 05 af fa 50 05 af fa 50 05 af = |.^T..P...P...P..| > 013e83f0 fa 50 05 af fa 50 05 af 00 d1 ca ac df 5e 54 00 = |.P...P.......^T.| > 013e8400 df 5e 54 20 00 53 3a d0 fa 50 05 af fa 50 05 af |.^T = .S:..P...P..| > 013e8410 fa 50 05 af fa 50 05 af 00 d1 ca ac df 5e 54 20 = |.P...P.......^T | > 013e8420 df 5e 54 d0 00 53 3a d0 00 00 00 00 00 00 00 00 = |.^T..S:.........| > 013e8430 00 00 00 00 00 00 00 00 00 d1 ca ac 00 00 00 0f = |................| > 013e8440 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 = |................| > * > 013e8460 00 00 00 54 7f ff ff ff 00 00 00 00 ff ff ff aa = |...T............| > 013e8470 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 = |................| > * > 013e8490 00 d4 c4 5c 00 d4 c4 5c 00 17 99 dd 00 17 9a 5c = |...\...\.......\| > 013e84a0 00 00 17 9a 00 d4 c4 6c df 5e 54 ec 00 00 00 54 = |.......l.^T....T| > 013e84b0 df 5e 54 d0 7f ff ff ff 05 aa 36 c0 05 aa 39 e8 = |.^T.......6...9.| > 013e84c0 00 00 00 00 00 00 00 00 00 d1 ca ac df 5e 54 d0 = |.............^T.| > 013e84d0 df 5e 55 80 00 53 3a d0 05 91 d0 00 05 91 d3 28 = |.^U..S:........(| > 013e84e0 df 5e 55 00 00 00 00 00 00 d1 ca ac 00 00 00 0f = |.^U.............| > 013e84f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 = |................| > 013e8500 00 00 00 00 00 00 00 00 00 d4 bd ec 00 cb 98 98 = |................| > 013e8510 00 d4 c4 5c 00 d4 c4 5c 00 17 99 f8 00 17 99 f8 = |...\...\........| > 013e8520 00 00 17 99 fa 69 b8 79 00 17 9a 0a 00 00 17 99 = |.....i.y........| > 013e8530 f8 1c 2f cc df 5e 55 88 01 47 d6 c0 01 43 a2 00 = |../..^U..G...C..| > 013e8540 41 eb 30 00 0a 00 00 00 00 d2 6e 4c df 5e 55 50 = |A.0.......nL.^UP| >=20 > 013e8550 df 5e 55 80 00 8e 7d 40 df 5e 55 9c 00 00 00 28 = |.^U...}@.^U....(| > 0x008e7d28 <powerpc_interrupt+0x184> mfmsr r0 > 0x008e7d2c <powerpc_interrupt+0x188> or r0,r0,r9 > 0x008e7d30 <powerpc_interrupt+0x18c> mtmsr r0 > 0x008e7d34 <powerpc_interrupt+0x190> isync > 0x008e7d38 <powerpc_interrupt+0x194> mr r3,r25 > 0x008e7d3c <powerpc_interrupt+0x198> bl 008f222c <trap> > 0x008e7d40 <powerpc_interrupt+0x19c> lwz r11,0(r1) >=20 > 013e8560 00 00 00 00 00 00 17 9a 06 40 c2 a8 01 43 a2 00 = |.........@...C..| > 013e8570 41 eb 30 00 0a 00 00 00 00 00 00 00 00 08 10 32 = |A.0............2| >=20 > r0 r1 > 013e8580 df 5e 56 40 00 10 08 f8 00 00 00 04 df 5e 56 40 = |.^V@.........^V@| > The struct trapframe starts is 0x 013e8588 in the > vmcore.9 file. The 0x00108f8 is as shown below: >=20 > 0x001008ec <k_trap+0x118> isync > 0x001008f0 <trapagain> addi r3,r1,8 > 0x001008f4 <trapagain+0x4> bl 008e7ba4 <powerpc_interrupt> > 0x001008f8 <trapexit> mfmsr r3 > 0x001008fc <trapexit+0x4> andi. r3,r3,32767 >=20 > 013e8590 01 47 d6 c0 00 00 00 28 01 47 c6 c0 00 00 00 04 = |.G.....(.G......| > r2 r3 r4 r5 >=20 > 013e85a0 00 00 00 04 00 00 00 0f 00 00 00 00 00 d4 c0 3c = |...............<| > r6 r7 r8 r9 >=20 > 013e85b0 01 47 d6 c0 df 5e 56 80 14 3d 60 da 00 00 00 00 = |.G...^V..=3D`.....| > r10 r11 r12 r13 >=20 > 013e85c0 00 d4 bd ec 00 cb 98 98 00 d4 c4 5c 00 d4 c4 5c = |...........\...\| > r14 r15 r16 r17 >=20 > 013e85d0 00 17 99 f8 00 17 99 f8 00 00 17 99 fa 69 b8 79 = |.............i.y| > r18 r19 r20 r21 >=20 > 013e85e0 00 17 9a 0a 00 00 17 99 f8 1c 2f cc 00 00 17 9a = |........../.....| > r22 r32 r24 r25 >=20 > 013e85f0 06 40 c2 a8 01 43 a2 00 00 eb a7 80 01 47 d6 c0 = |.@...C.......G..| > r26 r27 r28 r29 >=20 > 013e8600 00 d1 ca ac df 5e 56 40 00 53 5a d0 20 00 90 44 = |.....^V@.SZ. ..D| > r30 r31 lr cr >=20 > xer ctr srr0 srr1 > 013e8610 00 00 00 00 00 00 00 00 00 90 a0 30 00 08 10 32 = |...........0...2| > Note: objdump shows no code at 0x0090a030. 0x0090a030 is > inside what objdump -x reports as the section .hash (Idx > 2). >=20 > exc dar dsisr (booke dbcer0) > 013e8620 00 00 07 00 41 eb 30 00 0a 00 00 00 01 47 c6 c0 = |....A.0......G..| > The 0x00007000 above is the framep->exc (exception code) > (program in this case). >=20 > 013e8630 00 eb a7 80 01 47 60 10 00 d1 ca ac df 5e 56 40 = |.....G`......^V@| >=20 > At this point the above does not match the below > part of the stack trace. >=20 > The lr part of struct trapframe was: 0x00535ad0 so > showing around that: >=20 > 00535ab8 <sched_affinity> stwu r1,-32(r1) > 00535abc <sched_affinity+0x4> mflr r0 > 00535ac0 <sched_affinity+0x8> stw r29,20(r1) > 00535ac4 <sched_affinity+0xc> stw r30,24(r1) > 00535ac8 <sched_affinity+0x10> stw r31,28(r1) > 00535acc <sched_affinity+0x14> stw r0,36(r1) > 00535ad0 <sched_affinity+0x18> mr r31,r1 > 00535ad4 <sched_affinity+0x1c> mr r29,r3 >=20 > Back to the stack backtrace. . . >=20 > 013e8640 df 5e 56 80 00 53 59 dc 00 17 99 f8 00 17 99 f8 = |.^V..SY.........| > 0x005359c8 <sched_add+0x18c> bl 008ea420 <spinlock_exit> > 0x005359cc <sched_add+0x190> mr r3,r28 > 0x005359d0 <sched_add+0x194> mr r4,r27 > 0x005359d4 <sched_add+0x198> mr r5,r25 > 0x005359d8 <sched_add+0x19c> bl 005356ec <tdq_add> > 0x005359dc <sched_add+0x1a0> mfsprg r9,0 >=20 > I show around 0x005356ec > from the sched_add+0x19c bl that is above > because the routine is not referenced > in the stack tracce but the above indicates > that it should have been called: > 0x005356ec <tdq_add> stwu r1,-32(r1) > 0x005356f0 <tdq_add+0x4> mflr r0 > 0x005356f4 <tdq_add+0x8> stw r28,16(r1) > 0x005356f8 <tdq_add+0xc> stw r29,20(r1) > 0x005356fc <tdq_add+0x10> stw r30,24(r1) > 0x00535700 <tdq_add+0x14> stw r31,28(r1) > 0x00535704 <tdq_add+0x18> stw r0,36(r1) > . . . >=20 > Back to the stack backtrace again. . . >=20 > 013e8650 00 00 17 99 fa 69 b8 79 00 17 9a 0a 00 00 00 04 = |.....i.y........| > 013e8660 f8 1c 2f cc 00 00 17 9a 06 40 c2 a8 01 43 a2 00 = |../......@...C..| > 013e8670 01 47 c6 c0 01 47 60 10 00 d1 b4 30 df 5e 56 80 = |.G...G`....0.^V.| >=20 > 013e8680 df 5e 56 b0 00 4a 87 8c 00 d2 5b 10 00 00 00 04 = |.^V..J....[.....| > 0x004a8780 <intr_event_schedule_thread+0xc4> mr r3,r28 > 0x004a8784 <intr_event_schedule_thread+0xc8> li r4,4 > 0x004a8788 <intr_event_schedule_thread+0xcc> bl 0053583c = <sched_add> > 0x004a878c <intr_event_schedule_thread+0xd0> lwz r9,0(r28) >=20 > 013e8690 df 5e 56 b0 00 00 17 9a 06 40 c2 a8 01 43 a2 00 = |.^V......@...C..| > 013e86a0 00 00 00 00 01 46 81 c0 00 d1 b4 30 df 5e 56 b0 = |.....F.....0.^V.| >=20 > 013e86b0 df 5e 56 f0 00 4a 97 0c 00 00 00 00 00 00 00 04 = |.^V..J..........| > 0x004a9700 <swi_sched+0xa4> bl 005000ec <critical_exit> > 0x004a9704 <swi_sched+0xa8> mr r3,r27 > 0x004a9708 <swi_sched+0xac> bl 004a86bc = <intr_event_schedule_thread> > 0x004a970c <swi_sched+0xb0> lwz r11,0(r1) >=20 > 013e86c0 df 5e 56 e0 01 47 d6 c0 01 47 d6 c0 01 45 4d 40 = |.^V..G...G...EM@| > 013e86d0 df 5e 56 f0 00 8e a4 44 06 40 c2 a8 00 00 17 9a = |.^V....D.@......| > 013e86e0 78 00 00 00 00 e9 56 00 00 d1 c8 20 df 5e 56 f0 = |x.....V.... .^V.| >=20 > 013e86f0 df 5e 57 50 00 51 79 6c df 5e 58 78 01 47 d6 c0 = |.^WP.Qyl.^Xx.G..| > 0x00517960 <callout_process+0x420> lwz r3,264(r29) > 0x00517964 <callout_process+0x424> li r4,0 > 0x00517968 <callout_process+0x428> bl 004a965c <swi_sched> > 0x0051796c <callout_process+0x42c> lwz r11,0(r1) >=20 > 013e8700 01 47 d7 b8 00 00 00 00 00 d1 ab 24 00 00 00 04 = |.G.........$....| > 013e8710 00 c9 66 bc 00 c4 5d 48 00 c9 66 bc 00 d4 c5 3c = |..f...]H..f....<| > 013e8720 00 d0 53 00 00 eb a7 80 00 00 00 01 00 00 00 00 = |..S.............| > 013e8730 df 5e 59 8c 00 00 00 00 df 5e 58 78 00 00 17 99 = |.^Y......^Xx....| > 013e8740 f8 1c 2f cc d0 01 dd 00 00 d2 5b 10 df 5e 57 50 = |../.......[..^WP| >=20 > 013e8750 df 5e 57 a0 00 8a b2 70 df 5e 57 60 df 5e 57 60 = |.^W....p.^W`.^W`| > 0x008ab264 <handleevents+0x2a4> mr r3,r27 > 0x008ab268 <handleevents+0x2a8> mr r4,r28 > 0x008ab26c <handleevents+0x2ac> bl 00517540 <callout_process> > 0x008ab270 <handleevents+0x2b0> li r3,0 >=20 > 013e8760 df 5e 57 a0 df 5e 58 78 05 86 37 00 00 00 00 04 = |.^W..^Xx..7.....| > 013e8770 00 00 00 00 05 9b f2 00 00 c9 66 bc 01 47 d6 c0 = |..........f..G..| > 013e8780 df 5e 59 8c 00 f6 1d 10 00 00 17 99 f8 1c 2f cc = |.^Y.........../.| > 013e8790 d0 01 dd 00 d0 01 dd 30 00 d2 5b 10 df 5e 57 a0 = |.......0..[..^W.| >=20 > 013e87a0 df 5e 58 20 00 8a d1 10 00 d2 6e 5c df 5e 57 b0 |.^X = ......n\.^W.| > 0x008ad100 <timercb+0x4b8> mr r3,r26 > 0x008ad104 <timercb+0x4bc> mr r4,r27 > 0x008ad108 <timercb+0x4c0> li r5,0 > 0x008ad10c <timercb+0x4c4> bl 008aafc0 <handleevents> > 0x008ad110 <timercb+0x4c8> lwz r11,0(r1) >=20 > 013e87b0 df 5e 57 e0 00 4a 96 00 00 00 17 99 00 00 00 00 = |.^W..J..........| > 013e87c0 f8 1c 2f cc 0a 3f a0 12 df 5e 58 78 05 86 37 00 = |../..?...^Xx..7.| > 013e87d0 01 48 b1 00 05 86 37 80 00 d4 bd ec 00 cb 98 98 = |.H....7.........| > 013e87e0 00 c9 66 bc 00 c4 5d 48 00 c9 66 bc 00 d4 c5 3c = |..f...]H..f....<| > 013e87f0 df 5e 59 e0 00 eb a7 80 00 c9 66 bc 01 47 d6 c0 = |.^Y.......f..G..| > 013e8800 df 5e 59 8c df 5e 58 78 01 47 d6 c0 00 00 00 00 = |.^Y..^Xx.G......| > 013e8810 00 f6 1d 10 00 00 00 01 00 d2 6b c8 df 5e 58 20 = |..........k..^X | >=20 > 013e8820 df 5e 58 40 00 8e 1e 48 00 00 00 00 00 eb a7 80 = |.^X@...H........| > So 0x13e8820 from vmcore.9 has start of struct trapeframe. > The 0x8e1e48 is from: > 0x008e1e34 <decr_intr+0xe0> lwz r0,56(r28) > 0x008e1e38 <decr_intr+0xe4> mtctr r0 > 0x008e1e3c <decr_intr+0xe8> mr r3,r28 > 0x008e1e40 <decr_intr+0xec> lwz r4,64(r28) > 0x008e1e44 <decr_intr+0xf0> bctrl > 0x008e1e48 <decr_intr+0xf4> addi r29,r29,-1 >=20 > 013e8830 01 47 d7 94 00 00 00 01 00 d2 6e 4c df 5e 58 40 = |.G........nL.^X@| > 013e8840 df 5e 58 70 00 8e 7c 9c 00 d1 ca ac df 5e 58 50 = |.^Xp..|......^XP| > 013e8850 00 cd f0 74 00 00 00 01 00 00 00 01 00 eb a7 80 = |...t............| > 013e8860 41 eb 30 00 0a 00 00 00 00 00 00 00 00 00 90 32 = |A.0............2| > 013e8870 df 5e 59 30 00 10 08 f8 00 04 90 32 df 5e 59 30 = |.^Y0.......2.^Y0| > 013e8880 01 47 d6 c0 00 00 00 00 0d 0b bf e3 00 00 00 00 = |.G..............| > 013e8890 0d 0b bf e3 00 19 eb 7c 00 00 00 00 00 00 00 44 = |.......|.......D| > 013e88a0 01 fc a0 55 00 00 90 32 d0 01 dd 00 00 00 00 00 = |...U...2........| > 013e88b0 00 d4 bd ec 00 cb 98 98 00 c9 66 bc 00 c4 5d 48 = |..........f...]H| > 013e88c0 00 c9 66 bc 00 d4 c5 3c df 5e 59 e0 00 eb a7 80 = |..f....<.^Y.....| > 013e88d0 00 c9 66 bc 01 47 d6 c0 df 5e 59 8c 00 00 00 01 = |..f..G...^Y.....| > 013e88e0 00 00 00 01 00 eb a7 80 00 00 00 00 00 8e 3b f8 = |..............;.| >=20 > srr0 > 013e88f0 00 d2 6b f0 df 5e 59 30 00 8e 3c 14 40 00 00 42 = |..k..^Y0..<.@..B| > The 0x00833c14 from the trap frame is the srr0 (conceptual lr): > 0x008e3c04 <cpu_idle_60x+0xc> stw r31,28(r1) > 0x008e3c08 <cpu_idle_60x+0x10> stw r0,36(r1) > 0x008e3c0c <cpu_idle_60x+0x14> mr r31,r1 > 0x008e3c10 <cpu_idle_60x+0x18> bcl- 20,4*cr7+so,008e3c14 = <cpu_idle_60x+0x1c> > 0x008e3c14 <cpu_idle_60x+0x1c> mflr r30 > 0x008e3c18 <cpu_idle_60x+0x20> lwz r0,-32(r30) >=20 > 013e8900 20 00 00 00 00 8e 3b f8 00 8e 3c 80 00 00 90 32 | = .....;...<....2| >=20 > exc > 013e8910 00 00 09 00 41 eb 30 00 0a 00 00 00 00 00 00 00 = |....A.0.........| > The 0x00009000 above is the framep->exc (exception code) > (decrementer in this case). >=20 > 013e8920 eb 10 25 b5 55 7e c4 22 00 00 00 00 00 00 00 04 = |..%.U~."........| >=20 > 013e8930 df 5e 59 50 00 00 00 01 00 00 00 01 00 eb a7 80 = |.^YP............| > (trap [above] means no lr filled in here) >=20 > 013e8940 00 00 00 00 00 d4 ca 34 00 d2 6b f0 df 5e 59 50 = |.......4..k..^YP| >=20 > 013e8950 df 5e 59 70 00 8e 31 9c 00 00 00 02 00 eb a7 80 = |.^Yp..1.........| > 0x008e318c <cpu_idle+0x48> bl 008ad8b8 <cpu_idleclock> > 0x008e3190 <cpu_idle+0x4c> lwz r29,0(r29) > 0x008e3194 <cpu_idle+0x50> mtctr r29 > 0x008e3198 <cpu_idle+0x54> bctrl > 0x008e319c <cpu_idle+0x58> bl 008ad7b8 <cpu_activeclock> >=20 > 013e8960 00 f2 d5 fc 00 00 00 01 00 d1 ca ac df 5e 59 70 = |.............^Yp| >=20 > 013e8970 df 5e 5a 50 00 53 6e 7c fa 50 05 af fa 50 05 af = |.^ZP.Sn|.P...P..| > 0x00536e78 <sched_idletd+0x4d0> bl 008e3144 <cpu_idle> > 0x00536e7c <sched_idletd+0x4d4> stw r28,136(r27) >=20 >=20 > 013e8980 fa 50 05 af fa 50 05 af fa 50 05 af ff ff ff fe = |.P...P...P......| > 013e8990 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff = |................| > 013e89a0 ff ff ff ff ff ff ff ff ff ff ff ff fa 50 05 af = |.............P..| > 013e89b0 fa 50 05 af 00 00 00 02 ff ff ff ff 00 00 01 f0 = |.P..............| > 013e89c0 ff ff ff fe ff ff ff ff ff ff ff ff ff ff ff ff = |................| > 013e89d0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff = |................| > 013e89e0 ff ff ff fe ff ff ff ff ff ff ff ff ff ff ff ff = |................| > 013e89f0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff = |................| > 013e8a00 df 5e 5a 20 fa 50 05 af 00 00 00 00 00 00 00 00 |.^Z = .P..........| > 013e8a10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 = |................| > * > 013e8a30 00 00 00 00 00 53 69 a8 df 5e 5a 98 00 00 00 00 = |.....Si..^Z.....| > 013e8a40 01 47 96 e0 01 47 d6 c0 00 d1 b3 70 df 5e 5a 50 = |.G...G.....p.^ZP| >=20 > 013e8a50 df 5e 5a 80 00 4a 3c b4 df 5e 5a 60 df 5e 5a 60 = |.^Z..J<..^Z`.^Z`| > 0x004a3ca0 <fork_exit+0xe4> bl 008ea420 <spinlock_exit> > 0x004a3ca4 <fork_exit+0xe8> mr r3,r27 > 0x004a3ca8 <fork_exit+0xec> mr r4,r26 > 0x004a3cac <fork_exit+0xf0> mtctr r25 > 0x004a3cb0 <fork_exit+0xf4> bctrl > 0x004a3cb4 <fork_exit+0xf8> lwz r0,108(r28) >=20 > 013e8a60 df 5e 5a 80 00 00 00 00 00 00 00 00 00 00 00 00 = |.^Z.............| > 013e8a70 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 = |................| >=20 > 013e8a80 00 00 00 00 00 8f 18 d0 00 53 69 a8 00 00 00 00 = |.........Si.....| > 0x008f18c0 <fork_trampoline> lwz r3,8(r1) > 0x008f18c4 <fork_trampoline+0x4> lwz r4,12(r1) > 0x008f18c8 <fork_trampoline+0x8> lwz r5,16(r1) > 0x008f18cc <fork_trampoline+0xc> bl 004a3bbc <fork_exit> > 0x008f18d0 <fork_trampoline+0x10> addi r1,r1,16 > 0x008f18d4 <fork_trampoline+0x14> b 001008f8 <trapexit> >=20 >=20 > So that is an example context for the failure. >=20 > It has taken weeks to get this. It may be some > time before I get another for comparison/contrast. >=20 > And sticking with the same build vs. trying some more > to find a better way to get more evidence is not > obvious to me at this point. I should have mentioned that this is TARGET_ARCH=3Dpowerpc but used on a so-called PowerMac G5 "Quad Core". I've had 2 more examples, with the same 0x0090a030 srr0 and such (vmcore.0 and vmcore.1) (I have added one more little block of code for detecting an earlier problem symptom not being currently seen so the build is slight different from the earlier report.) Looking around in vmcore.0 I find 3 examples of "00 90 a0 30" in areas overlapping with objdump -x's coverage. . . 00c77fd0 00 00 00 00 00 00 00 00 00 90 a0 30 00 08 10 32 = |...........0...2| [ ] (from sorted objdump -x output:) 00c775a8 l O .data 00000dc0 seqprog 00c78368 l O .data 0000000c seeprom_long_ewen [ ] 00f65820 41 eb 30 00 0a 00 00 00 00 90 a0 30 00 08 10 32 = |A.0........0...2| . . . 00f65870 00 90 a0 30 00 08 10 32 00 00 00 00 df 5d 30 00 = |...0...2.....]0.| [ ] (from sorted objdump -x output:) 00f64780 g O .bss 00020000 __pcpu 00f84780 l O .bss 00000004 ap_letgo For vmcore.1 I went ahead and tried exploring some with ddb --and had no new panics. . . (Hand transcriptions of pictures) fatal kernel trap: exception =3D 0x700 (program) srr0 =3D 0x90a030 srr1 =3D 0x81032 lr =3D 0x535ad0 curthread =3D 0x147d6c0 pic =3D 11, comm =3D idle: cpu0 [ thread pid 11 tid 100003 ] Stopped at _etext+0xb8fc: illegal instruction 0 db> bt 0xdf5e55d0: at sched_wakeup+0xa4 0xdf5e55f0: at setrunnable+0x9c 0xdf5e5610: at sleepq_resume_thread+0x17c 0xdf5e5640: at sleepq_timeout+0xc8 0xdf5e5680: at softclock_call_cc+0x1f0 0xdf5e56f0: at callout_process+0x27c 0xdf5e57a0: at timercb+0x4c4 0xdf5e5820: at decr_intr+0xf0 0xdf5e5840: at powerpc_interrupt_0xf4 0xdf5e5870: at kernel DECR trap by cpu_idle_60x+0x88 (so: srr0) srr1=3D0x9032 r1 =3D0xdf5e5930 cr =3D0x40000042 xer =3D0x20000000 ctr =3D0x8e3bf8 saved LR(0xfffffffd) is invalid. db> show reg (but reformatted r0 =3D0x4 r1 =3D0xdf5e5590 r2 =3D0x147d6c0 r3 =3D0x54 testppc64size+0x34 r4 =3D0x591d000 r5 =3D0 r6 =3D0 r7 =3D0xf r8 =3D0 r9 =3D0xd4c03c cold r10 =3D0x147d6c0 r11 =3D0xdf5e55d0 r12 =3D0 r13 =3D0 r14 =3D0xd4bdec sdt_probe_func r15 =3D0xcb9898 std_lockstat___spin__release r16 =3D0xd4c45c callwheelmask r17 =3D0xd4c45c callwheelmask r18 =3D0x55925 r19 =3D0x559a4 r20 =3D0x559 dsmisssize+0x469 r21 =3D0x591d000 r22 =3D0x566430 sleeppq_timeout r23 =3D0x114 dsmisssize+0x24 r24 =3D0 r25 =3D0 r26 =3D0x1 r27 =3D0 r28 =3D0xeba780 tdq_cpu r29 =3D0x147d6c0 r30 =3D0xd1caac r31 =3D0xdf5e5590 srr0 =3D0x90a030 srr1 =3D0x81032 lr =3D0x535ad0 shed_affinity+0x18 ctr =3D0 cr =3D0x20009034 xer =3D0 dar =3D0x419df5d4 dsisr=3D0x24000000 _etext+0xb8fc: illegal instruction 0 Just for completeness: acttrace also showed: Tracing command less pid 1144 tid 100150 td 0x5bc8a20 (CPU 3) 0xd25a59f0: at powrpc_dispatch_intr+0xc8 0xd25a5a20: at openpic_dispatch+0x90 0xd25a5a50: at powerpc_interrupt+0xc0 0xd25a5a80: at user EXI trap by 0x181c68c (so: ssr0) r1 =3D0xffffdb30 cr =3D0x44020624 xer=3D0 ctr=3D0x41989570 Tracing command idle pid 11 tid 100004 td 0x147d360 (CPU 1) saved LR(0x4c) in invalid Tracing command idle pid 11 tid 100005 td 0x147d360 (CPU 2) saved LR(0x4c) in invalid =3D=3D=3D Mark Millard markmi at dsl-only.net _______________________________________________ freebsd-ppc@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-ppc To unsubscribe, send any mail to "freebsd-ppc-unsubscribe@freebsd.org"
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?62BF8E69-E7E6-4C4F-AB33-38B03E903CDA>