Date: Tue, 17 Feb 2015 21:34:53 -0800
From: Mark Millard <markmi@dsl-only.net>
To: FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Cc: Justin Hibbits <chmeeedalf@gmail.com>
Subject: Re: PowerMac G5 powerpc64: new context where repeatedly booting varies between failing and working
Message-ID: <36C14790-8E66-4C9D-9F29-A137FB49439D@dsl-only.net>
In-Reply-To: <5FE82152-BBF7-4C6D-932D-AEE70546CACA@dsl-only.net>
References: <7CA43EE3-8C11-4FBD-9F8A-42DF08B82362@dsl-only.net> <ABDD60F1-72C0-41E0-8DFB-4CFDCA9ACA82@dsl-only.net> <C355D814-D486-4644-B9C6-92992092FD55@dsl-only.net> <5FE82152-BBF7-4C6D-932D-AEE70546CACA@dsl-only.net>
[I had sent Nathan W. and Justin H. a picture of a display of a boot-time corrupted memory region. This time I tried to find the start and end of the region and I'm documenting it in a textual form more appropriate to the list. I have also removed prior Email history from this Email but there is much context one must check that history for.]

Several of the new values put in place by the .got memory corruption reported below match up with .opd or other types of addresses reported by objdump for my /boot/kernel10.1S/kernel. They are noted below as I list detailed differences.

I made the early-boot-crash display a larger range and the span of the corruption seemed to go as follows for the corruption of part of the .got area. Also I induced a dereference of the bad pointer as soon as it is discovered after the OF_peer(0) in question returns so later code would not be involved when it crashes. (Crash early, crash often...)

Overall structure:

0xd2da37 and before as far as I looked: no corruption found.
The area from 0xd2da38-0xd2dc9F: largely corrupted. 0x268 or 616 bytes or so in this corrupted range. 616=77*8.
After that range: good again as far as I looked.

The details:

Warning: The below is based on hand transcribed information from screen pictures that I took.

Showing pairs of lines (good then corrupted), using x/x style lines:

0xd2da30: 0, b4fd2c, 0, b4fd70
0xd2da30: 0, b4fd2c, 0, 0

0xd2da40: 0, e28948, 0, e1e460
0xd2da40: 0, 24000042, 0, d00058
(24000042 looks like a cr value?)
(0000000000d00058 l .opd 0000000000000018 ofw_rendezvous_dispatch)

0xd2da50: 0, bc7de8, 0, bc7e08
0xd2da50: 0, cde110, c0000000, 8740
(0xc000000000008740 looks like a stack address?)
(0000000000cde110 g F .opd 0000000000000018 smp_no_rendevous_barrier)

0xd2da60: 0, cd8470, 0, bd2608
0xd2da60: 0, 1, 0, c3a30c
(0000000000c3a30c g .data 0000000000000000 ofw_sprg0_save)

0xd2da70: 0, bb5ea0, 0, b70870
0xd2da70: 0, 1c35ec0, 0, 0

0xd2da80: 0, c49918, 0, bc7e18
0xd2da80: 0, 44000022, 0, de4b30
(44000022 looks like a cr value?)
(0000000000de4b30 g O .bss 0000000000000460 thread0)

0xd2da90: 0, b720a0, 0, b71370
0xd2da90: 900000000, 1032, 0, ff846d78
(9000000000001032 looks like a SRR1 value.)
(ff846d78 is openfirmware entry point?)

0xd2daa0: 0, bc7e30, 0, bc7e58
0xd2daa0: 0, e39080, 100000000, 3030
(0000000000e39080 g O .bss 0000000000020000 __pcpu)
(1000000000003030 looks like a SRR1 value?)

0xd2dab0: 0, bc7e80, 0, bc7eb0
0xd2dab0: c0000000, 83b0, 0, c3a280
(0xc0000000000083b0 looks like a stack address?)
(c3a280 is inside my PowerMac G5 specific hack's ofwstk area: c392a0 up to c3a2a0)
(I've been gathering evidence about early-boot G5 crashes.)

0xd2dac0: 0, bc7ed0, 0, cf2960
0xd2dac0: 0, c40000, 0, c40000

0xd2dad0: 0, bc7f00, 0, bc7f28
0xd2dad0: 0, c40000, 0, c40000

0xd2dae0: 0, b72400, 0, bc7f28
0xd2dae0: c0000000, 8740, 0, cde110
(0xc000000000008740 looks like a stack address?)
(0000000000cde110 g F .opd 0000000000000018 smp_no_rendevous_barrier)

0xd2daf0: 0, cf2b28, 0, b716a0
0xd2daf0: 0, d00058, 0, cde110
(d00058 was also at 0xd2da4c and was followed by cde110 there.)
(0000000000cde110 g F .opd 0000000000000018 smp_no_rendevous_barrier)

0xd2db00: 0, cf2b88, 0, cf2b70
0xd2db00: 0, e6c280, 0, 0
(e6c280 is inside the emergency_buffer.7752 area: e6c278 up to e6c378)

0xd2db10: 0, cf2b58, 0, 8480
0xd2db10: 900000000, 1032, c0000000, 8740
(9000000000001032 looks like a SRR1 value?)
(0xc000000000008740 looks like a stack address?)

0xd2db20: 0, c2d920, 0, cf2b10
0xd2db20: 0, c2d920, 0, cf2b10
(yep: unchanged!)

0xd2db30: 0, b71718, 0, c49888
0xd2db30: 0, ff846734, 10000000, 3030
(ff846734 would seem to be an openfirmware code address?)
(1000000000003030 looks like a SRR1 value?)

0xd2db40: 0, c498a0, 0, c54000
0xd2db40: 0, c498a0, 0, ff846d78
(Yep: c498a0 was unchanged)
(ff846d78 is openfirmware entry point?)

0xd2db50: 0, e313a8, 0, e31608
0xd2db50: 24000042, e313a8, 0, 0
(24000042 looks like a cr value?)
(Trying to store to address 0x2400004200e313a8 for a specific type of 10.1-STABLE build is how the problem was originally noticed.)

0xd2db60: 0, c31f80, 0, bc81e8
0xd2db60: 0, c31f80, 0, 0
(Yep: 0x0000000000c31f80 is unchanged.)

0xd2db70: 0, e31408, 0, bc8228
0xd2db70: 200000, e31408, 0, bc8228
(Yep: Only the 0x200000 was a change.)

0xd2db80: 0, c32488, 0, bc8238
0xd2db80: 0, 1, 10000000, 3030
(1000000000003030 looks like a SRR1 value?)

0xd2db90: 0, e1e460, 0, c31fc0
0xd2db90: 0, 0, 0, 7ff7e800

0xd2dba0: 0, e31608, 0, bc8260
0xd2dba0: 0, 1000000a, 0, bc8260
(Yep: 0x0000000000bc8260 unchanged.)

0xd2dbb0: 0, e1e460, 0, e1fa60
0xd2dbb0: 0, e1e460, 0, e1fa60
(yep: unchanged!)

0xd2dbc0: 0, bc8288, 0, c32488
0xd2dbc0: 111081, 0, fd3c2000, 0
(fd3c2000 in openfirmware area?)

0xd2dbd0: 0, e3153c, 0, bc8298
0xd2dbd0: 10, 0, 0, 0

Now a few unchanged: 0xd2dbe0-0xd2dc1F

Then a change in the pattern of corruptions for the rest of the corrupted area:

0xd2dc20: 0, bc8288, 0, bc82e8
0xd2dc20: 0, bc8288, 127f500, bc82e8

Note how bc8288 and bc82e8 did not change. From here on those two columns are not corrupted but the other two are.

0xd2dc30: 0, bc8300, 0, c32488
0xd2dc30: 8000000, bc8300, e7d540, c32488

0xd2dc40: 0, b4fef0, 0, e31558
0xd2dc40: ecc40, b4fef0, 84eec80, e31558

0xd2dc50: 0, bc8308, 0, cf2f00
0xd2dc50: 1e85440, bc8308, 8766200, cf2f00

0xd2dc60: 0, bc8310, 0, bc8350
0xd2dc60: fb9040, bc8310, 93bb000, bc8350

0xd2dc70: 0, c32038, 0, de5718
0xd2dc70: 94f6b00, c32038, 8632600, de5718

0xd2dc80: 0, de7768, 0, bc3760
0xd2dc80: 1fc0f40, de7768, 10f4b40, bc3760

0xd2dc90: 0, de7768, 0, e1fa00
0xd2dc90: 99e5700, cfc658, 228740, e1fa00

And after that things match for as far as I've looked: no corruptions.
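
For anyone wanting to cross-check the arithmetic, here is a minimal Python sketch. It is not part of the debugging session itself, and the helper names span_bytes and doublewords are made up for illustration. It recomputes the size of the corrupted range and joins adjacent 32-bit words from an x/x style line into 64-bit values, assuming big-endian powerpc64 word order so that the first word of each pair is the high half.

```python
# Illustrative sketch only: helper names are hypothetical, not from the kernel
# or any debugger. Assumes big-endian powerpc64 layout (high word first).

def span_bytes(first, last):
    """Inclusive byte count from address `first` through address `last`."""
    return last - first + 1

def doublewords(words):
    """Join consecutive (high, low) 32-bit words into 64-bit values."""
    return [(hi << 32) | lo for hi, lo in zip(words[0::2], words[1::2])]

# Corrupted range as reported: 0xd2da38 through 0xd2dc9F.
size = span_bytes(0xd2da38, 0xd2dc9f)
print(hex(size), size, size // 8)   # bytes in hex, bytes in decimal, 8-byte entries

# Corrupted line at 0xd2db50 as transcribed: 24000042, e313a8, 0, 0
for value in doublewords([0x24000042, 0xe313a8, 0, 0]):
    print(f"{value:#018x}")
```

The first joined value is 0x2400004200e313a8, the bad store address mentioned for the 0xd2db50 line, and the range works out to 0x268 = 616 bytes, or 77 eight-byte entries.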
===
Mark Millard
markmi at dsl-only.net