Date:      Tue, 30 Sep 2014 13:42:39 -0700
From:      Mark Millard <markmi@dsl-only.net>
To:        Nathan Whitehorn <nwhitehorn@freebsd.org>
Cc:        FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   Re: backtrace information from the 2nd(?) most common boot crash place on PowerMac G5's: just after real memory = ... (... MB)
Message-ID:  <4215450E-4C67-4B68-9370-846F23D4789F@dsl-only.net>
In-Reply-To: <542AC7C2.6050309@freebsd.org>
References:  <D5DC1914-5F1E-426E-821E-766CE943F82F@dsl-only.net> <542AC7C2.6050309@freebsd.org>

While the point between the real memory message and the avail message
has been the second most common place in the boot message sequence to
crash, it has been rather rare: in months I've only seen it a few times,
despite all my reboots for the primary crash place issue and for
deliberate testing/evidence finding about the boot crashes. (My primary
FreeBSD activity is exploring FreeBSD via investigating the problems I
have, primarily the early boot crash.)

I've seen it crash there on a variety of versions, since I've been
updating regularly, but it crashes there only rarely. Only in recent
times have I been building from source instead of using the MANIFEST and
*.txz files with bsdinstall or, before that, the .iso images. It
crashed there back before I'd ever installed a kernel or world via my
own build.

Of course with the DDB dump hack in place I get more information as
things are now.

Unfortunately, given the rarity, I'm not able to effectively test whether
a specific installation/version/build/... has the problem or not. The
best that I can do is report the information I get on the rare occasions
it does fail.

The 3 PowerMac G5's have: 8 GB (Dual processor), 12 GB (Quad core), 16
GB (Quad core). I have tended to use the 16 GB one as the primary. But
it sounds like I should switch to one of the others as the primary for a
few months to see how things go for this issue.

While I think that I've seen that stopping place on more than one of the
G5's, it is possible that I'm wrong about that, given the rarity. Even if
I'm right about it, I may never have seen the problem on the 8 GB Dual
processor one: more likely it was on the two Quad cores. But, again, I'm
not sure. I do use the Dual processor one the least, by a noticeable
amount.

I certainly have seen the primary crash (in terms of message timing:
before the Copyright notice) on all 3 G5's ever since I started
exploring FreeBSD. Of course only with the DDB dump hack in place do I
have evidence of just where those crashes happen internally.

I have reported one backtrace that is earlier than the first ofwcall
with pmap_bootstrapped!=0. It is the only example I have of that so
far. Again: before the DDB hack I would not have had the evidence to make
the distinction, and it seems too rare to deliberately test whether any
specific build/version has the problem.

I've not tested in some time whether the .iso's still have the boot-hang
problem during Open Firmware loading on the G5's. So I do not know the
status of that.



I'm really curious what the explanation is for the first ofwcall with
pmap_bootstrapped!=0 sometimes failing and sometimes not. And
similarly for the other variability, but the other crashes seem too rare
for there to be much chance of learning the answer.





===
Mark Millard
markmi@dsl-only.net

On Sep 30, 2014, at 8:09 AM, Nathan Whitehorn <nwhitehorn@freebsd.org> wrote:

How much RAM is in the machine? I've never ever heard of this happening
before and have been using one of these daily for four years. Clearly,
there's something special about your configuration. This error, in
particular, means that the direct map has been evicted from the page
table. I can't imagine any possible way for that to happen; it's
basically the least likely fault that I can think of and almost
certainly indicates memory corruption or a hardware fault. Do you see
this with an unmodified 10.1-BETA2 kernel?
-Nathan

On 09/27/14 00:47, Mark Millard wrote:
> The following includes backtrace information from the 2nd most common
> boot crash place in the boot message sequence on PowerMac G5's: just
> after it reports
>
> real memory = ... (... MB).
>
> Classically it reports data storage interrupt here and it did again.
> But more is dumped in my current configuration than before.
>
> FreeBSD FBSDG5M1 10.1-BETA2 FreeBSD 10.1-BETA2 #16 r271944M: Fri Sep 26 23:01:54 PDT 2014     root@FBSDG5M1:/usr/obj/usr/src/sys/GENERIC64  powerpc
>
> but with options DDB and GDB in GENERIC64, WITH_DEBUG_FILES=,
> WITHOUT_CLANG=, WITH_DEBUG= in /etc/make.conf. Also: DDB hacked to
> dump various things automatically so it happens during early boot
> crashes/hangs.
>
> The information reported was...
>
> fatal kernel trap
>
> exception = 0x300 (data storage interrupt)
> virtual address = 0x75e0000
> dsisr = 0x42000000
> curthread = 0xdbc290
> pid = 0, comm =
>
> srr0: 0x885608 .moea64_zero_page+0x1ac (a dcbz r0,r10)
> lr: 0x8ba31c .pmap_zero_page+0x7c
> ctr: 0x88545c .moea64_zero_page
>
> 0x8ba318: .pmap_zero_page+0x78
> 0x84167c: .kmem_back+0x2d0
> 0x8417fc: .kmem_malloc+0x7c
> 0x840dc4: .vm_ksubmap_init+0x8c
> 0x882130: .cpu_startup+0x10c
> 0x4d9c10: .mi_startup+0x10c
> btext+0xbc (???)
>
> r0: 0x1
> r1: 0xc000000000008740
> r2: 0xd19468
> r3: 0xe4d3a8 mmu_kernel_obj
> r4: 0xc000000002bfc290
> r5: 0xc7dfa0 mmu_zero_page_desc
> r6: 0xc000000000063af8
> r7: 0x2
> r8: 0xe0c310 vm_phys_free_queues
> r9: 0x80 dbsize+0xc
> r10: 0x7f5e0000
> r11: 0x80 dbsize+0xc
> r12: 0x24042042
> r13: 0xdbc290 thread0
> r14-r19: all 0
> r20: 0x10c2000
> r21: 0x4
> r22: 0x163f000
> r23: 0xc0000000d03fd000
> r24: 0x3800
> r25: 0x262
> r26: 0x400000000000000
> r27: 0xe4d3a8 mmu_kernel_obj
> r28: 0xc000000002bfc290
> r29: 0xc000000002bfc290 (yes: again)
> r30: 0x75e0000
> r31: 0xc000000000008740
>
> cr: 0x44042044
> xer: 0
> (I did not write down srr1. Drat.)
>
> ===
> Mark Millard
> markmi at dsl-only.net

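For reading the quoted trap report, a small sketch of how the dsisr value
and the symbol+offset figures can be checked by hand. It assumes the
classic PowerPC (AIM) DSISR bit layout; the descriptions follow my
recollection of the DSISR_* masks in FreeBSD's sys/powerpc/include/spr.h,
so treat the wording as an assumption rather than the kernel's own text.
The helper names (decode_dsisr, symbol_offset) are made up for this
illustration and are not part of DDB.

```python
# Sketch only: bit meanings assume the classic PowerPC (AIM) DSISR layout.
# The descriptions are paraphrased, not copied from any kernel header.
DSISR_BITS = {
    0x80000000: "direct-store exception",
    0x40000000: "translation not found in page table",
    0x08000000: "protection violation",
    0x02000000: "store access (clear would mean load)",
    0x00400000: "data address breakpoint match",
}


def decode_dsisr(value):
    """Return descriptions of the known DSISR bits set in value."""
    return [text for mask, text in DSISR_BITS.items() if value & mask]


def symbol_offset(address, symbol_base):
    """Offset of address from the start of the symbol at symbol_base."""
    return address - symbol_base


if __name__ == "__main__":
    # dsisr value from the quoted trap report.
    for text in decode_dsisr(0x42000000):
        print(text)
    # srr0 relative to the start of .moea64_zero_page (the ctr value).
    print(hex(symbol_offset(0x885608, 0x88545C)))
```

Under those assumptions, 0x42000000 decodes as a store whose translation
was not found in the page table, which fits both the faulting dcbz and
Nathan's reading that the direct map entry was missing. The second check
gives the offset of srr0 from the start of .moea64_zero_page.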