Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 8 Feb 2015 01:16:31 -0800
From:      Mark Millard <markmi@dsl-only.net>
To:        Nathan Whitehorn <nwhitehorn@freebsd.org>
Cc:        FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   Re: HEADS UP: powerpc64 kernel format change [booted a PowerMac G5 quad-core][__syncicache is running at the time of the crashes]
Message-ID:  <E527514A-96F6-4794-8F03-504E51EC8CCB@dsl-only.net>
In-Reply-To: <449E0C48-B57D-4873-B2E7-BC217D891897@dsl-only.net>
References:  <335C8DCD-33DF-4430-A0FA-77669C513C61@dsl-only.net> <449E0C48-B57D-4873-B2E7-BC217D891897@dsl-only.net>

index | next in thread | previous in thread | raw e-mail

I've narrowed down greatly where the crashes happen, which need not be where root-cause is: that could be earlier.

In the following code [I'm using 10.1-RELEASE-p5 for reference here]

$ svnlite diff sys/boot/ofw/libofw/ppc64_elf_freebsd.c 
Index: sys/boot/ofw/libofw/ppc64_elf_freebsd.c
===================================================================
--- sys/boot/ofw/libofw/ppc64_elf_freebsd.c	(revision 277808)
+++ sys/boot/ofw/libofw/ppc64_elf_freebsd.c	(working copy)
@@ -59,7 +59,11 @@
 	 * be done by the kernel after relocation.
 	 */
 	if (!strcmp((*result)->f_type, "elf kernel"))
+{
+printf("ppc64_ofw_elf_loadfile before __syncicache\n");
 		__syncicache((void *) (*result)->f_addr, (*result)->f_size);
+printf("ppc64_ofw_elf_loadfile after __syncicache\n");
+}
 	return (0);
 }

for a directly-bootable (no crash) kernel-build both printf's are displayed. (The above code part of /boot/loader .)

But for the kernels that I build that fail to directly boot only the first of the two printf's is displayed when a direct boot is attempted: Openfirmware's notice with %SRR0 and %SRR1 shows up after that instead of the text from the second printf.

Based on that much it looks like the crash is either in evaluating the arguments to __syncicache or happens during __syncicache's execution, not after.



Changing the first printf to something like the sequence:

printf("ppc64_ofw_elf_loadfile before __syncicache\n");
printf("(void*)result: %p\n",(void*)result);
printf("(void*)(*result): %p\n",(void*)(*result));
printf("(void*)(*result)->f_addr: %p\n",(void*)(*result)->f_addr);
printf("(*result)->f_size : 0x%lx\n",(*result)->f_size);

Shows that all of the stages print before the crash happens, answering the question about evaluation of the arguments: there is no problem evaluating them.

So the crashes are strictly during __syncicache's activity.

Only (*result)->f_size varies between the 3 examples that I've used:

10.1-RELEASE-p5 variant without VERBOSE_SYSINIT (and BOOTVERBOSE, BOOTHOWTO). (boots fine)
10.1-RELEASE-p5 variant with VERBOSE_SYSINIT (and BOOTVERBOSE, BOOTHOWTO). (crashes)
10.1-STABLE (-r278028) variant without VERBOSE_SYSINIT (and BOOTVERBOSE, BOOTHOWTO). (crashes)

What the above printf's reported was:

(void*)result:    0x1c35b48
(void*)(*result):   0x1ebc0
(*result)->f_addr: 0x100000

10.1-RELEASE-p5 variant without VERBOSE_SYSINIT: (*result)->f_size:        0x1014b80
10.1-RELEASE-p5 variant with VERBOSE_SYSINIT: (*result)->f_size:           0x1014bb0
10.1-STABLE (-r278028) variant without VERBOSE_SYSINIT: (*result)->f_size: 0x10175d0

(Listed in increasing order.)

As I remember for the last two the crash report listed %SRR0: 0x1c2785c for both. The "without VERBOSE_SYSINIT" one above does not crash and for it the "ppc64_ofw_elf_loadfile after __syncicache" message shows up as it should.

===
Mark Millard
markmi at dsl-only.net

On 2015-Feb-7, at 09:43 AM, Mark Millard <markmi at dsl-only.net> wrote:

Correction to earlier Email: VERBOSE_SYSINIT with DDB (and GDB) all enabled (indirectly booted via using kernel10.1RE) got 0x1c277ec for the %SRR0 value, not 0x1c277fc. So slightly different than Kernel10.1S's 0x1c277fc (for this 10.1-STABLE variant). (I looked at the wrong notes when composing the original Email.)

More comparisons of kernel build options:

VERBOSE_SYSINIT enabled with DDB (and GDB) disabled still has the booting problem for my 10.1-RELEASE-p5 variant. It also still has the 0x1c277ec for the %SRR0 value.


For VERBOSE_SYSINIT disabled (DDB and GDB enabled) directly booted...

Preloaded elf kernel "/boot/kernel/kernel" at 0x1106000.
...
real memory  = 17152118784 (16357 MB)
available KVA = 7222611967 (6888 MB)
Physical memory chunk(s):
0x0000000000024000 - 0x00000000000fffff, 901120 bytes (220 pages)
0x0000000001115000 - 0x00000000017fffff, 7254016 bytes (1771 pages)
0x0000000001814000 - 0x0000000001bfffff, 4112384 bytes (1004 pages)
0x0000000001c3d000 - 0x0000000001c3cfff, 0 bytes (0 pages)
0x0000000004cbd000 - 0x000000000fffffff, 187969536 bytes (45891 pages)
0x0000000020000000 - 0x000000007f5effff, 1600061440 bytes (390640 pages)
0x0000000100000000 - 0x0000000466827fff, 14604730368 bytes (3565608 pages)
0x0000000200000000 - 0x00000001ffffffff, 0 bytes (0 pages)
0x0000000300000000 - 0x00000002ffffffff, 0 bytes (0 pages)
0x0000000400000000 - 0x00000003ffffffff, 0 bytes (0 pages)
avail memory = 16374190080 (15615 MB)

So 0x1c277ec is between the two:

0x0000000001814000 - 0x0000000001bfffff, 4112384 bytes (1004 pages)
0x0000000001c3d000 - 0x0000000001c3cfff, 0 bytes (0 pages)

(But I do not know what most of the regions and holes are supposed to be.)

VERBOSE_SYSINIT, DDB, and GDB enabled but indirectly booted via kernel10.1RE (via /boot/loader.conf's kernel="kernel10.1RE"), stopping, unloading, then doing "boot kernel":

Preloaded elf kernel "/boot/kernel/kernel" at 0x1116000.
...
real memory  = 17152118784 (16357 MB)
available KVA = 7222611967 (6888 MB)
Physical memory chunk(s):
0x0000000000024000 - 0x00000000000fffff, 901120 bytes (220 pages)
0x0000000001105000 - 0x0000000001114fff, 65536 bytes (16 pages)
0x0000000001125000 - 0x00000000017fffff, 7188480 bytes (1755 pages)
0x0000000001814000 - 0x0000000001bfffff, 4112384 bytes (1004 pages)
0x0000000001c3d000 - 0x0000000001c3cfff, 0 bytes (0 pages)
0x0000000004cbd000 - 0x000000000fffffff, 187969536 bytes (45891 pages)
0x0000000020000000 - 0x000000007f5effff, 1600061440 bytes (390640 pages)
0x0000000100000000 - 0x0000000466827fff, 14604730368 bytes (3565608 pages)
0x0000000200000000 - 0x00000001ffffffff, 0 bytes (0 pages)
0x0000000300000000 - 0x00000002ffffffff, 0 bytes (0 pages)
0x0000000400000000 - 0x00000003ffffffff, 0 bytes (0 pages)
avail memory = 16374190080 (15615 MB)


===
Mark Millard
markmi at dsl-only.net

On 2015-Feb-7, at 03:49 AM, Mark Millard <markmi at dsl-only.net> wrote:

Nathan, you had the below written about my problems with booting my builds of, say, 10.1-STABLE  (kernel="kernel10.1S" in /boot/loaderl.conf) without involving the kernel from my build of 10.1-RELEASE-p5 (kernel="kernel10.1RE" or sometimes kernel="kernel" in /boot/loader.conf), where kernel="kernel10.1RE" in /boot/loader.conf boots just fine...

> So this has to be some kind of icache issue. If you unload and reload 
> the *same* kernel, does it also help?
> -Nathan

(Part of the evidence was: Using kernel="kernel10.1RE" in /boot/loader.conf, stopping at the 10sec prompt, unloading, and doing "boot kernel 10.1S" lets my 10.1-STABLE builds boot that will not boot directly.)


Well I've got a little more information from a different direction: A way to create the problem when building my 10.1-RELEASE-p5 kernel is to enable VERBOSE_SYSINIT. More specifically the comparison/contrast I've done so far is...



I added the following 3 lines to my GENERIC64vtsc for my 10.1-RELEASE-p5 source tree (no other changes elsewhere at all)

options 	VERBOSE_SYSINIT
options 	BOOTVERBOSE=1
options 	BOOTHOWTO=RB_VERBOSE

and rebuilt kernel the via KERNCONF=GENERIC64vtsc INSTKERNNAME=kernel the resulting kernel load fails if referenced by /boot/loader.conf via kernel="kernel" line. The %SRR0 address value listed is the same as for kernel10.1S: 1c277fc. But booting using kernel="kernel10.1RE" in /boot/loader.conf, stopping at the 10sec wait, unloading, and typing "boot kernel" boots fine --just like "boot kernel10.1S".

Note: GENERIC64vtsc has option DDB enabled (and GBD too). (This is associated my with my information gathering for early G5 boot crashes/hangups.)

Note: This is the first time I've ever tried any of those 3 options. My kernel10.1S build was not based on them.

Then I changed the 3 lines by just commenting out the first of the 3 that I had added

#options 	VERBOSE_SYSINIT
options 	BOOTVERBOSE=1
options 	BOOTHOWTO=RB_VERBOSE

and rebuilt via KERNCONF=GENERIC64vtsc INSTKERNNAME=kernel again. The resulting /boot/kernel/... boots just fine when kernel="kernel" is used in /boot/loader.conf : no need for using kernel10.1RE or for stopping to do anything special.



===
Mark Millard
markmi at dsl-only.net





home | help

Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?E527514A-96F6-4794-8F03-504E51EC8CCB>