Date: Sun, 8 Feb 2015 01:16:31 -0800
From: Mark Millard <markmi@dsl-only.net>
To: Nathan Whitehorn <nwhitehorn@freebsd.org>
Cc: FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject: Re: HEADS UP: powerpc64 kernel format change [booted a PowerMac G5 quad-core][__syncicache is running at the time of the crashes]
Message-ID: <E527514A-96F6-4794-8F03-504E51EC8CCB@dsl-only.net>
In-Reply-To: <449E0C48-B57D-4873-B2E7-BC217D891897@dsl-only.net>
References: <335C8DCD-33DF-4430-A0FA-77669C513C61@dsl-only.net> <449E0C48-B57D-4873-B2E7-BC217D891897@dsl-only.net>
I've greatly narrowed down where the crashes happen. That need not be where the root cause is: the root cause could be earlier.
In the following code [I'm using 10.1-RELEASE-p5 for reference here]
$ svnlite diff sys/boot/ofw/libofw/ppc64_elf_freebsd.c
Index: sys/boot/ofw/libofw/ppc64_elf_freebsd.c
===================================================================
--- sys/boot/ofw/libofw/ppc64_elf_freebsd.c (revision 277808)
+++ sys/boot/ofw/libofw/ppc64_elf_freebsd.c (working copy)
@@ -59,7 +59,11 @@
* be done by the kernel after relocation.
*/
if (!strcmp((*result)->f_type, "elf kernel"))
+{
+printf("ppc64_ofw_elf_loadfile before __syncicache\n");
__syncicache((void *) (*result)->f_addr, (*result)->f_size);
+printf("ppc64_ofw_elf_loadfile after __syncicache\n");
+}
return (0);
}
For a directly bootable (no crash) kernel build, both printf's are displayed. (The above code is part of /boot/loader.)
But for the kernels that I build that fail to directly boot only the first of the two printf's is displayed when a direct boot is attempted: Openfirmware's notice with %SRR0 and %SRR1 shows up after that instead of the text from the second printf.
Based on that much it looks like the crash is either in evaluating the arguments to __syncicache or happens during __syncicache's execution, not after.
Changing the first printf to something like the sequence:
printf("ppc64_ofw_elf_loadfile before __syncicache\n");
printf("(void*)result: %p\n",(void*)result);
printf("(void*)(*result): %p\n",(void*)(*result));
printf("(void*)(*result)->f_addr: %p\n",(void*)(*result)->f_addr);
printf("(*result)->f_size : 0x%lx\n",(*result)->f_size);
Shows that all of the stages print before the crash happens, answering the question about evaluation of the arguments: there is no problem evaluating them.
So the crashes are strictly during __syncicache's activity.
Only (*result)->f_size varies between the 3 examples that I've used:
10.1-RELEASE-p5 variant without VERBOSE_SYSINIT (and BOOTVERBOSE, BOOTHOWTO). (boots fine)
10.1-RELEASE-p5 variant with VERBOSE_SYSINIT (and BOOTVERBOSE, BOOTHOWTO). (crashes)
10.1-STABLE (-r278028) variant without VERBOSE_SYSINIT (and BOOTVERBOSE, BOOTHOWTO). (crashes)
What the above printf's reported was:
(void*)result: 0x1c35b48
(void*)(*result): 0x1ebc0
(*result)->f_addr: 0x100000
10.1-RELEASE-p5 variant without VERBOSE_SYSINIT: (*result)->f_size: 0x1014b80
10.1-RELEASE-p5 variant with VERBOSE_SYSINIT: (*result)->f_size: 0x1014bb0
10.1-STABLE (-r278028) variant without VERBOSE_SYSINIT: (*result)->f_size: 0x10175d0
(Listed in increasing order.)
As I remember for the last two the crash report listed %SRR0: 0x1c2785c for both. The "without VERBOSE_SYSINIT" one above does not crash and for it the "ppc64_ofw_elf_loadfile after __syncicache" message shows up as it should.
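The address arithmetic behind these numbers can be checked with a minimal C sketch. The helper names sync_range_end and in_sync_range are hypothetical (they are not part of the loader). The sketch assumes the range handed to __syncicache is simply f_addr up to, but not including, f_addr + f_size:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helpers, not loader code: they only redo the address
 * arithmetic for the range passed to __syncicache(f_addr, f_size).
 */

/* End address (exclusive) of the range handed to __syncicache. */
uint64_t sync_range_end(uint64_t f_addr, uint64_t f_size)
{
    return f_addr + f_size;
}

/* True when addr lies inside [f_addr, f_addr + f_size). */
bool in_sync_range(uint64_t addr, uint64_t f_addr, uint64_t f_size)
{
    return addr >= f_addr && addr < sync_range_end(f_addr, f_size);
}
```

With f_addr 0x100000 and the three reported f_size values, sync_range_end gives 0x1114b80, 0x1114bb0, and 0x11175d0. The remembered %SRR0 value 0x1c2785c lies above all three, so in_sync_range(0x1c2785c, 0x100000, f_size) is false for each kernel: the reported address is not inside the kernel image range being synced.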
===
Mark Millard
markmi at dsl-only.net
On 2015-Feb-7, at 09:43 AM, Mark Millard <markmi at dsl-only.net> wrote:
Correction to earlier Email: VERBOSE_SYSINIT with DDB (and GDB) all enabled (indirectly booted via using kernel10.1RE) got 0x1c277ec for the %SRR0 value, not 0x1c277fc. So slightly different than Kernel10.1S's 0x1c277fc (for this 10.1-STABLE variant). (I looked at the wrong notes when composing the original Email.)
More comparisons of kernel build options:
VERBOSE_SYSINIT enabled with DDB (and GDB) disabled still has the booting problem for my 10.1-RELEASE-p5 variant. It also still has the 0x1c277ec for the %SRR0 value.
For VERBOSE_SYSINIT disabled (DDB and GDB enabled) directly booted...
Preloaded elf kernel "/boot/kernel/kernel" at 0x1106000.
...
real memory = 17152118784 (16357 MB)
available KVA = 7222611967 (6888 MB)
Physical memory chunk(s):
0x0000000000024000 - 0x00000000000fffff, 901120 bytes (220 pages)
0x0000000001115000 - 0x00000000017fffff, 7254016 bytes (1771 pages)
0x0000000001814000 - 0x0000000001bfffff, 4112384 bytes (1004 pages)
0x0000000001c3d000 - 0x0000000001c3cfff, 0 bytes (0 pages)
0x0000000004cbd000 - 0x000000000fffffff, 187969536 bytes (45891 pages)
0x0000000020000000 - 0x000000007f5effff, 1600061440 bytes (390640 pages)
0x0000000100000000 - 0x0000000466827fff, 14604730368 bytes (3565608 pages)
0x0000000200000000 - 0x00000001ffffffff, 0 bytes (0 pages)
0x0000000300000000 - 0x00000002ffffffff, 0 bytes (0 pages)
0x0000000400000000 - 0x00000003ffffffff, 0 bytes (0 pages)
avail memory = 16374190080 (15615 MB)
So 0x1c277ec is between the two:
0x0000000001814000 - 0x0000000001bfffff, 4112384 bytes (1004 pages)
0x0000000001c3d000 - 0x0000000001c3cfff, 0 bytes (0 pages)
(But I do not know what most of the regions and holes are supposed to be.)
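The "between the two" observation can be reproduced with a small C sketch. The struct mem_chunk type and the find_chunk and chunk_bytes helpers are hypothetical names for illustration. The table is copied from the first listing above, with end addresses treated as inclusive, as printed:

```c
#include <stddef.h>
#include <stdint.h>

/* One "Physical memory chunk(s)" line; end is inclusive, as printed. */
struct mem_chunk {
    uint64_t start;
    uint64_t end;   /* end == start - 1 marks a 0-byte chunk */
};

/* Chunks copied from the directly booted (VERBOSE_SYSINIT disabled) listing. */
const struct mem_chunk chunks[] = {
    { 0x0000000000024000, 0x00000000000fffff },
    { 0x0000000001115000, 0x00000000017fffff },
    { 0x0000000001814000, 0x0000000001bfffff },
    { 0x0000000001c3d000, 0x0000000001c3cfff },
    { 0x0000000004cbd000, 0x000000000fffffff },
    { 0x0000000020000000, 0x000000007f5effff },
    { 0x0000000100000000, 0x0000000466827fff },
    { 0x0000000200000000, 0x00000001ffffffff },
    { 0x0000000300000000, 0x00000002ffffffff },
    { 0x0000000400000000, 0x00000003ffffffff },
};
const size_t nchunks = sizeof(chunks) / sizeof(chunks[0]);

/* Size in bytes; a 0-byte chunk (end == start - 1) yields 0. */
uint64_t chunk_bytes(const struct mem_chunk *c)
{
    return c->end - c->start + 1;
}

/* Index of the chunk containing addr, or -1 if addr falls in a hole. */
int find_chunk(const struct mem_chunk *c, size_t n, uint64_t addr)
{
    for (size_t i = 0; i < n; i++) {
        /* A 0-byte chunk can never satisfy both comparisons. */
        if (addr >= c[i].start && addr <= c[i].end)
            return (int)i;
    }
    return -1;
}
```

Here find_chunk(chunks, nchunks, 0x1c277ec) returns -1: the address is above the end of the third chunk (0x1bfffff) and below the start of the 0-byte chunk at 0x1c3d000, so it falls in the hole between them. The sketch says nothing about what occupies that hole.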
VERBOSE_SYSINIT, DDB, and GDB enabled but indirectly booted via kernel10.1RE (via /boot/loader.conf's kernel="kernel10.1RE"), stopping, unloading, then doing "boot kernel":
Preloaded elf kernel "/boot/kernel/kernel" at 0x1116000.
...
real memory = 17152118784 (16357 MB)
available KVA = 7222611967 (6888 MB)
Physical memory chunk(s):
0x0000000000024000 - 0x00000000000fffff, 901120 bytes (220 pages)
0x0000000001105000 - 0x0000000001114fff, 65536 bytes (16 pages)
0x0000000001125000 - 0x00000000017fffff, 7188480 bytes (1755 pages)
0x0000000001814000 - 0x0000000001bfffff, 4112384 bytes (1004 pages)
0x0000000001c3d000 - 0x0000000001c3cfff, 0 bytes (0 pages)
0x0000000004cbd000 - 0x000000000fffffff, 187969536 bytes (45891 pages)
0x0000000020000000 - 0x000000007f5effff, 1600061440 bytes (390640 pages)
0x0000000100000000 - 0x0000000466827fff, 14604730368 bytes (3565608 pages)
0x0000000200000000 - 0x00000001ffffffff, 0 bytes (0 pages)
0x0000000300000000 - 0x00000002ffffffff, 0 bytes (0 pages)
0x0000000400000000 - 0x00000003ffffffff, 0 bytes (0 pages)
avail memory = 16374190080 (15615 MB)
===
Mark Millard
markmi at dsl-only.net
On 2015-Feb-7, at 03:49 AM, Mark Millard <markmi at dsl-only.net> wrote:
Nathan, you had the below written about my problems with booting my builds of, say, 10.1-STABLE (kernel="kernel10.1S" in /boot/loader.conf) without involving the kernel from my build of 10.1-RELEASE-p5 (kernel="kernel10.1RE" or sometimes kernel="kernel" in /boot/loader.conf), where kernel="kernel10.1RE" in /boot/loader.conf boots just fine...
> So this has to be some kind of icache issue. If you unload and reload
> the *same* kernel, does it also help?
> -Nathan
(Part of the evidence was: Using kernel="kernel10.1RE" in /boot/loader.conf, stopping at the 10sec prompt, unloading, and doing "boot kernel10.1S" lets my 10.1-STABLE builds boot that will not boot directly.)
Well I've got a little more information from a different direction: A way to create the problem when building my 10.1-RELEASE-p5 kernel is to enable VERBOSE_SYSINIT. More specifically the comparison/contrast I've done so far is...
I added the following 3 lines to my GENERIC64vtsc for my 10.1-RELEASE-p5 source tree (no other changes elsewhere at all)
options VERBOSE_SYSINIT
options BOOTVERBOSE=1
options BOOTHOWTO=RB_VERBOSE
and rebuilt the kernel via KERNCONF=GENERIC64vtsc INSTKERNNAME=kernel. With the resulting kernel, the load fails if it is referenced by /boot/loader.conf via a kernel="kernel" line. The %SRR0 address value listed is the same as for kernel10.1S: 1c277fc. But booting using kernel="kernel10.1RE" in /boot/loader.conf, stopping at the 10sec wait, unloading, and typing "boot kernel" boots fine, just like "boot kernel10.1S".
Note: GENERIC64vtsc has option DDB enabled (and GDB too). (This is associated with my information gathering for early G5 boot crashes/hangups.)
Note: This is the first time I've ever tried any of those 3 options. My kernel10.1S build was not based on them.
Then I changed the 3 lines by just commenting out the first of the 3 that I had added
#options VERBOSE_SYSINIT
options BOOTVERBOSE=1
options BOOTHOWTO=RB_VERBOSE
and rebuilt via KERNCONF=GENERIC64vtsc INSTKERNNAME=kernel again. The resulting /boot/kernel/... boots just fine when kernel="kernel" is used in /boot/loader.conf : no need for using kernel10.1RE or for stopping to do anything special.
===
Mark Millard
markmi at dsl-only.net
