From owner-freebsd-ppc@FreeBSD.ORG Sun Feb 8 09:16:37 2015
From: Mark Millard
Date: Sun, 8 Feb 2015 01:16:31 -0800
Subject: Re: HEADS UP: powerpc64 kernel format change [booted a PowerMac G5 quad-core][__syncicache is running at the time of the crashes]
To: Nathan Whitehorn
Cc: FreeBSD PowerPC ML
In-Reply-To: <449E0C48-B57D-4873-B2E7-BC217D891897@dsl-only.net>
References: <335C8DCD-33DF-4430-A0FA-77669C513C61@dsl-only.net> <449E0C48-B57D-4873-B2E7-BC217D891897@dsl-only.net>

I've narrowed down greatly where the crashes happen, which need not be
where the root cause is: that could be earlier.

In the following code [I'm using 10.1-RELEASE-p5 for reference here]

$ svnlite diff sys/boot/ofw/libofw/ppc64_elf_freebsd.c
Index: sys/boot/ofw/libofw/ppc64_elf_freebsd.c
===================================================================
--- sys/boot/ofw/libofw/ppc64_elf_freebsd.c (revision 277808)
+++ sys/boot/ofw/libofw/ppc64_elf_freebsd.c (working copy)
@@ -59,7 +59,11 @@
          * be done by the kernel after relocation.
          */
         if (!strcmp((*result)->f_type, "elf kernel"))
+{
+printf("ppc64_ofw_elf_loadfile before __syncicache\n");
                 __syncicache((void *) (*result)->f_addr, (*result)->f_size);
+printf("ppc64_ofw_elf_loadfile after __syncicache\n");
+}

         return (0);
 }

for a directly-bootable (no crash) kernel build both printf's are
displayed. (The above code is part of /boot/loader.)

But for the kernels that I build that fail to directly boot, only the
first of the two printf's is displayed when a direct boot is attempted:
Openfirmware's notice with %SRR0 and %SRR1 shows up after that instead
of the text from the second printf.

Based on that much it looks like the crash is either in evaluating the
arguments to __syncicache or happens during __syncicache's execution,
not after.

Changing the first printf to something like the sequence:

printf("ppc64_ofw_elf_loadfile before __syncicache\n");
printf("(void*)result: %p\n",(void*)result);
printf("(void*)(*result): %p\n",(void*)(*result));
printf("(void*)(*result)->f_addr: %p\n",(void*)(*result)->f_addr);
printf("(*result)->f_size : 0x%lx\n",(*result)->f_size);

shows that all of the stages print before the crash happens, answering
the question about evaluation of the arguments: there is no problem
evaluating them.

So the crashes are strictly during __syncicache's activity.

Only (*result)->f_size varies between the 3 examples that I've used:

10.1-RELEASE-p5 variant without VERBOSE_SYSINIT (and BOOTVERBOSE, BOOTHOWTO). (boots fine)
10.1-RELEASE-p5 variant with VERBOSE_SYSINIT (and BOOTVERBOSE, BOOTHOWTO). (crashes)
10.1-STABLE (-r278028) variant without VERBOSE_SYSINIT (and BOOTVERBOSE, BOOTHOWTO). (crashes)

What the above printf's reported was:

(void*)result: 0x1c35b48
(void*)(*result): 0x1ebc0
(*result)->f_addr: 0x100000

10.1-RELEASE-p5 variant without VERBOSE_SYSINIT: (*result)->f_size: 0x1014b80
10.1-RELEASE-p5 variant with VERBOSE_SYSINIT: (*result)->f_size: 0x1014bb0
10.1-STABLE (-r278028) variant without VERBOSE_SYSINIT: (*result)->f_size: 0x10175d0

(Listed in increasing order.)

As I remember, for the last two the crash report listed %SRR0: 0x1c2785c
for both.

The 10.1-RELEASE-p5 "without VERBOSE_SYSINIT" one above does not crash,
and for it the "ppc64_ofw_elf_loadfile after __syncicache" message shows
up as it should.

===
Mark Millard
markmi at dsl-only.net

On 2015-Feb-7, at 09:43 AM, Mark Millard wrote:

Correction to earlier Email: VERBOSE_SYSINIT with DDB (and GDB) all
enabled (indirectly booted via using kernel10.1RE) got 0x1c277ec for the
%SRR0 value, not 0x1c277fc. So slightly different than Kernel10.1S's
0x1c277fc (for this 10.1-STABLE variant). (I looked at the wrong notes
when composing the original Email.)

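For reference, the address arithmetic implied by the reported f_addr and
f_size values can be sketched as below. This is only an illustration:
the helper names range_end and in_range are hypothetical and are not part
of /boot/loader, and it assumes __syncicache is handed the half-open
range [f_addr, f_addr + f_size).

```c
#include <stdint.h>

/* Hypothetical helper: first byte past a loaded image (f_addr + f_size). */
uint64_t
range_end(uint64_t f_addr, uint64_t f_size)
{
	return (f_addr + f_size);
}

/*
 * Hypothetical helper: nonzero if addr lies inside the half-open range
 * [f_addr, f_addr + f_size), i.e. inside the region passed to __syncicache.
 */
int
in_range(uint64_t addr, uint64_t f_addr, uint64_t f_size)
{
	return (addr >= f_addr && addr < range_end(f_addr, f_size));
}
```

With f_addr 0x100000, the three f_size values give range ends of
0x1114b80, 0x1114bb0 and 0x11175d0, so the two crashing kernels are 0x30
and 0x2a50 bytes larger than the booting one. By the same arithmetic the
%SRR0 values mentioned (0x1c2785c, 0x1c277ec, 0x1c277fc) all lie above
the end of even the largest range.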
More comparisons of kernel build options:

VERBOSE_SYSINIT enabled with DDB (and GDB) disabled still has the
booting problem for my 10.1-RELEASE-p5 variant. It also still has the
0x1c277ec for the %SRR0 value.

For VERBOSE_SYSINIT disabled (DDB and GDB enabled) directly booted...

Preloaded elf kernel "/boot/kernel/kernel" at 0x1106000.
...
real memory = 17152118784 (16357 MB)
available KVA = 7222611967 (6888 MB)
Physical memory chunk(s):
0x0000000000024000 - 0x00000000000fffff, 901120 bytes (220 pages)
0x0000000001115000 - 0x00000000017fffff, 7254016 bytes (1771 pages)
0x0000000001814000 - 0x0000000001bfffff, 4112384 bytes (1004 pages)
0x0000000001c3d000 - 0x0000000001c3cfff, 0 bytes (0 pages)
0x0000000004cbd000 - 0x000000000fffffff, 187969536 bytes (45891 pages)
0x0000000020000000 - 0x000000007f5effff, 1600061440 bytes (390640 pages)
0x0000000100000000 - 0x0000000466827fff, 14604730368 bytes (3565608 pages)
0x0000000200000000 - 0x00000001ffffffff, 0 bytes (0 pages)
0x0000000300000000 - 0x00000002ffffffff, 0 bytes (0 pages)
0x0000000400000000 - 0x00000003ffffffff, 0 bytes (0 pages)
avail memory = 16374190080 (15615 MB)

So 0x1c277ec is between the two:

0x0000000001814000 - 0x0000000001bfffff, 4112384 bytes (1004 pages)
0x0000000001c3d000 - 0x0000000001c3cfff, 0 bytes (0 pages)

(But I do not know what most of the regions and holes are supposed to
be.)

VERBOSE_SYSINIT, DDB, and GDB enabled but indirectly booted via
kernel10.1RE (via /boot/loader.conf's kernel="kernel10.1RE"),
stopping, unloading, then doing "boot kernel":

Preloaded elf kernel "/boot/kernel/kernel" at 0x1116000.
...
real memory = 17152118784 (16357 MB)
available KVA = 7222611967 (6888 MB)
Physical memory chunk(s):
0x0000000000024000 - 0x00000000000fffff, 901120 bytes (220 pages)
0x0000000001105000 - 0x0000000001114fff, 65536 bytes (16 pages)
0x0000000001125000 - 0x00000000017fffff, 7188480 bytes (1755 pages)
0x0000000001814000 - 0x0000000001bfffff, 4112384 bytes (1004 pages)
0x0000000001c3d000 - 0x0000000001c3cfff, 0 bytes (0 pages)
0x0000000004cbd000 - 0x000000000fffffff, 187969536 bytes (45891 pages)
0x0000000020000000 - 0x000000007f5effff, 1600061440 bytes (390640 pages)
0x0000000100000000 - 0x0000000466827fff, 14604730368 bytes (3565608 pages)
0x0000000200000000 - 0x00000001ffffffff, 0 bytes (0 pages)
0x0000000300000000 - 0x00000002ffffffff, 0 bytes (0 pages)
0x0000000400000000 - 0x00000003ffffffff, 0 bytes (0 pages)
avail memory = 16374190080 (15615 MB)

===
Mark Millard
markmi at dsl-only.net

On 2015-Feb-7, at 03:49 AM, Mark Millard wrote:

Nathan, you wrote the below about my problems with booting my builds
of, say, 10.1-STABLE (kernel="kernel10.1S" in /boot/loader.conf)
without involving the kernel from my build of 10.1-RELEASE-p5
(kernel="kernel10.1RE" or sometimes kernel="kernel" in
/boot/loader.conf), where kernel="kernel10.1RE" in /boot/loader.conf
boots just fine...

> So this has to be some kind of icache issue. If you unload and reload
> the *same* kernel, does it also help?
> -Nathan

(Part of the evidence was: Using kernel="kernel10.1RE" in
/boot/loader.conf, stopping at the 10sec prompt, unloading, and doing
"boot kernel10.1S" lets my 10.1-STABLE builds boot that will not boot
directly.)

Well I've got a little more information from a different direction: A
way to create the problem when building my 10.1-RELEASE-p5 kernel is to
enable VERBOSE_SYSINIT. More specifically the comparison/contrast I've
done so far is...

I added the following 3 lines to my GENERIC64vtsc for my 10.1-RELEASE-p5
source tree (no other changes elsewhere at all)

options VERBOSE_SYSINIT
options BOOTVERBOSE=1
options BOOTHOWTO=RB_VERBOSE

and rebuilt the kernel via KERNCONF=GENERIC64vtsc INSTKERNNAME=kernel.
The resulting kernel load fails if referenced by /boot/loader.conf via a
kernel="kernel" line. The %SRR0 address value listed is the same as
for kernel10.1S: 1c277fc. But booting using kernel="kernel10.1RE" in
/boot/loader.conf, stopping at the 10sec wait, unloading, and typing
"boot kernel" boots fine --just like "boot kernel10.1S".

Note: GENERIC64vtsc has option DDB enabled (and GDB too). (This is
associated with my information gathering for early G5 boot
crashes/hangups.)

Note: This is the first time I've ever tried any of those 3 options. My
kernel10.1S build was not based on them.

Then I changed the 3 lines by just commenting out the first of the 3
that I had added

#options VERBOSE_SYSINIT
options BOOTVERBOSE=1
options BOOTHOWTO=RB_VERBOSE

and rebuilt via KERNCONF=GENERIC64vtsc INSTKERNNAME=kernel again.
The resulting /boot/kernel/... boots just fine when kernel="kernel" is
used in /boot/loader.conf: no need for using kernel10.1RE or for
stopping to do anything special.

===
Mark Millard
markmi at dsl-only.net
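
The "0x1c277ec is between the two" comparison against the physical
memory chunk listing earlier in this thread can be checked mechanically.
The following is a minimal sketch, not FreeBSD code: struct chunk,
chunk_containing and direct_boot_chunks are hypothetical names, and the
table is simply the directly booted listing copied as printed, with each
end address treated as inclusive.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: one "Physical memory chunk(s)" entry as printed. */
struct chunk {
	uint64_t start;	/* first byte of the chunk */
	uint64_t end;	/* last byte of the chunk (inclusive) */
};

/* Entries from the directly booted (VERBOSE_SYSINIT disabled) listing. */
const struct chunk direct_boot_chunks[] = {
	{ 0x0000000000024000, 0x00000000000fffff },
	{ 0x0000000001115000, 0x00000000017fffff },
	{ 0x0000000001814000, 0x0000000001bfffff },
	{ 0x0000000001c3d000, 0x0000000001c3cfff },	/* 0 bytes */
	{ 0x0000000004cbd000, 0x000000000fffffff },
	{ 0x0000000020000000, 0x000000007f5effff },
	{ 0x0000000100000000, 0x0000000466827fff },
	{ 0x0000000200000000, 0x00000001ffffffff },	/* 0 bytes */
	{ 0x0000000300000000, 0x00000002ffffffff },	/* 0 bytes */
	{ 0x0000000400000000, 0x00000003ffffffff },	/* 0 bytes */
};

#define	NCHUNKS	(sizeof(direct_boot_chunks) / sizeof(direct_boot_chunks[0]))

/* Index of the first chunk containing addr, or -1 if addr is in a hole. */
int
chunk_containing(const struct chunk *c, size_t n, uint64_t addr)
{
	size_t i;

	for (i = 0; i < n; i++) {
		/* Zero-length entries have end < start and never match. */
		if (addr >= c[i].start && addr <= c[i].end)
			return ((int)i);
	}
	return (-1);
}
```

Calling chunk_containing(direct_boot_chunks, NCHUNKS, 0x1c277ec) returns
-1: the address is above the chunk ending at 0x1bfffff and below the
zero-length entry at 0x1c3d000, so it is not in any listed chunk.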