From: Mark Millard
Subject: Fixing powerpc64 /boot/loader's kernel page handing: suggestions?
Date: Tue, 10 Feb 2015 17:16:49 -0800
To: FreeBSD PowerPC ML
List-Id: Porting FreeBSD to the PowerPC

Context: Unfortunately this takes me a bit to describe...

powerpc64 FreeBSD 10.1-??? variants on a PowerMac G5 Quad-Core, built on
the same machine. I expect the issue applies to some plain powerpc
contexts as well as some other powerpc64 contexts. An example context
where my issue occurs is:

> 10.1-RELEASE-p5
> 10.1-RELEASE-p5
> FreeBSD FBSDG5M1 10.1-RELEASE-p5 FreeBSD 10.1-RELEASE-p5 #0 r277808M: Fri Jan 30 00:58:33 PST 2015 root@FBSDG5M1:/usr/obj/usr/home/markmi/src_10_1_releng/sys/GENERIC64vtsc powerpc

But I also get it for various vintages of 10.1-STABLE (and
11.0-CURRENT). I use 10.1-RELEASE-p5 here because I happen to have a
build that avoids the problem and I know what to set to regenerate that
build --and I know at least one thing to turn on for builds to create
the problem.

> root@FBSDG5M1:/usr/home/markmi/src_10_1_releng # more sys/powerpc/conf/GENERIC64vtsc
> include         GENERIC64
> ident           GENERIC64vtsc
>
> nooptions       PS3                     #Sony Playstation 3 HACK!!! to allow sc
>
> options         DDB                     # HACK!!! to dump early crash info (but 11.0-CURRENT already has it)
> options         GDB                     # HACK!!! ...
> options         VERBOSE_SYSINIT         # VERBOSE_SYSINIT blocks direct booting for my 10.1-RELEASE-p5 variants: Crashes when the loader is in __syncicache doing dcbst's.
> options         BOOTVERBOSE=1
> options         BOOTHOWTO=RB_VERBOSE
> #options        KTR
> #options        KTR_MASK=KTR_TRAP
> #options        KTR_CPUMASK=0xF
> #options        KTR_VERBOSE
>
> # HACK!!! to allow sc for 2560x1440 display on Radeon X1950 that vt historically mishandled during booting
> device          sc
> #device         kbdmux                  # HACK: already listed by vt
> options         SC_OFWFB                # OFW frame buffer
> options         SC_DFLT_FONT            # compile font in
> makeoptions     SC_DFLT_FONT=cp437
>
>
> # Disable extra checking typically used for FreeBSD 11.0-CURRENT:
> nooptions       DEADLKRES               #Enable the deadlock resolver
> nooptions       INVARIANTS              #Enable calls of extra sanity checking
> nooptions       INVARIANT_SUPPORT       #Extra sanity checks of internal structures, required by INVARIANTS
> nooptions       WITNESS                 #Enable checks to detect deadlocks and cycles
> nooptions       WITNESS_SKIPSPIN        #Don't run witness on spinlocks for speed
> nooptions       MALLOC_DEBUG_MAXZONES   # Separate malloc(9) zones

I temporarily extended the ELF_VERBOSE code [and other printf's] so that
it also reports on non-PT_LOADs (which are otherwise skipped). What it
reports for booting various 10.1-??? kernel builds is the sequence:

PT_PHDR
PT_INTERP
PT_LOAD (for .text) (using archsw.arch_copyin then kern_pread)
    Address range example: 0x100000-0xbe017b
PT_LOAD (for .data) (using kern_pread)
    Address range for the same example: 0xbf0000-0xea4b7f
PT_DYNAMIC
PT_GNU_STACK
symtab
strtab
    Final address for the same example: 0x1114baf

The issue happens when there are unreferenced pages between the two
PT_LOADs, that is, after the end of the .text range and before the start
of the .data range. For the build I started this investigation with, it
turns out that if I comment out VERBOSE_SYSINIT in GENERIC64vtsc (listed
earlier) then no unreferenced pages appear, but with VERBOSE_SYSINIT
there are such pages (holding the rest of the context constant). But
this is not the only way to get such unreferenced pages. For example my
10.1-STABLE build has unreferenced pages but does not have
VERBOSE_SYSINIT (yet).

When there are unreferenced pages between the two PT_LOADs those pages
do not get archsw.arch_copyin or kern_pread handling. (kern_pread in
turn uses archsw.arch_readin.)
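
To make the gap between the two PT_LOADs concrete, here is a minimal
sketch of the page arithmetic for the example address ranges above. It
is not loader code: the 4 KiB page size, the helper names (page_trunc,
gap_pages) and the reading of 0xbe017b as the last .text address are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 0x1000ULL

/* Round an address down to the start of the page that contains it. */
static uint64_t
page_trunc(uint64_t addr)
{
	return (addr & ~(SKETCH_PAGE_SIZE - 1));
}

/*
 * Count the whole pages that lie after the last page touched by the
 * first range and before the first page touched by the second range.
 * (Hypothetical helper, not part of the loader.)
 */
static uint64_t
gap_pages(uint64_t first_last_addr, uint64_t second_start)
{
	uint64_t next_page, second_page;

	assert(first_last_addr < second_start);
	next_page = page_trunc(first_last_addr) + SKETCH_PAGE_SIZE;
	second_page = page_trunc(second_start);
	if (second_page <= next_page)
		return (0);
	return ((second_page - next_page) / SKETCH_PAGE_SIZE);
}
```

Under those assumptions, gap_pages(0xbe017b, 0xbf0000) gives 15: the
pages from 0xbe1000 up to (but not including) 0xbf0000 are touched by
neither PT_LOAD.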

For my PowerMac G5 Quad-Core context those archsw.arch_ routines end up
being ofw_copyin and ofw_readin. Those routines in turn call ofw_mapmem,
which includes doing:

>         if (OF_call_method("claim", memory, 3, 1, destp, dlen, 0, &addr)
>             == -1) {
>                 printf("ofw_mapmem: physical claim failed\n");
>                 return (ENOMEM);
>         }
>
>         /*
>          * We only do virtual memory management when real_mode is false.
>          */
>         if (real_mode == 0) {
>                 if (OF_call_method("claim", mmu, 3, 1, destp, dlen, 0, &addr)
>                     == -1) {
>                         printf("ofw_mapmem: virtual claim failed\n");
>                         return (ENOMEM);
>                 }
>
>                 if (OF_call_method("map", mmu, 4, 0, destp, destp, dlen, 0)
>                     == -1) {
>                         printf("ofw_mapmem: map failed\n");
>                         return (ENOMEM);
>                 }
>         }

During load-time this is what programs the PowerPC to have the PTEG
entries (and whatever else) that instructions like dcbst require (since
MSR[DR]=1).

The crashes are at the first dcbst in the __syncicache execution that
references the missing pages. (It seems unlikely that there is any other
usage of those pages.) The crash reports missing PTEG entries (DSISR for
IV 0x300). (Apple's openfirmware word .registers shows the recorded
register status from the crash. After the crash the PowerMac is in
Apple's context, not FreeBSD's.)

The __syncicache use results from the following:

> int
> ppc64_ofw_elf_loadfile(char *filename, u_int64_t dest,
>     struct preloaded_file **result)
> {
>         int     r;
>
>         r = __elfN(loadfile)(filename, dest, result);
>         if (r != 0)
>                 return (r);
>
>         /*
>          * No need to sync the icache for modules: this will
>          * be done by the kernel after relocation.
>          */
>         if (!strcmp((*result)->f_type, "elf kernel"))
>                 __syncicache((void *) (*result)->f_addr, (*result)->f_size);
>         return (0);
> }

(powerpc has a similar sequence with __syncicache as I remember.) For
some reason the __syncicache usage is set up to span into or beyond the
.data segment, not just the .text one. I do not know why.
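
The effect of one __syncicache range spanning both segments can be
modelled in plain C. This is only a simulation sketch, not the real
routine: it walks a range one cache line at a time, the way a dcbst
loop would, and reports the first line that lands in a page no loaded
range covers. The 4 KiB page size, the 128-byte cache line, the struct
and function names, and the use of 0 as a "no fault" result are all
assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_PAGE_SIZE	0x1000ULL	/* assumption: 4 KiB pages */
#define SIM_CACHELINE	128ULL		/* assumption: 128-byte cache lines */

/* A loaded (claimed and mapped) address range, [start, end). */
struct sim_range {
	uint64_t start;
	uint64_t end;
};

/* Does the page containing addr overlap any of the loaded ranges? */
static int
sim_page_is_mapped(const struct sim_range *r, size_t n, uint64_t addr)
{
	uint64_t page = addr & ~(SIM_PAGE_SIZE - 1);
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t first = r[i].start & ~(SIM_PAGE_SIZE - 1);
		uint64_t last = (r[i].end - 1) & ~(SIM_PAGE_SIZE - 1);

		if (page >= first && page <= last)
			return (1);
	}
	return (0);
}

/*
 * Walk [addr, addr + len) by cache line. Return the first line address
 * that falls in an unmapped page, or 0 if every line is in a mapped page.
 */
static uint64_t
sim_first_fault(const struct sim_range *r, size_t n, uint64_t addr,
    uint64_t len)
{
	uint64_t end = addr + len;
	uint64_t p;

	assert(len > 0);
	for (p = addr & ~(SIM_CACHELINE - 1); p < end; p += SIM_CACHELINE)
		if (!sim_page_is_mapped(r, n, p))
			return (p);
	return (0);
}
```

With the example ranges [0x100000, 0xbe017c) and [0xbf0000, 0xea4b80)
as the loaded ranges, a single walk from 0x100000 to 0xea4b80 stops at
0xbe1000, while walking each range separately finds no unmapped page.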

__elfN(loadfile)'s interface is not designed to return multiple address
ranges and is returning one range that spans both the PT_LOAD ranges
(.text and .data) and any unreferenced pages that are between them. (In
fact it spans even more afterwards as I remember.)

Questions:

Anyone have a clue about why the __syncicache use is set up to span into
.data (and more) and not just span .text --and willing to explain a
little?

As far as solution directions go: this looks like a subject area
appropriate to general FreeBSD use based on the available evidence. A
local personal hack does not seem appropriate. So...

A) Should the link of the kernel be producing a kernel with unreferenced
pages between the two PT_LOADs (between .text and .data)? Is the proper
fix to prevent those pages from existing in linked kernels?

vs.

B) Is it okay for those unreferenced pages to be there between the two
PT_LOADs? If yes...

B1) Should something like the ofw_mapmem activity be forced on those
otherwise unreferenced pages so that the later __syncicache use can stay
as it is?

vs.

B2) Should the unreferenced pages be skipped by making separate
__syncicache calls for each PT_LOAD (.text segment and then .data
segment and beyond(?))?

vs.

B3) Should only the .text segment be spanned by the __syncicache use?
Some other more specific range that avoids those unreferenced pages?

It would appear that all but (A) involve changing the interface provided
by __elfN(loadfile) and/or the interfaces it uses: the fix does not
appear well localized. (A) may have its own such issues but in other
code or files that I've not looked at.

===
Mark Millard
markmi at dsl-only.net