Date:      Tue, 10 Feb 2015 17:16:49 -0800
From:      Mark Millard <markmi@dsl-only.net>
To:        FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   Fixing powerpc64 /boot/loader's kernel page handing: suggestions?
Message-ID:  <B756B0BB-D15F-4E4B-8A61-93EEBA7BD464@dsl-only.net>

Context:

Unfortunately this takes me a bit to describe...

powerpc64 FreeBSD 10.1-??? variants on a PowerMac G5 Quad-Core, built on
the same machine. I expect the issue applies to some plain powerpc
contexts as well as some other powerpc64 contexts. An example context
where my issue occurs is:

> 10.1-RELEASE-p5
> 10.1-RELEASE-p5
> FreeBSD FBSDG5M1 10.1-RELEASE-p5 FreeBSD 10.1-RELEASE-p5 #0 r277808M: Fri Jan 30 00:58:33 PST 2015     root@FBSDG5M1:/usr/obj/usr/home/markmi/src_10_1_releng/sys/GENERIC64vtsc  powerpc

But I also get it for various vintages of 10.1-STABLE (and
11.0-CURRENT). I use 10.1-RELEASE-p5 here because I happen to have a
build that avoids the problem, I know what to set to regenerate that
build, and I know at least one thing to turn on in builds to create
the problem.

> root@FBSDG5M1:/usr/home/markmi/src_10_1_releng # more sys/powerpc/conf/GENERIC64vtsc
> include GENERIC64
> ident   GENERIC64vtsc
>
> nooptions       PS3                     #Sony Playstation 3 HACK!!! to allow sc
>
> options         DDB                     # HACK!!! to dump early crash info (but 11.0-CURRENT already has it)
> options         GDB                     # HACK!!! ...
> options         VERBOSE_SYSINIT         # VERBOSE_SYSINIT blocks direct booting for my 10.1-RELEASE-p5 variants: Crashes when the loader is in __syncicache doing dcbst's.
> options         BOOTVERBOSE=1
> options         BOOTHOWTO=RB_VERBOSE
> #options        KTR
> #options        KTR_MASK=KTR_TRAP
> #options        KTR_CPUMASK=0xF
> #options        KTR_VERBOSE
>
> # HACK!!! to allow sc for 2560x1440 display on Radeon X1950 that vt historically mishandled during booting
> device          sc
> #device          kbdmux         # HACK: already listed by vt
> options         SC_OFWFB        # OFW frame buffer
> options         SC_DFLT_FONT    # compile font in
> makeoptions     SC_DFLT_FONT=cp437
>
>
> # Disable extra checking typically used for FreeBSD 11.0-CURRENT:
> nooptions       DEADLKRES               #Enable the deadlock resolver
> nooptions       INVARIANTS              #Enable calls of extra sanity checking
> nooptions       INVARIANT_SUPPORT       #Extra sanity checks of internal structures, required by INVARIANTS
> nooptions       WITNESS                 #Enable checks to detect deadlocks and cycles
> nooptions       WITNESS_SKIPSPIN        #Don't run witness on spinlocks for speed
> nooptions       MALLOC_DEBUG_MAXZONES   # Separate malloc(9) zones


My temporarily extended ELF_VERBOSE code [and other printf's] also
reports on non-PT_LOADs (which are otherwise skipped). For booting
various 10.1-??? kernel builds it reports the sequence:

PT_PHDR
PT_INTERP
PT_LOAD (for .text)
    (using archsw.arch_copyin then kern_pread)
    Address range example: 0x100000-0xbe017b
<note: some builds have unreferenced pages between the 2 PT_LOADs>
PT_LOAD (for .data)
    (using kern_pread)
    Address range for the same example: 0xbf0000-0xea4b7f
PT_DYNAMIC
PT_GNU_STACK
symtab
strtab
    Final address for the same example: 0x1114baf

The issue happens when there are such unreferenced pages where I
indicated. For the build I started this investigation with, commenting
out VERBOSE_SYSINIT in GENERIC64vtsc (listed earlier) produces no
unreferenced pages, but with VERBOSE_SYSINIT there are such pages
(holding the rest of the context constant). But this is not the only
way to get such unreferenced pages. For example, my 10.1-STABLE build
has unreferenced pages but does not have VERBOSE_SYSINIT (yet).

When there are unreferenced pages between the two PT_LOADs, those pages
do not get archsw.arch_copyin or kern_pread handling. (kern_pread in
turn uses archsw.arch_readin.)
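
To make the size of such a gap concrete, here is a minimal sketch that counts the whole pages lying between two PT_LOAD ranges. It assumes 4 KiB pages and simple page rounding; it does not model whatever extra rounding or batching the loader's mapping code really does. The names seg_range, page_trunc and gap_pages are made up for illustration and are not loader code.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Inclusive byte range, in the style of the ELF_VERBOSE report. */
struct seg_range {
	uint64_t first;	/* first byte of the segment */
	uint64_t last;	/* last byte of the segment (inclusive) */
};

/* Round an address down to the start of its page. */
static uint64_t
page_trunc(uint64_t addr)
{
	return (addr & ~(uint64_t)(SKETCH_PAGE_SIZE - 1));
}

/*
 * Count the whole pages that lie between two segments and that
 * neither segment touches. 'lo' must end before 'hi' begins.
 */
static uint64_t
gap_pages(struct seg_range lo, struct seg_range hi)
{
	uint64_t first_free, first_used;

	assert(lo.last < hi.first);
	/* First page after the last page that 'lo' touches. */
	first_free = page_trunc(lo.last) + SKETCH_PAGE_SIZE;
	/* First page that 'hi' touches. */
	first_used = page_trunc(hi.first);
	if (first_used <= first_free)
		return (0);
	return ((first_used - first_free) / SKETCH_PAGE_SIZE);
}
```

Taking the example ranges above at face value (0x100000-0xbe017b and 0xbf0000-0xea4b7f), gap_pages() reports 15 pages, 0xbe1000 through 0xbeffff, under these assumptions.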

For my PowerMac G5 Quad-Core context those archsw.arch_<?> routines end
up being ofw_copyin and ofw_readin. Those routines in turn call
ofw_mapmem, which includes doing:

>         if (OF_call_method("claim", memory, 3, 1, destp, dlen, 0, &addr)
>             == -1) {
>                 printf("ofw_mapmem: physical claim failed\n");
>                 return (ENOMEM);
>         }
>
>         /*
>          * We only do virtual memory management when real_mode is false.
>          */
>         if (real_mode == 0) {
>                 if (OF_call_method("claim", mmu, 3, 1, destp, dlen, 0, &addr)
>                     == -1) {
>                         printf("ofw_mapmem: virtual claim failed\n");
>                         return (ENOMEM);
>                 }
>
>                 if (OF_call_method("map", mmu, 4, 0, destp, destp, dlen, 0)
>                     == -1) {
>                         printf("ofw_mapmem: map failed\n");
>                         return (ENOMEM);
>                 }
>         }

and during load-time this is what programs the PowerPC to have the PTEG
entries (and whatever else) that instructions like dcbst require (since
MSR[DR]=1).

The crashes are at the first dcbst in __syncicache execution that
references the missing pages. (It seems unlikely that there is any other
usage of those pages.) The crash reports missing PTEG entries (DSISR for
IV 0x300). (Apple's Open Firmware word .registers shows the recorded
register status from the crash. After the crash the PowerMac is in
Apple's context, not FreeBSD's.)

The __syncicache use results from the following:

> int
> ppc64_ofw_elf_loadfile(char *filename, u_int64_t dest,
>     struct preloaded_file **result)
> {
>         int     r;
>
>         r = __elfN(loadfile)(filename, dest, result);
>         if (r != 0)
>                 return (r);
>
>         /*
>          * No need to sync the icache for modules: this will
>          * be done by the kernel after relocation.
>          */
>         if (!strcmp((*result)->f_type, "elf kernel"))
>                 __syncicache((void *) (*result)->f_addr, (*result)->f_size);
>         return (0);
> }

(powerpc has a similar sequence with __syncicache as I remember.) For
some reason the __syncicache usage is set up to span into or beyond the
.data segment, not just the .text one. I do not know why.

__elfN(loadfile)'s interface is not designed to return multiple address
ranges. It returns one range that spans both PT_LOAD ranges (.text and
.data) and any unreferenced pages that are between them. (In fact it
spans even more afterwards as I remember.)
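
As an illustration of why one spanning range is a problem, the following sketch simulates a cache-line walk over such a range and reports the first line address that has no mapping. It is only a model in portable C: it executes no dcbst, it assumes 128-byte cache lines (as on the PPC970), and the names mapped_range, is_mapped and first_unmapped_line are hypothetical. It also uses 0 as the "nothing unmapped" result, which assumes address 0 is never part of the range being checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHELINE 128u	/* assumption: 128-byte cache lines */

struct mapped_range {
	uint64_t start;	/* first mapped byte */
	uint64_t end;	/* one past the last mapped byte */
};

/* Return nonzero when 'addr' falls inside one of the mapped ranges. */
static int
is_mapped(const struct mapped_range *maps, size_t nmaps, uint64_t addr)
{
	size_t i;

	for (i = 0; i < nmaps; i++)
		if (addr >= maps[i].start && addr < maps[i].end)
			return (1);
	return (0);
}

/*
 * Step through [from, from + len) one cache line at a time, in the
 * manner of a dcbst loop, and return the first line address that has
 * no mapping. Returns 0 when every line is mapped.
 */
static uint64_t
first_unmapped_line(const struct mapped_range *maps, size_t nmaps,
    uint64_t from, uint64_t len)
{
	uint64_t addr, end;

	assert(len > 0);
	end = from + len;
	/* Start at the beginning of the cache line containing 'from'. */
	addr = from & ~(uint64_t)(SKETCH_CACHELINE - 1);
	for (; addr < end; addr += SKETCH_CACHELINE)
		if (!is_mapped(maps, nmaps, addr))
			return (addr);
	return (0);
}
```

With mappings of 0x100000-0xbe0fff and 0xbf0000-0xea4fff (the example PT_LOADs rounded out to 4 KiB pages, which is an assumption) and a walk starting at 0x100000 that runs through the .data range, the first unmapped line is 0xbe1000, the first page of the gap.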


Questions:

Does anyone have a clue about why the __syncicache use is set up to span
into .data (and more) and not just span .text, and is willing to explain
a little?


As far as solution directions go: this looks like a subject area
appropriate to general FreeBSD use, based on the available evidence. A
local personal hack does not seem appropriate. So...


A) Should the link of the kernel be producing a kernel with unreferenced
pages between the two PT_LOADs (between .text and .data)? Is the proper
fix to prevent those pages from existing in linked kernels?

vs.

B) Is it okay for those unreferenced pages to be there between the two
PT_LOADs? If yes...

B1) Should something like the ofw_mapmem activity be forced on those
otherwise unreferenced pages so that the later __syncicache use can stay
as it is?

vs.

B2) Should the unreferenced pages be skipped by making separate
__syncicache calls for each PT_LOAD (.text segment and then .data
segment and beyond(?))?

vs.

B3) Should only the .text segment be spanned by the __syncicache use?
Or some other more specific range that avoids those unreferenced pages?


It would appear that all but (A) involve changing the interface provided
by __elfN(loadfile) and/or the interfaces it uses: the fix does not
appear well localized. (A) may have its own such issues but in other
code or files that I've not looked at.
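
For what (B2) might look like in outline, here is a rough sketch, not a proposed patch. It assumes the loader could hand back a list of the ranges it actually loaded, which __elfN(loadfile) does not do today. The struct loaded_seg, the sync_fn callback type (standing in for __syncicache) and the count_bytes helper are all hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one loaded PT_LOAD segment. */
struct loaded_seg {
	uint64_t addr;	/* load address of the segment */
	uint64_t size;	/* number of bytes loaded at that address */
};

/* Hypothetical callback type standing in for __syncicache(). */
typedef void (*sync_fn)(void *ctx, uint64_t addr, uint64_t size);

/*
 * Sync each loaded segment on its own, so that addresses lying
 * between segments are never touched. Returns the number of sync
 * calls made.
 */
static size_t
sync_loaded_segs(const struct loaded_seg *segs, size_t nsegs,
    sync_fn sync, void *ctx)
{
	size_t i, calls = 0;

	assert(sync != NULL);
	for (i = 0; i < nsegs; i++) {
		if (segs[i].size == 0)
			continue;	/* nothing was loaded here */
		sync(ctx, segs[i].addr, segs[i].size);
		calls++;
	}
	return (calls);
}

/* Stand-in callback: adds up the bytes it was asked to sync. */
static void
count_bytes(void *ctx, uint64_t addr, uint64_t size)
{
	(void)addr;
	*(uint64_t *)ctx += size;
}
```

Passing count_bytes with a pointer to a uint64_t total shows that only the loaded bytes are visited. In a real loader the callback would be the cache-sync routine itself, and the harder part would be getting the per-segment list out of the load path.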


===
Mark Millard
markmi at dsl-only.net



