Date: Wed, 12 Jun 2019 12:12:57 -0700
From: Mark Millard <marklmi@yahoo.com>
To: FreeBSD Toolchain <freebsd-toolchain@freebsd.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>, freeBSD PowerPC ML <freebsd-ppc@freebsd.org>, Conrad Meyer <cem@freebsd.org>
Cc: Alfredo Dal Ava Junior <alfredo.junior@eldorado.org.br>, Justin Hibbits <jrh29@alumni.cwru.edu>
Subject: Re: kern_execve using vm_page_zero_invalid but not vm_page_set_validclean to load /sbin/init ?
Message-ID: <CF4D6785-F512-4DE7-BF61-7C0CF5B6E099@yahoo.com>
In-Reply-To: <D1093D97-C7B5-4370-9C75-507D1EB98D03@yahoo.com>
References: <1464D960-A1D6-404A-BB10-E615E2D14C1D@yahoo.com> <CAG6CVpV5FBHgOTgxEgRmP%2B46Vm7mxoPCPECDJiq3k=D4qZ8PCA@mail.gmail.com> <4003198F-C11B-4587-910B-2001DC09F538@yahoo.com> <47E002B7-D4A1-4C4B-BFFD-D926263D895E@yahoo.com> <48148449-93B0-446C-AA28-F211FFAE1A8B@yahoo.com> <86F7C4C4-2BB6-40F0-B5D3-C80ECB4A97CF@yahoo.com> <D1093D97-C7B5-4370-9C75-507D1EB98D03@yahoo.com>

[Looks to me like the ->valid mask is used only for the last
page of the /sbin/init file, not based on the size and alignment
of the data requested for the PT_LOAD.]

On 2019-Jun-11, at 21:53, Mark Millard <marklmi at yahoo.com> wrote:

> [The garbage after .got up to the page boundary is
> .comment section strings. The context here is
> targeting 32-bit powerpc via system-clang-8 and
> devel/powerpc64-binutils for buildworld and
> buildkernel . ]
>
> On 2019-Jun-11, at 19:55, Mark Millard <marklmi at yahoo.com> wrote:
>
>> [I have confirmed .sbss not being zero'd out and environ
>> thereby starting out non-zero (garbage): a
>> debug.minidump=0 style dump.]
>>
>>> On 2019-Jun-10, at 16:19, Mark Millard <marklmi@yahoo.com> wrote:
>>>
>>> . . . (omitted) . . .
>>
>> I used debug.minidump=0 in /boot/loader.conf for
>> causing a dump for the crash and a libkvm modified
>> enough for my working boot environment to allow me
>> to examine the memory-image bytes of such a dump,
>> with libkvm used via /usr/local/bin/kgdb . (No support
>> of automatically translating user-space addresses
>> or other such.)
>>
>> For the clang based debug buildworld and debug buildkernel
>> context with /sbin/init having:
>>
>>   [16] .got              PROGBITS        01956ccc 146ccc 000010 04 WAX  0   0  4
>>   [17] .sbss             NOBITS          01956cdc 146cdc 0000b0 00  WA  0   0  4
>>   [18] .bss              NOBITS          01956dc0 146cdc 02ee28 00  WA  0   0 64
>>
>> I confirmed that .sbss in /sbin/init's address space
>> is not zeroed (so environ is not assigned by handle_argv ).
>> I also confirmed that _start was given a good env value
>> (in %r5) based on where the value was stored on the
>> stack. It is just that the value was not used.
>>
>> The detailed obvious-failure point (crash) can change based
>> on the garbage in the .sbss and, for the build that I used
>> this time, that happened in __je_arena_malloc_hard instead
>> of before _init_tls called _libc_allocate_tls . (I traced
>> the call chain in the dump.)
>>
>>
>> From what I've seen in the dump there seem to be special
>> uses of some values (that also have normal uses, of
>> course):
>>
>> 0xfa5005af: as yet invalid page content.
>> 0x1c000020: as yet unassigned user-space-stack memory for /sbin/init.
>>
>> These are the same locations that I previously reported as
>> showing up in the DSI read trap reports for /sbin/init failing.
>> The specific build here failed with a different value.
>>
>> For reference relative to libkvm:
>>
>> # svnlite diff /usr/src/lib/libkvm/
>> Index: /usr/src/lib/libkvm/kvm_powerpc.c
>> ===================================================================
>> --- /usr/src/lib/libkvm/kvm_powerpc.c (revision 347549)
>> +++ /usr/src/lib/libkvm/kvm_powerpc.c (working copy)
>> @@ -211,6 +211,53 @@
>>      if (be32toh(vm->ph->p_paddr) == 0xffffffff)
>>          return ((int)powerpc_va2off(kd, va, ofs));
>>
>> +    // HACK in something for what I observe in
>> +    // a debug.minidump=0 vmcore.* for 32-bit powerpc
>> +    //
>> +    if (   be32toh(vm->ph->p_vaddr) == 0xffffffff
>> +        && be32toh(vm->ph->p_paddr) == 0
>> +        && be16toh(vm->eh->e_phnum) == 1
>> +       ) {
>> +        // Presumes p_memsz is either unsigned
>> +        // 32-bit or is 64-bit, same for va .
>> +
>> +        if (be32toh(vm->ph->p_memsz) <= va)
>> +            return 0; // Like powerpc_va2off
>> +
>> +        // If ofs was (signed) 32-bit there
>> +        // would be a problem for sufficiently
>> +        // large positive memsz's and va's
>> +        // near the end --because of p_offset
>> +        // and dmphdrsz causing overflow/wrapping
>> +        // for some large va values.
>> +        // Presumes 64-bit ofs for such cases.
>> +        // Also presumes dmphdrsz+p_offset
>> +        // is non-negative so that small
>> +        // non-negative va values have no
>> +        // problems with ofs going negative.
>> +
>> +        *ofs =   vm->dmphdrsz
>> +               + be32toh(vm->ph->p_offset)
>> +               + va;
>> +
>> +        // The normal return value overflows/wraps
>> +        // for p_memsz == 0x80000000u when va == 0 .
>> +        // Avoid this by depending on calling code's
>> +        // loop for sufficiently large cases.
>> +        // This code presumes p_memsz/2 <= MAX_INT .
>> +        // 32-bit powerpc FreeBSD does not allow
>> +        // using more than 2 GiBytes of RAM but
>> +        // does allow using 2 GiBytes on 64-bit
>> +        // hardware.
>> +        //
>> +        if (   (int)be32toh(vm->ph->p_memsz) < 0
>> +            && va < be32toh(vm->ph->p_memsz)/2
>> +           )
>> +            return be32toh(vm->ph->p_memsz)/2;
>> +
>> +        return be32toh(vm->ph->p_memsz) - va;
>> +    }
>> +
>>      _kvm_err(kd, kd->program, "Raw corefile not supported");
>>      return (0);
>>  }
>> Index: /usr/src/lib/libkvm/kvm_private.c
>> ===================================================================
>> --- /usr/src/lib/libkvm/kvm_private.c (revision 347549)
>> +++ /usr/src/lib/libkvm/kvm_private.c (working copy)
>> @@ -131,7 +131,9 @@
>>  {
>>
>>      return (kd->nlehdr.e_ident[EI_CLASS] == class &&
>> -        kd->nlehdr.e_type == ET_EXEC &&
>> +        ( kd->nlehdr.e_type == ET_EXEC ||
>> +          kd->nlehdr.e_type == ET_DYN
>> +        ) &&
>>          kd->nlehdr.e_machine == machine);
>>  }
>>
>>
>>
>
> The following is what is in the .sbss/.bss up to the page
> boundary (after the .got bytes):
>
> (kgdb) x/s 0x2a66cdc
> 0x2a66cdc: "$FreeBSD: head/lib/csu/powerpc/crt1.c 326219 2017-11-26 02:00:33Z pfg $"
>
> (kgdb) x/s 0x2a66d24
> 0x2a66d24: "$FreeBSD: head/lib/csu/common/crtbrand.c 340701 2018-11-20 20:59:49Z emaste $"
>
> (kgdb) x/s 0x2a66d72
> 0x2a66d72: "$FreeBSD: head/lib/csu/common/ignore_init.c 340702 2018-11-20 21:04:20Z emaste $"
>
> (kgdb) x/s 0x2a66dc3
> 0x2a66dc3: "FreeBSD clang version 8.0.0 (tags/RELEASE_800/final 356365) (based on LLVM 8.0.0)"
>
> (kgdb) x/s 0x2a66e15
> 0x2a66e15: "$FreeBSD: head/lib/csu/powerpc/crti.S 217399 2011-01-14 11:34:58Z kib $"
>
> (kgdb) x/s 0x2a66e5d
> 0x2a66e5d: "$FreeBSD: head/sbin/mount/getmntopts.c 326025 2017-11-20 19:49:47Z pfg $"
>
> (kgdb) x/s 0x2a66ea6
> 0x2a66ea6: "$FreeBSD: head/lib/libutil/login_tty.c 334106 2018-05-23 17:02:12Z jhb $"
>
> (kgdb) x/s 0x2a66eef
> 0x2a66eef: "$FreeBSD: head/lib/libutil/login_class.c 296723 2016-03-12 14:54:34Z kib $"
>
> (kgdb) x/s 0x2a66f83
> 0x2a66f83: "$FreeBSD: head/lib/libutil/_secure_path.c 139012 2004-12-18 12:31:12Z ru $"
>
> (kgdb) x/s 0x2a66fce
> 0x2a66fce: "$FreeBSD: head/lib/libcrypt/crypt.c 326219 2017-11
>
> (I truncated that last to avoid the 0xfa5005af's on the next page
> in RAM.)
>
> Compare ( from readelf /sbin/init ):
>
> String dump of section '.comment':
>   [     0]  $FreeBSD: head/lib/csu/powerpc/crt1.c 326219 2017-11-26 02:00:33Z pfg $
>   [    48]  $FreeBSD: head/lib/csu/common/crtbrand.c 340701 2018-11-20 20:59:49Z emaste $
>   [    96]  $FreeBSD: head/lib/csu/common/ignore_init.c 340702 2018-11-20 21:04:20Z emaste $
>   [    e7]  FreeBSD clang version 8.0.0 (tags/RELEASE_800/final 356365) (based on LLVM 8.0.0)
>   [   139]  $FreeBSD: head/lib/csu/powerpc/crti.S 217399 2011-01-14 11:34:58Z kib $
>   [   181]  $FreeBSD: head/sbin/mount/getmntopts.c 326025 2017-11-20 19:49:47Z pfg $
>   [   1ca]  $FreeBSD: head/lib/libutil/login_tty.c 334106 2018-05-23 17:02:12Z jhb $
>   [   213]  $FreeBSD: head/lib/libutil/login_class.c 296723 2016-03-12 14:54:34Z kib $
>   [   25e]  $FreeBSD: head/lib/libutil/login_cap.c 317265 2017-04-21 19:27:33Z pfg $
>   [   2a7]  $FreeBSD: head/lib/libutil/_secure_path.c 139012 2004-12-18 12:31:12Z ru $
>   [   2f2]  $FreeBSD: head/lib/libcrypt/crypt.c 326219 2017-11-26 02:00:33Z pfg $
> . . .
>
> Note:
>
> Program Headers:
>   Type           Offset   VirtAddr   PhysAddr   FileSiz  MemSiz   Flg Align
>   LOAD           0x000000 0x01800000 0x01800000 0x140ad4 0x140ad4 R E 0x10000
>   LOAD           0x140ae0 0x01950ae0 0x01950ae0 0x061fc  0x35108  RWE 0x10000
>   NOTE           0x0000d4 0x018000d4 0x018000d4 0x00048  0x00048  R   0x4
>   TLS            0x140ae0 0x01950ae0 0x01950ae0 0x00b10  0x00b1d  R   0x10
>   GNU_STACK      0x000000 0x00000000 0x00000000 0x00000  0x00000  RW  0x10
>
>  Section to Segment mapping:
>   Segment Sections...
>    00     .note.tag .init .text .fini .rodata .eh_frame
>    01     .tdata .tbss .init_array .fini_array .ctors .dtors .jcr .data.rel.ro .data .got .sbss .bss
>    02     .note.tag
>    03     .tdata .tbss
>    04
> There are 24 section headers, starting at offset 0x16cec8:
>
> Section Headers:
>   [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
> . . .
>   [16] .got              PROGBITS        01956ccc 146ccc 000010 04 WAX  0   0  4
>   [17] .sbss             NOBITS          01956cdc 146cdc 0000b0 00  WA  0   0  4
>   [18] .bss              NOBITS          01956dc0 146cdc 02ee28 00  WA  0   0 64
>   [19] .comment          PROGBITS        00000000 146cdc 0073d4 01  MS  0   0  1
>
> It looks like material after the .got is being copied,
> spanning the in-file-empty .sbss and .bss sections and
> implicitly initializing (the first part of) those
> sections.

The ->valid assignments appear to trace to code like:

        /*
         * The last page has valid blocks.  Invalid part can only
         * exist at the end of file, and the page is made fully valid
         * by zeroing in vm_pager_get_pages().
         */
        if (m[count - 1]->valid != 0 && --count == 0) {
                if (iodone != NULL)
                        iodone(arg, m, 1, 0);
                return (VM_PAGER_OK);
        }

This happens independent of whether the requested data stops
short of the last page of the file yet also stops short of the
end of a page.

So it appears that the use of:

QUOTE
vm_imgact_map_page uses vm_imgact_hold_page.
vm_imgact_hold_page uses vm_pager_get_pages.
vm_pager_get_pages uses vm_page_zero_invalid
to "Zero out partially filled data"
END QUOTE

simply does not do the right thing for .sbss or .bss handling.
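
For reference, the numbers quoted above can be cross-checked with a little arithmetic. The following is only a sketch under stated assumptions: a 4 KiB page size (consistent with the page boundary seen in the kgdb output), and a hypothetical struct load_seg plus helper names that are mine, not anything from the kernel or from readelf. It computes where the file-backed bytes of the second PT_LOAD end and how many bytes of that page would need explicit zeroing.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 0x1000u

/* Hypothetical helper type: just the PT_LOAD fields used below. */
struct load_seg {
	uint32_t offset;	/* p_offset */
	uint32_t vaddr;		/* p_vaddr  */
	uint32_t filesz;	/* p_filesz */
	uint32_t memsz;		/* p_memsz  */
};

/* Second LOAD line of the readelf output quoted above. */
static const struct load_seg init_data_seg = {
	0x140ae0u, 0x01950ae0u, 0x061fcu, 0x35108u
};

/* First file offset past the segment's file-backed bytes. */
static uint32_t file_end_offset(const struct load_seg *s)
{
	return s->offset + s->filesz;
}

/* First virtual address past the file-backed bytes. */
static uint32_t file_end_vaddr(const struct load_seg *s)
{
	return s->vaddr + s->filesz;
}

/* First virtual address past the whole in-memory segment. */
static uint32_t mem_end_vaddr(const struct load_seg *s)
{
	return s->vaddr + s->memsz;
}

/* Round an address up to the next page boundary. */
static uint32_t round_page_up(uint32_t addr)
{
	return (addr + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * Bytes between the end of the file-backed data and the end of its
 * page (or the end of the segment, if that comes first). These are
 * the bytes that hold file garbage unless something zeroes them.
 */
static uint32_t partial_page_zero_len(const struct load_seg *s)
{
	uint32_t end, page_end;

	assert(s->filesz <= s->memsz);
	end = file_end_vaddr(s);
	page_end = round_page_up(end);
	if (page_end > mem_end_vaddr(s))
		page_end = mem_end_vaddr(s);
	return page_end - end;
}
```

With the values above, the file-backed data ends at virtual address 0x01956cdc (the start of .sbss) and at file offset 0x146cdc (the start of .comment in the file), the segment ends at 0x01985be8 (the end of .bss), and 0x324 bytes of the partial page would need zeroing. That matches the .comment strings showing up from the .sbss start to the page boundary.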

The m->valid related code for zeroing is basically irrelevant
to .sbss and .bss.

Note that, as I read it, the code below requires an m->valid bit
to be asserted in order to do any pmap_zero_page_area operations.
Thus it does not zero out pages that are completely invalid either.
This would explain why I see 0xfa5005af on the full pages in the
.sbss/.bss area for debug builds: nothing is zeroing the full
pages either.

void
vm_page_zero_invalid(vm_page_t m, boolean_t setvalid)
{
        int b;
        int i;

        VM_OBJECT_ASSERT_WLOCKED(m->object);
        /*
         * Scan the valid bits looking for invalid sections that
         * must be zeroed.  Invalid sub-DEV_BSIZE'd areas ( where the
         * valid bit may be set ) have already been zeroed by
         * vm_page_set_validclean().
         */
        for (b = i = 0; i <= PAGE_SIZE / DEV_BSIZE; ++i) {
                if (i == (PAGE_SIZE / DEV_BSIZE) ||
                    (m->valid & ((vm_page_bits_t)1 << i))) {
                        if (i > b) {
                                pmap_zero_page_area(m,
                                    b << DEV_BSHIFT, (i - b) << DEV_BSHIFT);
                        }
                        b = i + 1;
                }
        }

        /*
         * setvalid is TRUE when we can safely set the zero'd areas
         * as being valid.  We can do this if there are no cache consistancy
         * issues.  e.g. it is ok to do with UFS, but not ok to do with NFS.
         */
        if (setvalid)
                m->valid = VM_PAGE_BITS_ALL;
}

This code simply does not do the right thing for .sbss and .bss
handling.

_start in /sbin/init (for example) expects .sbss and .bss to have
already been initialized to zero (and possibly further adjusted
after that for something like environ). So far I find nothing to
cover that.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away
in early 2018-Mar)
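
For reference, the scan in vm_page_zero_invalid can be modeled in user space to see what a given valid mask leads to. This is only a sketch, not kernel code: it assumes 4 KiB pages and 512-byte DEV_BSIZE blocks (8 valid bits per page), replaces pmap_zero_page_area with memset on a local buffer, and uses hypothetical names (model_zero_invalid, zeroed_for_mask, the SK_* constants) that do not exist in FreeBSD.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumptions: 4 KiB pages, 512-byte blocks, so 8 valid bits. */
#define SK_PAGE_SIZE	4096
#define SK_DEV_BSIZE	512
#define SK_DEV_BSHIFT	9
#define SK_FILL		0xa5	/* hypothetical stand-in for stale content */

/*
 * Same loop shape as the scan in vm_page_zero_invalid(): zero each
 * run of blocks whose valid bit is clear. Returns the number of
 * bytes that were zeroed.
 */
static size_t model_zero_invalid(uint8_t *page, uint8_t valid)
{
	size_t zeroed = 0;
	int b, i;

	assert(page != NULL);
	for (b = i = 0; i <= SK_PAGE_SIZE / SK_DEV_BSIZE; ++i) {
		if (i == SK_PAGE_SIZE / SK_DEV_BSIZE ||
		    (valid & (1u << i))) {
			if (i > b) {
				size_t off = (size_t)b << SK_DEV_BSHIFT;
				size_t len = (size_t)(i - b) << SK_DEV_BSHIFT;

				memset(page + off, 0, len);
				zeroed += len;
			}
			b = i + 1;
		}
	}
	return zeroed;
}

/* Fill a page with the stand-in pattern, run the model, report bytes zeroed. */
static size_t zeroed_for_mask(uint8_t valid)
{
	uint8_t page[SK_PAGE_SIZE];

	memset(page, SK_FILL, sizeof(page));
	return model_zero_invalid(page, valid);
}
```

In this model a fully valid mask (0xff) zeroes nothing, which is the case that matters for the page holding the end of .got: the file supplies .comment bytes for the rest of that page, so every block counts as valid and the .sbss/.bss bytes keep the file content. A mask of 0x0f zeroes the upper 2048 bytes. Note that in this model a mask of 0 zeroes the whole page, through the i == PAGE_SIZE / DEV_BSIZE term, so the fully invalid case may be worth re-checking against the reading above.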