Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 10 Jun 2019 11:24:54 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Conrad Meyer <cem@freebsd.org>
Cc:        FreeBSD Hackers <freebsd-hackers@freebsd.org>, freeBSD PowerPC ML <freebsd-ppc@freebsd.org>, Alfredo Dal Ava Junior <alfredo.junior@eldorado.org.br>, Justin Hibbits <jrh29@alumni.cwru.edu>
Subject:   Re: kern_execve using vm_page_zero_invalid but not vm_page_set_validclean to load /sbin/init ?
Message-ID:  <4003198F-C11B-4587-910B-2001DC09F538@yahoo.com>
In-Reply-To: <CAG6CVpV5FBHgOTgxEgRmP%2B46Vm7mxoPCPECDJiq3k=D4qZ8PCA@mail.gmail.com>
References:  <1464D960-A1D6-404A-BB10-E615E2D14C1D@yahoo.com> <CAG6CVpV5FBHgOTgxEgRmP%2B46Vm7mxoPCPECDJiq3k=D4qZ8PCA@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
[Looks like Conrad M. is partially confirming my trace of the
issue is reasonable.]

On 2019-Jun-10, at 07:37, Conrad Meyer <cem@freebsd.org> wrote:

> Hi Mark,
>=20
> On Sun, Jun 9, 2019 at 11:17 PM Mark Millard via freebsd-hackers
> <freebsd-hackers@freebsd.org> wrote:
>> ...
>> vm_pager_get_pages uses vm_page_zero_invalid
>> to "Zero out partially filled data".
>>=20
>> But vm_page_zero_invalid does not zero every "invalid"
>> byte but works in terms of units of DEV_BSIZE :
>> ...
>> The comment indicates that areas of "sub-DEV_BSIZE"
>> should have been handled previously by
>> vm_page_set_validclean .
>=20
> Or another VM routine, yes (e.g., vm_page_set_valid_range).  The valid
> and dirty bitmasks in vm_page only have a single bit per DEV_BSIZE
> region, so care must be taken when marking any sub-DEV_BSIZE region as
> valid to zero out the rest of the DEV_BSIZE region.  This is part of
> the VM page contract.  I'm not sure it's related to the BSS, though.

Yea, I had written from what I'd seen in __elfN(load_section):

QUOTE
__elfN(load_section) uses vm_imgact_map_page
to set up for its copyout. This appears to be
how the FileSiz (not including .sbss or .bss)
vs. MemSiz (including .sbss and .bss) is
handled (attempted?).
END QUOTE

The copyout only copies through the last byte for filesz
but the vm_imgact_map_page does not zero out all the
bytes after that on that page:

        /*
         * We have to get the remaining bit of the file into the first =
part
         * of the oversized map segment.  This is normally because the =
.data
         * segment in the file is extended to provide bss.  It's a neat =
idea
         * to try and save a page, but it's a pain in the behind to =
implement.
         */
        copy_len =3D filsz =3D=3D 0 ? 0 : (offset + filsz) - =
trunc_page(offset +
            filsz);
        map_addr =3D trunc_page((vm_offset_t)vmaddr + filsz);
        map_len =3D round_page((vm_offset_t)vmaddr + memsz) - map_addr;
. . .
        if (copy_len !=3D 0) {
                sf =3D vm_imgact_map_page(object, offset + filsz);
                if (sf =3D=3D NULL)
                        return (EIO);

                /* send the page fragment to user space */
                off =3D trunc_page(offset + filsz) - trunc_page(offset + =
filsz);
                error =3D copyout((caddr_t)sf_buf_kva(sf) + off,
                    (caddr_t)map_addr, copy_len);
                vm_imgact_unmap_page(sf);
                if (error !=3D 0)
                        return (error);
        }

I looked into the details of the DEV_BSIZE code after sending
the original message and so realized that my provided example
/sbin/init readelf material was a good example of the issue
if I'd not missed something.

>> So, if, say, char**environ ends up at the start of .sbss
>> consistently, does environ always end up zeroed independently
>> of FileSz for the PT_LOAD that spans them?
>=20
> It is required to be zeroed, yes.  If not, there is a bug.  If FileSz
> covers BSS, that's a bug in the linker.  Either the trailing bytes of
> the corresponding page in the executable should be zero (wasteful; on
> amd64 ".comment" is packed in there instead), or the linker/loader
> must zero them at initialization.  I'm not familiar with the
> particular details here, but if you are interested I would suggest
> looking at __elfN(load_section) in sys/kern/imgact_elf.c.

I had looked at it some, see the material around the earlier quote
above.

>> The following is not necessarily an example of problematical
>> figures but is just for showing an example structure of what
>> FileSiz covers vs. MemSiz for PT_LOAD's that involve .sbss
>> and .bss :
>> ...
>=20
> Your 2nd LOAD phdr's FileSiz matches up exactly with Segment .sbss
> Offset minus Segment .tdata Offset, i.e., none of the FileSiz
> corresponds to the (s)bss regions.  (Good!  At least the static linker
> part looks sane.)  That said, the boundary is not page-aligned and the
> section alignment requirement is much lower than page_size, so the
> beginning of bss will share a file page with some data.  Something
> should zero it at image activation.

And, so far, I've not found anything in _start or before that does
zero any "sub-DEV_BSIZE" part after FileSz for the PT_LOAD in
question.

Thanks for checking my trace of the issue. It is good to have some
confirmation that I'd not missed something.

> (Tangent: sbss/bss probably do not need to be RWE on PPC!  On amd64,
> init has three LOAD segments rather than two: one for rodata (R), one
> for .text, .init, etc (RX); and one for .data (RW).)

Yea, the section header flags indicate just WA for .sbss and .bss (but
WAX for .got).

But such is more general: for example, the beginning of .rodata
(not executable) shares the tail part of a page with .fini
(executable) in the example. .got has executable code but is in
the middle of sections that do not. For something like /sbin/init it
is so small that the middle of a page can be the only part that is
executable, as in the example. (It is not forced onto its own page.)

The form of .got used is also writable: WAX for section header flags.

=3D=3D=3D
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4003198F-C11B-4587-910B-2001DC09F538>