From: Mark Millard
Date: Mon, 10 Jun 2019 11:24:54 -0700
To: Conrad Meyer
Cc: FreeBSD Hackers, freeBSD PowerPC ML, Alfredo Dal Ava Junior, Justin Hibbits
Subject: Re: kern_execve using vm_page_zero_invalid but not vm_page_set_validclean to load /sbin/init ?
Message-Id: <4003198F-C11B-4587-910B-2001DC09F538@yahoo.com>
References: <1464D960-A1D6-404A-BB10-E615E2D14C1D@yahoo.com>
List-Id: Technical Discussions relating to FreeBSD

[Looks like Conrad M. is partially confirming my trace of the
issue is reasonable.]

On 2019-Jun-10, at 07:37, Conrad Meyer wrote:

> Hi Mark,
>
> On Sun, Jun 9, 2019 at 11:17 PM Mark Millard via freebsd-hackers
> wrote:
>> ...
>> vm_pager_get_pages uses vm_page_zero_invalid
>> to "Zero out partially filled data".
>>
>> But vm_page_zero_invalid does not zero every "invalid"
>> byte but works in terms of units of DEV_BSIZE :
>> ...
>> The comment indicates that areas of "sub-DEV_BSIZE"
>> should have been handled previously by
>> vm_page_set_validclean .
>
> Or another VM routine, yes (e.g., vm_page_set_valid_range). The valid
> and dirty bitmasks in vm_page only have a single bit per DEV_BSIZE
> region, so care must be taken when marking any sub-DEV_BSIZE region as
> valid to zero out the rest of the DEV_BSIZE region. This is part of
> the VM page contract. I'm not sure it's related to the BSS, though.

Yea, I had written from what I'd seen in __elfN(load_section):

QUOTE
__elfN(load_section) uses vm_imgact_map_page
to set up for its copyout. This appears to be how
the FileSiz (not including .sbss or .bss) vs. MemSiz
(including .sbss and .bss) is handled (attempted?).
END QUOTE

The copyout only copies through the last byte for filsz but
vm_imgact_map_page does not zero out all the bytes after
that on that page:

        /*
         * We have to get the remaining bit of the file into the first part
         * of the oversized map segment.  This is normally because the .data
         * segment in the file is extended to provide bss.  It's a neat idea
         * to try and save a page, but it's a pain in the behind to implement.
         */
        copy_len = filsz == 0 ? 0 : (offset + filsz) -
            trunc_page(offset + filsz);
        map_addr = trunc_page((vm_offset_t)vmaddr + filsz);
        map_len = round_page((vm_offset_t)vmaddr + memsz) - map_addr;
. . .
        if (copy_len != 0) {
                sf = vm_imgact_map_page(object, offset + filsz);
                if (sf == NULL)
                        return (EIO);

                /* send the page fragment to user space */
                off = trunc_page(offset + filsz) -
                    trunc_page(offset + filsz);
                error = copyout((caddr_t)sf_buf_kva(sf) + off,
                    (caddr_t)map_addr, copy_len);
                vm_imgact_unmap_page(sf);
                if (error != 0)
                        return (error);
        }

I looked into the details of the DEV_BSIZE code after sending
the original message and so realized that my provided example
/sbin/init readelf material was a good example of the issue
if I'd not missed something.

>> So, if, say, char**environ ends up at the start of .sbss
>> consistently, does environ always end up zeroed independently
>> of FileSz for the PT_LOAD that spans them?
>
> It is required to be zeroed, yes. If not, there is a bug. If FileSz
> covers BSS, that's a bug in the linker. Either the trailing bytes of
> the corresponding page in the executable should be zero (wasteful; on
> amd64 ".comment" is packed in there instead), or the linker/loader
> must zero them at initialization. I'm not familiar with the
> particular details here, but if you are interested I would suggest
> looking at __elfN(load_section) in sys/kern/imgact_elf.c.

I had looked at it some, see the material around the earlier
quote above.
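
For anyone wanting to play with the arithmetic, here is a small
userland sketch in C. It is only a model of the concern discussed
above, not kernel code, and it does not show whether the kernel
path actually leaves stale bytes. The names sk_trunc_page,
sk_copy_len and sk_stale_tail_bytes are hypothetical, and the
4096-byte page and 512-byte DEV_BSIZE are assumed values.
sk_copy_len repeats the quoted copy_len expression;
sk_stale_tail_bytes fills a page with a nonzero pattern, zeroes
only the whole DEV_BSIZE blocks that lie entirely past copy_len,
and counts what remains nonzero after the copied bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed sizes for illustration only; the kernel defines its own. */
#define SK_PAGE_SIZE 4096u
#define SK_DEV_BSIZE 512u

/* Hypothetical stand-in for trunc_page(): round down to a page boundary. */
static uint64_t
sk_trunc_page(uint64_t x)
{
	return (x & ~(uint64_t)(SK_PAGE_SIZE - 1));
}

/* Same arithmetic as the quoted copy_len expression. */
static uint64_t
sk_copy_len(uint64_t offset, uint64_t filsz)
{
	return (filsz == 0 ? 0 :
	    (offset + filsz) - sk_trunc_page(offset + filsz));
}

/*
 * Model: the page starts out full of nonzero bytes.  Blocks touched by
 * the first copy_len bytes count as "valid"; only the remaining whole
 * DEV_BSIZE blocks are zeroed.  Returns the number of bytes after
 * copy_len that are still nonzero, i.e. the sub-DEV_BSIZE tail that
 * block-granular zeroing does not reach.
 */
static unsigned
sk_stale_tail_bytes(uint64_t offset, uint64_t filsz)
{
	unsigned char page[SK_PAGE_SIZE];
	unsigned copy_len, nvalid, b, i, stale;

	copy_len = (unsigned)sk_copy_len(offset, filsz);
	assert(copy_len < SK_PAGE_SIZE);

	memset(page, 0xAA, sizeof(page));

	/* Number of blocks at least partly covered by the copied bytes. */
	nvalid = (copy_len + SK_DEV_BSIZE - 1) / SK_DEV_BSIZE;

	/* Zero only the blocks that hold no copied bytes at all. */
	for (b = nvalid; b < SK_PAGE_SIZE / SK_DEV_BSIZE; b++)
		memset(page + b * SK_DEV_BSIZE, 0, SK_DEV_BSIZE);

	stale = 0;
	for (i = copy_len; i < SK_PAGE_SIZE; i++)
		if (page[i] != 0)
			stale++;
	return (stale);
}
```

With offset 0 and filsz 0x1234, copy_len is 564 and the model
leaves 460 nonzero bytes between the end of the copied data and
the next 512-byte boundary. With filsz 0x1200 the copied data ends
exactly on a block boundary and nothing is left over. That tail is
the part Conrad's "VM page contract" remark says must be zeroed
explicitly by whoever marks the partial block valid.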

>> The following is not necessarily an example of problematical
>> figures but is just for showing an example structure of what
>> FileSiz covers vs. MemSiz for PT_LOAD's that involve .sbss
>> and .bss :
>> ...
>
> Your 2nd LOAD phdr's FileSiz matches up exactly with Segment .sbss
> Offset minus Segment .tdata Offset, i.e., none of the FileSiz
> corresponds to the (s)bss regions. (Good! At least the static linker
> part looks sane.) That said, the boundary is not page-aligned and the
> section alignment requirement is much lower than page_size, so the
> beginning of bss will share a file page with some data. Something
> should zero it at image activation.

And, so far, I've not found anything in _start or before that
does zero any "sub-DEV_BSIZE" part after FileSz for the PT_LOAD
in question.

Thanks for checking my trace of the issue. It is good to have
some confirmation that I'd not missed something.

> (Tangent: sbss/bss probably do not need to be RWE on PPC! On amd64,
> init has three LOAD segments rather than two: one for rodata (R), one
> for .text, .init, etc (RX); and one for .data (RW).)

Yea, the section header flags indicate just WA for .sbss and .bss (but
WAX for .got).

But such is more general: for example, the beginning of .rodata
(not executable) shares the tail part of a page with .fini
(executable) in the example. .got has executable code but is in
the middle of sections that do not.

For something like /sbin/init it is so small that the middle of
a page can be the only part that is executable, as in the example.
(It is not forced onto its own page.)

The form of .got used is also writable: WAX for section header
flags.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)