Message-Id: <202404151955.43FJtMnU083779@critter.freebsd.dk>
To: Warner Losh
cc: Shawn Webb, Jamie Landeg-Jones, FreeBSD Hackers
Subject: Re: Question regarding crunchgen(1) binaries
From: "Poul-Henning Kamp"
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers
Date: Mon, 15 Apr 2024 19:55:22 +0000

Warner Losh writes:

> Maybe start there to understand what "LTO" the security thing is doing and
> why it's either wrong or violates an assumption in crunchgen that can be
> fixed.

Crunch binaries were invented 30 years ago, to make the FreeBSD
installation program fit on a single floppy disk.

Note that the goal was saving disk-space rather than RAM.

The "architecture" of crunchgen is to take a lot of programs, rename
their main() and link them all together with a new main() which
dispatches to the right program's main() based on argv[0].

Statistically you save half a disk-allocation unit for each program,
which was nothing to sneeze at, but the real disk-space dividend
comes from linking the resulting combi-program statically.

Because it is linked statically, only those .o files which are
referenced get pulled in from the libraries: libm::j0.o only gets
pulled in if you use Bessel functions, which, contrary to rumours,
sysinstall did not.

(The goal of shared libraries is saving RAM: Everybody gets the
complete library, but only one copy of its code ever gets loaded.)

But the real trick is actually not crunchgen, which was originally
just a shell script, but rather crunchide(1).

Crunchide(1) does unnatural acts to an object file's symbol table,
to get around the fact that all the programs have a function called
"main" and that they litter the global symbol namespace with their
private inter-file references.
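As a rough illustration of the dispatching main() described above, here is a minimal sketch in C. It is not the code crunchgen generates: the table layout, the name crunch_dispatch(), the placeholder return values, and the exit status 127 are all assumptions made for the example. Only the "cp_main" naming pattern and the argv[0] lookup come from the description.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Stand-ins for the renamed main() functions of two programs.
 * The bodies and return values are placeholders for this sketch.
 */
static int cp_main(int argc, char **argv) { (void)argc; (void)argv; return 10; }
static int ls_main(int argc, char **argv) { (void)argc; (void)argv; return 20; }

/* Hypothetical dispatch table: program name -> renamed entry point. */
struct crunched_prog {
    const char *name;
    int (*entry)(int, char **);
};

static const struct crunched_prog progs[] = {
    { "cp", cp_main },
    { "ls", ls_main },
};

/*
 * In a real crunched binary this function would be main().
 * It is named crunch_dispatch() here so it can be called directly.
 */
int crunch_dispatch(int argc, char **argv)
{
    const char *slash, *name;
    size_t i;

    if (argc < 1 || argv[0] == NULL)
        return 127;

    /* Use only the last path component of argv[0]. */
    slash = strrchr(argv[0], '/');
    name = (slash != NULL) ? slash + 1 : argv[0];

    for (i = 0; i < sizeof(progs) / sizeof(progs[0]); i++) {
        if (strcmp(name, progs[i].name) == 0) {
            assert(progs[i].entry != NULL);  /* table invariant */
            return progs[i].entry(argc, argv);
        }
    }

    fprintf(stderr, "%s: not part of this crunched binary\n", name);
    return 127;
}
```

With something along these lines, each program name is typically installed as a hard link to the one combined binary, so that running it as "cp" or "/bin/cp" ends up in cp_main().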
To make a crunched binary, the .o files for the individual programs
are first "pre-linked" without libraries so that internal inter-file
references are resolved.

Then crunchide changes all global symbols, except "main", to be
local symbols, so that they become unavailable for symbol resolution
in the final run of the linker. The "main" symbol is also renamed
to a per-program name, something like "cp_main" for cp(1) etc.

And then all the pre-linked .o files, one per program, get linked
together with the "dispatch main" and this time with libraries.

I see no reason why crunchgen cannot be done with Link Time
Optimization, but somebody has to write the new crunchide(1), and
I suspect it will have a tougher row to hoe, because pre-linking
cannot be used to take care of the inter-program symbols.

As I understand it LTO can also link with "normal libraries", so one
option might be to only LTO the final linking step of the crunch
process, treating all the programs as "normal libraries", but still
getting LTO advantage internally in the libraries.

Poul-Henning

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.