Date: Sat, 14 May 2022 02:57:33 +0300
From: Konstantin Belousov
To: obiwac
Cc: freebsd-current@freebsd.org
Subject: Re: rtld: Relocation from unversioned binary matches oldest version instead of "default"
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On Fri, May 13, 2022 at 01:16:39PM +0200, obiwac wrote:
> Wassup,
>
> This may not be strictly speaking a bug with rtld, but it sure is
> weird/awkward behaviour considering the existing information I could
> gather.
>
> In an unversioned shared object which references a symbol which has
> multiple versions (e.g. readdir@@FBSD_1.5 & readdir@FBSD_1.0, found
> with 'readelf -s /lib/libc.so.7 | grep readdir@'), the dynamic linker
> always selects the oldest version (so readdir@FBSD_1.0 in this case).
> But as I understand it from documents such as [1], shouldn't the
> default symbol (readdir@@FBSD_1.5) be used instead? ("Default" means
> "unhidden" in the context of rtld afaiu, i.e. a symbol where '!(versym
> & VER_NDX_HIDDEN)'.)
>
> The code in question in rtld which exhibits this behaviour is in
> 'libexec/rtld-elf/rtld.c:matched_symbol':
>
> 	/*
> 	 * If we are not called from dlsym (i.e. this is a normal
> 	 * relocation from unversioned binary, accept the symbol
> 	 * immediately if it happens to have first version after
> 	 * this shared object became versioned. Otherwise, if
> 	 * symbol is versioned and not hidden, remember it. If it
> 	 * is the only symbol with this name exported by the
> 	 * shared object, it will be returned as a match at the
> 	 * end of the function. If symbol is global (verndx < 2)
> 	 * accept it unconditionally.
> 	 */
> 	if ((req->flags & SYMLOOK_DLSYM) == 0 && verndx == VER_NDX_GIVEN) {
> 		result->sym_out = symp;
> 		return (true);
> 	}
> 	else if (verndx >= VER_NDX_GIVEN) {
> 		if ((versym & VER_NDX_HIDDEN) == 0) {
> 			if (result->vsymp == NULL) result->vsymp = symp;
> 			result->vcount++;
> 		}
> 		return (false);
> 	}
>
> I imagine the intention behind this is to not break older unversioned
> shared objects if the default symbol for a certain function it uses is
> updated while the older version is still provided, but it makes it
> such that you're forced to provide a version for your symbols in newer
> programs.
>
> This means the common method for creating shared objects for instance
> is incorrect and yields difficult to debug errors, e.g.
> in the case of
> readdir, where a new program will use the new 'dirent' structure, but
> 'readdir' will be in reality relocated to 'freebsd11_readdir', which
> assumes the use of 'freebsd11_dirent':
>
> % cc -g -fPIC -c lib.c -o lib.o
> % ld -shared lib.o -o liblib.so
> % readelf -sD liblib.so | grep readdir # shows readdir unversioned
>     4: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND readdir
>
> Simple fix on the user's end would be to force 'liblib.so' to use
> versioned symbols:
>
> % ld -shared lib.o -o liblib.so /lib/libc.so.7
> % readelf -sD liblib.so | grep readdir # shows readdir versioned
>     4: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND readdir@FBSD_1.5 (3)
>
> But that's a bit awkward I feel, and I don't see anyone suggesting to
> do such a thing.

It is not awkward, it is intended. What you do is called underlinking,
and it is controlled by --allow-shlib-undefined or a similar ld(1) switch.

It is generally considered wrong to allow underlinking, for the obvious
reason that the result is problematic. The base system turns on the ld(1)
error on unresolved symbol use in shared libraries, and I think it is a
trend in most Linux distributions as well. In other words, underlinking
is not recommended and should be avoided.

The biggest problem from underlinking for users is that they could end up
referencing non-existent symbols, or depending on the implementation
details of the library they use. For the latter, if you link against
library A, which has a DT_NEEDED entry for library B, and B provides
symbol X, then it is quite common to use X. This breaks when A stops
needing B, breaking the ABI.

Another problem in your proposal is that versioning inheritance does not
work the way you described: there is no 'latest' version of the symbol.
Imagine the following version script:

VER_A { global: X; };
VER_B { global: X; } VER_A;
VER_C { global: X; } VER_A;

What is the 'latest' X there? It is just the FreeBSD convention that we
enforce linear inheritance, but this is not true for third-party version
scripts.

And the last thing: changing the rtld behaviour there would be a very
subtle ABI break.

>
> One other bit of weirdness is that LLVM equivalents to GNU tools (e.g.
> 'llvm-objdump') don't seem to have/care about the notion of a
> "default" version:
>
> % objdump -T /lib/libc.so.7 | grep readdir
> 00000000000af200 g    DF .text  00000000000000be  FBSD_1.5    readdir_r
> 00000000000af3b0 g    DF .text  00000000000000ed (FBSD_1.0)   readdir_r
> 00000000000af1a0 g    DF .text  0000000000000053  FBSD_1.5    readdir
> 00000000000af2c0 g    DF .text  00000000000000ed (FBSD_1.0)   readdir
> % llvm-objdump -T /lib/libc.so.7 | grep readdir
> 00000000000af200 g    DF .text  00000000000000be readdir_r
> 00000000000af3b0 g    DF .text  00000000000000ed readdir_r
> 00000000000af1a0 g    DF .text  0000000000000053 readdir
> 00000000000af2c0 g    DF .text  00000000000000ed readdir
>
> This difference in functionality is very frustratingly not mentioned
> anywhere that I can find in llvm-objdump's documentation, but perhaps
> this is an indication that unversioned binaries are deprecated and
> should not be used at all going forward? I don't know, and it irks me
> quite a bit I can't find any information about this.
>
> Currently I'm using a patched version of rtld which behaves the way I
> understood it should, but I'm still asking here to clarify things,
> because this stuff has given me quite a few questions and I can't seem
> to find very many answers.
>
> Perhaps kib@ could help with this?
>
> [1]: https://people.freebsd.org/~deischen/symver/library_versioning.txt