Date:      Sat, 14 May 2022 02:57:33 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        obiwac <obiwac@gmail.com>
Cc:        freebsd-current@freebsd.org
Subject:   Re: rtld: Relocation from unversioned binary matches oldest version instead of "default"
Message-ID:  <Yn7wbTHqOAQ6xzZr@kib.kiev.ua>
In-Reply-To: <CAN8-kNUuHirfcSLF-BhG4sKyzWHgfxr8QTe3dAtT=jf9SohpQw@mail.gmail.com>
References:  <CAN8-kNUuHirfcSLF-BhG4sKyzWHgfxr8QTe3dAtT=jf9SohpQw@mail.gmail.com>

On Fri, May 13, 2022 at 01:16:39PM +0200, obiwac wrote:
> Wassup,
> 
> This may not be strictly speaking a bug with rtld, but it sure is
> weird/awkward behaviour considering the existing information I could
> gather.
> 
> In an unversioned shared object which references a symbol which has
> multiple versions (e.g. readdir@@FBSD_1.5 & readdir@FBSD_1.0, found
> with 'readelf -s /lib/libc.so.7 | grep readdir@'), the dynamic linker
> always selects the oldest version (so readdir@FBSD_1.0 in this case).
> But as I understand it from documents such as [1], shouldn't the
> default symbol (readdir@@FBSD_1.5) be used instead? ("Default" means
> "unhidden" in the context of rtld afaiu, i.e. a symbol where '!(versym
> & VER_NDX_HIDDEN)'.)
> 
> The code in question in rtld which exhibits this behaviour is in
> 'libexec/rtld-elf/rtld.c:matched_symbol':
> 
>     /*
>      * If we are not called from dlsym (i.e. this is a normal
>      * relocation from unversioned binary, accept the symbol
>      * immediately if it happens to have first version after
>      * this shared object became versioned. Otherwise, if
>      * symbol is versioned and not hidden, remember it. If it
>      * is the only symbol with this name exported by the
>      * shared object, it will be returned as a match at the
>      * end of the function. If symbol is global (verndx < 2)
>      * accept it unconditionally.
>      */
>     if ((req->flags & SYMLOOK_DLSYM) == 0 && verndx == VER_NDX_GIVEN) {
>         result->sym_out = symp;
>         return (true);
>     }
>     else if (verndx >= VER_NDX_GIVEN) {
>         if ((versym & VER_NDX_HIDDEN) == 0) {
>             if (result->vsymp == NULL) result->vsymp = symp;
>             result->vcount++;
>         }
>         return (false);
>     }
> 
> I imagine the intention behind this is to not break older unversioned
> shared objects if the default symbol for a certain function it uses is
> updated while the older version is still provided, but it makes it
> such that you're forced to provide a version for your symbols in newer
> programs.
> 
> This means the common method for creating shared objects for instance
> is incorrect and yields difficult to debug errors, e.g. in the case of
> readdir, where a new program will use the new 'dirent' structure, but
> 'readdir' will be in reality relocated to 'freebsd11_readdir', which
> assumes the use of 'freebsd11_dirent':
> 
>     % cc -g -fPIC -c lib.c -o lib.o
>     % ld -shared lib.o -o liblib.so
>     % readelf -sD liblib.so | grep readdir # shows readdir unversioned
>     4: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND readdir
> 
> Simple fix on the user's end would be to force 'liblib.so' to use
> versioned symbols:
> 
>     % ld -shared lib.o -o liblib.so /lib/libc.so.7
>     % readelf -sD liblib.so | grep readdir # shows readdir versioned
>      4: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND readdir@FBSD_1.5 (3)
> 
> But that's a bit awkward I feel, and I don't see anyone suggesting to
> do such a thing.
It is not awkward, it is intended.  What you are doing is called underlinking,
and it is controlled by --allow-shlib-undefined or a similar ld(1) switch.
Allowing underlinking is generally considered wrong, for the obvious
reason that the result is problematic.

The base system makes ld(1) report an error on unresolved symbols in shared
libraries, and I think this is the trend in most Linux distributions as well.
In other words, underlinking is not recommended and should be avoided.

The biggest problem with underlinking for users is that they could end
up referencing non-existent symbols, or depending on the implementation
details of the library they use. For the latter, if you link against
library A, which has a DT_NEEDED entry for library B, and B provides symbol X,
then it is quite common to use X. This breaks when A stops needing B,
which breaks the ABI.

Another problem with your proposal is that versioning inheritance
does not work the way you described: there is no 'latest' version of
a symbol.  Imagine the following version script:
VER_A {
	global: X;
};

VER_B {
	global: X;
} VER_A;

VER_C {
	global: X;
} VER_A;

What is the 'latest' X there?
It is just a FreeBSD convention that we enforce linear inheritance,
but this does not hold for third-party version scripts.
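
To illustrate the point about inheritance, here is a minimal sketch in C that
models the version script above as a parent graph and counts the versions that
nothing else inherits from.  The struct and the function name
(version_node, count_leaf_versions) are hypothetical and exist only for this
illustration; they are not part of rtld or ld(1).

```c
#include <stddef.h>
#include <string.h>

/* One node of a version script: its name and the version it inherits from. */
struct version_node {
	const char *name;
	const char *parent;	/* NULL when the node inherits from nothing */
};

/* The script from this message: VER_B and VER_C both inherit from VER_A. */
static const struct version_node script[] = {
	{ "VER_A", NULL },
	{ "VER_B", "VER_A" },
	{ "VER_C", "VER_A" },
};

/*
 * Count the versions that no other version inherits from, i.e. the
 * leaves of the inheritance graph.  (Hypothetical helper.)
 */
static size_t
count_leaf_versions(const struct version_node *nodes, size_t n)
{
	size_t leaves = 0;

	for (size_t i = 0; i < n; i++) {
		int has_child = 0;

		for (size_t j = 0; j < n; j++) {
			if (nodes[j].parent != NULL &&
			    strcmp(nodes[j].parent, nodes[i].name) == 0) {
				has_child = 1;
				break;
			}
		}
		if (!has_child)
			leaves++;
	}
	return (leaves);
}
```

Calling count_leaf_versions(script, 3) yields two leaves, VER_B and VER_C, so
neither can be singled out as 'latest' from the inheritance alone.  With a
linear chain there would be exactly one leaf.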

And the last thing: changing the rtld behaviour there would be a very
subtle ABI break.
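
To make the rule at stake concrete, here is a simplified model in C of the
matched_symbol branch quoted above, for a request that carries no version.
It is a sketch only: it ignores name comparison, hash chains and error
handling, the function name pick_unversioned is hypothetical, and the
constant values (VER_NDX_GIVEN as 2, the hidden flag as bit 15) are my
assumption of what rtld uses.  The version index 7 in the usage note is
made up as well.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror the rtld constants; treat the values as an assumption. */
#define VER_NDX_GIVEN	2
#define VER_NDX_HIDDEN	0x8000u
#define VER_NDX(x)	((x) & ~VER_NDX_HIDDEN)

/*
 * Given the versym entries of all same-named symbols in one object,
 * return the index of the symbol an unversioned request binds to,
 * or -1 if there is no match.  (Hypothetical helper.)
 */
static int
pick_unversioned(const uint16_t *versyms, size_t n, bool from_dlsym)
{
	int remembered = -1;
	int vcount = 0;

	for (size_t i = 0; i < n; i++) {
		unsigned verndx = VER_NDX(versyms[i]);

		/* Normal relocation: the first version wins immediately. */
		if (!from_dlsym && verndx == VER_NDX_GIVEN)
			return ((int)i);
		if (verndx >= VER_NDX_GIVEN) {
			/* Remember non-hidden (default) versioned symbols. */
			if ((versyms[i] & VER_NDX_HIDDEN) == 0) {
				if (remembered == -1)
					remembered = (int)i;
				vcount++;
			}
			continue;
		}
		/* Local/global (verndx < 2): accept unconditionally. */
		return ((int)i);
	}
	/* A remembered symbol matches only if it was the only candidate. */
	return (vcount == 1 ? remembered : -1);
}
```

With versyms = { 7, 0x8002 } (a default symbol at a later version, then a
hidden symbol at the first version), a normal relocation picks index 1, while
the same lookup with from_dlsym set picks index 0.  Existing unversioned
binaries rely on the former result, which is why changing it would alter
their bindings.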

> 
> One other bit of weirdness is that LLVM equivalents to GNU tools (e.g.
> 'llvm-objdump') don't seem to have/care about the notion of a
> "default" version:
> 
>     % objdump -T /lib/libc.so.7 | grep readdir
>     00000000000af200 g    DF .text    00000000000000be  FBSD_1.5    readdir_r
>     00000000000af3b0 g    DF .text    00000000000000ed (FBSD_1.0)   readdir_r
>     00000000000af1a0 g    DF .text    0000000000000053  FBSD_1.5    readdir
>     00000000000af2c0 g    DF .text    00000000000000ed (FBSD_1.0)   readdir
>     % llvm-objdump -T /lib/libc.so.7 | grep readdir
>     00000000000af200 g    DF .text    00000000000000be readdir_r
>     00000000000af3b0 g    DF .text    00000000000000ed     readdir_r
>     00000000000af1a0 g    DF .text    0000000000000053 readdir
>     00000000000af2c0 g    DF .text    00000000000000ed readdir
> 
> This difference in functionality is very frustratingly not mentioned
> anywhere that I can find in llvm-objdump's documentation, but perhaps
> this is an indication that unversioned binaries are deprecated and
> should not be used at all going forward? I don't know, and it irks me
> quite a bit I can't find any information about this.
> 
> Currently I'm using a patched version of rtld which behaves the way I
> understood it should, but I'm still asking here to clarify things,
> because this stuff has given me quite a few questions and I can't seem
> to find very many answers.
> 
> Perhaps kib@ could help with this?
> 
> [1]: https://people.freebsd.org/~deischen/symver/library_versioning.txt


