Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 17 Jul 2015 14:24:33 +1000
From:      Jan Mikkelsen <janm@transactionware.com>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        FreeBSD Stable Mailing List <freebsd-stable@freebsd.org>, Karl Denninger <karl@denninger.net>
Subject:   Re: amd64 kernel dynamic linking allows extern references to statics
Message-ID:  <5418A15D-C125-4F63-B6DC-3521C0EC3B19@transactionware.com>
In-Reply-To: <20150716130148.GM2404@kib.kiev.ua>
References:  <13C52D9D-E1E6-4F83-A881-4E867C336B31@transactionware.com> <20150715132744.GD2404@kib.kiev.ua> <C3196483-BFE4-4F65-A685-0FCF23693391@transactionware.com> <20150716130148.GM2404@kib.kiev.ua>

next in thread | previous in thread | raw e-mail | index | archive | help

> On 16 Jul 2015, at 23:01, Konstantin Belousov <kostikbel@gmail.com> =
wrote:
>=20
> On Thu, Jul 16, 2015 at 09:18:15AM +1000, Jan Mikkelsen wrote:
>>=20
>>> On 15 Jul 2015, at 11:27 pm, Konstantin Belousov =
<kostikbel@gmail.com> wrote:
>>>=20
>>> On Wed, Jul 15, 2015 at 06:17:20PM +1000, Jan Mikkelsen wrote:
>>>> Hi,
>>>>=20
>>>> (All on 10.2-BETA1.)
>>>>=20
>>>> I noticed that the latest patch in the bug =
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D187594 =
<https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D187594>; works on =
amd64 but fails to load zfs.ko on i386 with a symbol not found error.
>>>>=20
>>>> Looking at the patch, there is one file that has ???extern int =
zio_use_uma??? to reference a variable that is declared elsewhere as =
???static int zio_use_uma???. To me this obviously should not work. =
However it does work on amd64 but fails on i386.
>>>>=20
>>>> Below is a small test case that reproduces the problem. The =
generated kernel module loads on amd64 but fails on i386. On amd64 one =
compilation unit is accessing a static in from another compilation unit =
by declaring the variable ???extern???.=20
>>>>=20
>>>> I haven???t looked further to attempt to find the bug. However, it =
looks like a Bad Thing??? to me.
>>>>=20
>>>=20
>>> I am not sure that this is fixable.  Issue is that amd64 modules are
>>> relinked object files, and they might have unresolved relocations =
against
>>> local symbols.  Change like the following probably fix your test =
case,
>>> but also quite possible would break legitimate local references.
>>>=20
>>> diff --git a/sys/kern/link_elf_obj.c b/sys/kern/link_elf_obj.c
>>> index 021381d..6fa5276 100644
>>> --- a/sys/kern/link_elf_obj.c
>>> +++ b/sys/kern/link_elf_obj.c
>>> @@ -1096,7 +1096,8 @@ link_elf_lookup_symbol(linker_file_t lf, const =
char *name, c_linker_sym_t *sym)
>>>=20
>>> 	for (i =3D 0, symp =3D ef->ddbsymtab; i < ef->ddbsymcnt; i++, =
symp++) {
>>> 		strp =3D ef->ddbstrtab + symp->st_name;
>>> -		if (symp->st_shndx !=3D SHN_UNDEF && strcmp(name, strp) =
=3D=3D 0) {
>>> +		if (symp->st_shndx !=3D SHN_UNDEF && strcmp(name, strp) =
=3D=3D 0 &&
>>> +		    ELF_ST_BIND(symp->st_info) !=3D STB_LOCAL) {
>>> 			*sym =3D (c_linker_sym_t) symp;
>>> 			return 0;
>>> 		}
>>=20
>> I don???t know why there could be an unresolved relocation against a =
local symbol. My (admittedly trivial) tests with nm and static =
functions/variables have not led to a ???U??? record. Under what =
circumstances would this happen?
>=20
> Assembler cannot know the final location for any non-local data, so
> any variable, global or static, requires a relocation in the object =
file
> for the final code to work properly.  Look at the most trivial example
>=20
> static int a;
> int
> f(int x)
> {
> 	return (a + x);
> }
> int *g(void)
> {
> 	return (&a);
> }
> The g() function is needed for prevent too smart gcc optimizer from
> noting that nothing could modify a and replace its uses with the =
literal
> 0.
>=20
> The generated object file, after disassembly, is
> 0000000000000000 <f>:
>   0:   89 f8                   mov    %edi,%eax
>   2:   03 05 00 00 00 00       add    0x0(%rip),%eax        # 8 =
<f+0x8>
>                        4: R_X86_64_PC32        .bss-0x4
>   8:   c3                      retq  =20
> The relocation is X86_64_PC32, which takes a symbol as displacement.
> In this case, compiler decided to use the segment base for the symbol,
> but it is allowed to use the local symbol directly.
>=20
> Only linker during the final link (not the incremental -r link) can =
fill
> the word which provides displacement for the rip-relative addressing =
to
> fetch a value.


OK, I see. My expectation (obviously wrong) was that local names would =
be resolved earlier and relocation would not involve a name.

>> Separately to this is that this behaviour also defeats multiple =
definition checking at link time. Modifying my test slightly to have one =
compilation unit with ???static int testvar???, one compilation unit =
with ???int testvar??? and one with ???extern int testvar??? gives nm =
output like this from the .ko file:
>>=20
>> 00000000000000d0 d testvar
>> 00000000000000c0 d testvar
>>=20
>> It is unclear which instance of testvar the ???extern??? declaration =
used for resolution. Simple testing seems to show it was the one with =
the non-static storage class. However, a piece of code depending on the =
previous resolution to local names could break by the addition a file =
with a name clash. Also, not no ???U??? record ??? the ???extern int =
testvar??? declaration has been resolved at this point.
>>=20
>> Making both definitions static gives this:
>>=20
>>                 U testvar
>> 00000000000000c0 d testvar
>> 00000000000000d0 d testvar
>>=20
>> Which testvar will I get at load time? Who knows? Adding a file with =
a static variable with a name clash in another compilation unit can =
change the behaviour of a module.
>>=20
>> The big question for me is: under which legitimate case is this =
feature necessary? The downsides seem significant.
>>=20
>=20
> Well, might be that adding a check for multiple definitions of the =
same
> symbol into the link_elf_obj.c is reasonable, I do not have very =
strong
> opinion against it. But also, I do not see a way to do it in =
non-O(n^2)
> time, which is probably not the best thing to put into kernel.


I see the problems with implementation but breaking the isolation of =
static variables and potentially linking to the wrong variable seems =
obviously bad, especially when done silently. Another approach would be =
do something like decorate the name with a scope identifier as is done =
for a static with function scope or a C++ style anonymous namespace. =
That involves changing the compiler or messing around with generated =
object file in the build system, each of which has its own set of =
complications. However, it avoids runtime checks and changing the module =
format.

Regards,

Jan.








Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?5418A15D-C125-4F63-B6DC-3521C0EC3B19>