Date: Fri, 17 Jul 2015 14:24:33 +1000 From: Jan Mikkelsen <janm@transactionware.com> To: Konstantin Belousov <kostikbel@gmail.com> Cc: FreeBSD Stable Mailing List <freebsd-stable@freebsd.org>, Karl Denninger <karl@denninger.net> Subject: Re: amd64 kernel dynamic linking allows extern references to statics Message-ID: <5418A15D-C125-4F63-B6DC-3521C0EC3B19@transactionware.com> In-Reply-To: <20150716130148.GM2404@kib.kiev.ua> References: <13C52D9D-E1E6-4F83-A881-4E867C336B31@transactionware.com> <20150715132744.GD2404@kib.kiev.ua> <C3196483-BFE4-4F65-A685-0FCF23693391@transactionware.com> <20150716130148.GM2404@kib.kiev.ua>
next in thread | previous in thread | raw e-mail | index | archive | help
> On 16 Jul 2015, at 23:01, Konstantin Belousov <kostikbel@gmail.com> = wrote: >=20 > On Thu, Jul 16, 2015 at 09:18:15AM +1000, Jan Mikkelsen wrote: >>=20 >>> On 15 Jul 2015, at 11:27 pm, Konstantin Belousov = <kostikbel@gmail.com> wrote: >>>=20 >>> On Wed, Jul 15, 2015 at 06:17:20PM +1000, Jan Mikkelsen wrote: >>>> Hi, >>>>=20 >>>> (All on 10.2-BETA1.) >>>>=20 >>>> I noticed that the latest patch in the bug = https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D187594 = <https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D187594> works on = amd64 but fails to load zfs.ko on i386 with a symbol not found error. >>>>=20 >>>> Looking at the patch, there is one file that has ???extern int = zio_use_uma??? to reference a variable that is declared elsewhere as = ???static int zio_use_uma???. To me this obviously should not work. = However it does work on amd64 but fails on i386. >>>>=20 >>>> Below is a small test case that reproduces the problem. The = generated kernel module loads on amd64 but fails on i386. On amd64 one = compilation unit is accessing a static in from another compilation unit = by declaring the variable ???extern???.=20 >>>>=20 >>>> I haven???t looked further to attempt to find the bug. However, it = looks like a Bad Thing??? to me. >>>>=20 >>>=20 >>> I am not sure that this is fixable. Issue is that amd64 modules are >>> relinked object files, and they might have unresolved relocations = against >>> local symbols. Change like the following probably fix your test = case, >>> but also quite possible would break legitimate local references. >>>=20 >>> diff --git a/sys/kern/link_elf_obj.c b/sys/kern/link_elf_obj.c >>> index 021381d..6fa5276 100644 >>> --- a/sys/kern/link_elf_obj.c >>> +++ b/sys/kern/link_elf_obj.c >>> @@ -1096,7 +1096,8 @@ link_elf_lookup_symbol(linker_file_t lf, const = char *name, c_linker_sym_t *sym) >>>=20 >>> for (i =3D 0, symp =3D ef->ddbsymtab; i < ef->ddbsymcnt; i++, = symp++) { >>> strp =3D ef->ddbstrtab + symp->st_name; >>> - if (symp->st_shndx !=3D SHN_UNDEF && strcmp(name, strp) = =3D=3D 0) { >>> + if (symp->st_shndx !=3D SHN_UNDEF && strcmp(name, strp) = =3D=3D 0 && >>> + ELF_ST_BIND(symp->st_info) !=3D STB_LOCAL) { >>> *sym =3D (c_linker_sym_t) symp; >>> return 0; >>> } >>=20 >> I don???t know why there could be an unresolved relocation against a = local symbol. My (admittedly trivial) tests with nm and static = functions/variables have not led to a ???U??? record. Under what = circumstances would this happen? >=20 > Assembler cannot know the final location for any non-local data, so > any variable, global or static, requires a relocation in the object = file > for the final code to work properly. Look at the most trivial example >=20 > static int a; > int > f(int x) > { > return (a + x); > } > int *g(void) > { > return (&a); > } > The g() function is needed for prevent too smart gcc optimizer from > noting that nothing could modify a and replace its uses with the = literal > 0. >=20 > The generated object file, after disassembly, is > 0000000000000000 <f>: > 0: 89 f8 mov %edi,%eax > 2: 03 05 00 00 00 00 add 0x0(%rip),%eax # 8 = <f+0x8> > 4: R_X86_64_PC32 .bss-0x4 > 8: c3 retq =20 > The relocation is X86_64_PC32, which takes a symbol as displacement. > In this case, compiler decided to use the segment base for the symbol, > but it is allowed to use the local symbol directly. >=20 > Only linker during the final link (not the incremental -r link) can = fill > the word which provides displacement for the rip-relative addressing = to > fetch a value. OK, I see. My expectation (obviously wrong) was that local names would = be resolved earlier and relocation would not involve a name. >> Separately to this is that this behaviour also defeats multiple = definition checking at link time. Modifying my test slightly to have one = compilation unit with ???static int testvar???, one compilation unit = with ???int testvar??? and one with ???extern int testvar??? gives nm = output like this from the .ko file: >>=20 >> 00000000000000d0 d testvar >> 00000000000000c0 d testvar >>=20 >> It is unclear which instance of testvar the ???extern??? declaration = used for resolution. Simple testing seems to show it was the one with = the non-static storage class. However, a piece of code depending on the = previous resolution to local names could break by the addition a file = with a name clash. Also, not no ???U??? record ??? the ???extern int = testvar??? declaration has been resolved at this point. >>=20 >> Making both definitions static gives this: >>=20 >> U testvar >> 00000000000000c0 d testvar >> 00000000000000d0 d testvar >>=20 >> Which testvar will I get at load time? Who knows? Adding a file with = a static variable with a name clash in another compilation unit can = change the behaviour of a module. >>=20 >> The big question for me is: under which legitimate case is this = feature necessary? The downsides seem significant. >>=20 >=20 > Well, might be that adding a check for multiple definitions of the = same > symbol into the link_elf_obj.c is reasonable, I do not have very = strong > opinion against it. But also, I do not see a way to do it in = non-O(n^2) > time, which is probably not the best thing to put into kernel. I see the problems with implementation but breaking the isolation of = static variables and potentially linking to the wrong variable seems = obviously bad, especially when done silently. Another approach would be = do something like decorate the name with a scope identifier as is done = for a static with function scope or a C++ style anonymous namespace. = That involves changing the compiler or messing around with generated = object file in the build system, each of which has its own set of = complications. However, it avoids runtime checks and changing the module = format. Regards, Jan.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?5418A15D-C125-4F63-B6DC-3521C0EC3B19>