From owner-freebsd-stable@freebsd.org Fri Jul 17 04:24:47 2015 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 89D239A2ABD for ; Fri, 17 Jul 2015 04:24:47 +0000 (UTC) (envelope-from janm@transactionware.com) Received: from mail3.transactionware.com (mail.transactionware.com [203.14.245.7]) by mx1.freebsd.org (Postfix) with SMTP id ED9991AC9 for ; Fri, 17 Jul 2015 04:24:46 +0000 (UTC) (envelope-from janm@transactionware.com) Received: (qmail 3853 invoked by uid 907); 17 Jul 2015 04:24:34 -0000 Received: from Unknown (HELO jmmacpro.tmst.com.au) (203.14.245.130) (smtp-auth username janm, mechanism plain) by mail3.transactionware.com (qpsmtpd/0.84) with (ECDHE-RSA-AES256-SHA encrypted) ESMTPSA; Fri, 17 Jul 2015 14:24:34 +1000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2102\)) Subject: Re: amd64 kernel dynamic linking allows extern references to statics From: Jan Mikkelsen In-Reply-To: <20150716130148.GM2404@kib.kiev.ua> Date: Fri, 17 Jul 2015 14:24:33 +1000 Cc: FreeBSD Stable Mailing List , Karl Denninger Content-Transfer-Encoding: quoted-printable Message-Id: <5418A15D-C125-4F63-B6DC-3521C0EC3B19@transactionware.com> References: <13C52D9D-E1E6-4F83-A881-4E867C336B31@transactionware.com> <20150715132744.GD2404@kib.kiev.ua> <20150716130148.GM2404@kib.kiev.ua> To: Konstantin Belousov X-Mailer: Apple Mail (2.2102) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 17 Jul 2015 04:24:47 -0000 > On 16 Jul 2015, at 23:01, Konstantin Belousov = wrote: >=20 > On Thu, Jul 16, 2015 at 09:18:15AM +1000, Jan Mikkelsen wrote: >>=20 >>> On 15 Jul 2015, at 11:27 pm, Konstantin Belousov = wrote: >>>=20 >>> On Wed, Jul 15, 2015 at 06:17:20PM +1000, Jan Mikkelsen wrote: >>>> Hi, >>>>=20 >>>> (All on 10.2-BETA1.) >>>>=20 >>>> I noticed that the latest patch in the bug = https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D187594 = works on = amd64 but fails to load zfs.ko on i386 with a symbol not found error. >>>>=20 >>>> Looking at the patch, there is one file that has ???extern int = zio_use_uma??? to reference a variable that is declared elsewhere as = ???static int zio_use_uma???. To me this obviously should not work. = However it does work on amd64 but fails on i386. >>>>=20 >>>> Below is a small test case that reproduces the problem. The = generated kernel module loads on amd64 but fails on i386. On amd64 one = compilation unit is accessing a static in from another compilation unit = by declaring the variable ???extern???.=20 >>>>=20 >>>> I haven???t looked further to attempt to find the bug. However, it = looks like a Bad Thing??? to me. >>>>=20 >>>=20 >>> I am not sure that this is fixable. Issue is that amd64 modules are >>> relinked object files, and they might have unresolved relocations = against >>> local symbols. Change like the following probably fix your test = case, >>> but also quite possible would break legitimate local references. >>>=20 >>> diff --git a/sys/kern/link_elf_obj.c b/sys/kern/link_elf_obj.c >>> index 021381d..6fa5276 100644 >>> --- a/sys/kern/link_elf_obj.c >>> +++ b/sys/kern/link_elf_obj.c >>> @@ -1096,7 +1096,8 @@ link_elf_lookup_symbol(linker_file_t lf, const = char *name, c_linker_sym_t *sym) >>>=20 >>> for (i =3D 0, symp =3D ef->ddbsymtab; i < ef->ddbsymcnt; i++, = symp++) { >>> strp =3D ef->ddbstrtab + symp->st_name; >>> - if (symp->st_shndx !=3D SHN_UNDEF && strcmp(name, strp) = =3D=3D 0) { >>> + if (symp->st_shndx !=3D SHN_UNDEF && strcmp(name, strp) = =3D=3D 0 && >>> + ELF_ST_BIND(symp->st_info) !=3D STB_LOCAL) { >>> *sym =3D (c_linker_sym_t) symp; >>> return 0; >>> } >>=20 >> I don???t know why there could be an unresolved relocation against a = local symbol. My (admittedly trivial) tests with nm and static = functions/variables have not led to a ???U??? record. Under what = circumstances would this happen? >=20 > Assembler cannot know the final location for any non-local data, so > any variable, global or static, requires a relocation in the object = file > for the final code to work properly. Look at the most trivial example >=20 > static int a; > int > f(int x) > { > return (a + x); > } > int *g(void) > { > return (&a); > } > The g() function is needed for prevent too smart gcc optimizer from > noting that nothing could modify a and replace its uses with the = literal > 0. >=20 > The generated object file, after disassembly, is > 0000000000000000 : > 0: 89 f8 mov %edi,%eax > 2: 03 05 00 00 00 00 add 0x0(%rip),%eax # 8 = > 4: R_X86_64_PC32 .bss-0x4 > 8: c3 retq =20 > The relocation is X86_64_PC32, which takes a symbol as displacement. > In this case, compiler decided to use the segment base for the symbol, > but it is allowed to use the local symbol directly. >=20 > Only linker during the final link (not the incremental -r link) can = fill > the word which provides displacement for the rip-relative addressing = to > fetch a value. OK, I see. My expectation (obviously wrong) was that local names would = be resolved earlier and relocation would not involve a name. >> Separately to this is that this behaviour also defeats multiple = definition checking at link time. Modifying my test slightly to have one = compilation unit with ???static int testvar???, one compilation unit = with ???int testvar??? and one with ???extern int testvar??? gives nm = output like this from the .ko file: >>=20 >> 00000000000000d0 d testvar >> 00000000000000c0 d testvar >>=20 >> It is unclear which instance of testvar the ???extern??? declaration = used for resolution. Simple testing seems to show it was the one with = the non-static storage class. However, a piece of code depending on the = previous resolution to local names could break by the addition a file = with a name clash. Also, not no ???U??? record ??? the ???extern int = testvar??? declaration has been resolved at this point. >>=20 >> Making both definitions static gives this: >>=20 >> U testvar >> 00000000000000c0 d testvar >> 00000000000000d0 d testvar >>=20 >> Which testvar will I get at load time? Who knows? Adding a file with = a static variable with a name clash in another compilation unit can = change the behaviour of a module. >>=20 >> The big question for me is: under which legitimate case is this = feature necessary? The downsides seem significant. >>=20 >=20 > Well, might be that adding a check for multiple definitions of the = same > symbol into the link_elf_obj.c is reasonable, I do not have very = strong > opinion against it. But also, I do not see a way to do it in = non-O(n^2) > time, which is probably not the best thing to put into kernel. I see the problems with implementation but breaking the isolation of = static variables and potentially linking to the wrong variable seems = obviously bad, especially when done silently. Another approach would be = do something like decorate the name with a scope identifier as is done = for a static with function scope or a C++ style anonymous namespace. = That involves changing the compiler or messing around with generated = object file in the build system, each of which has its own set of = complications. However, it avoids runtime checks and changing the module = format. Regards, Jan.