From: Zhenlei Huang <zlei@FreeBSD.org>
Date: Thu, 4 May 2023 18:11:06 +0800
Subject: Re: Link modules to DYN type
To: Konstantin Belousov
Cc: Hans Petter Selasky, FreeBSD CURRENT, Gleb Smirnoff
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On Apr 28, 2023, at 12:32 AM, Zhenlei Huang <zlei@FreeBSD.org> wrote:



On Apr 26, 2023, at 7:12 PM, Konstantin Belousov <kostikbel@gmail.com> wrote:

On Wed, Apr 26, 2023 at 12:55:02PM +0200, Hans Petter Selasky wrote:
On 4/26/23 12:36, Zhenlei Huang wrote:
Hi,

I've recently been working on https://reviews.freebsd.org/D39638 (sysctl(9): Enable vnet sysctl variables be loader tunable);
the changes to `sys/kern/link_elf_obj.c` are runtime tested, but not those to `sys/kern/link_elf.c`.

After some hacking I realized that `link_elf.c` is for EXEC (Executable file) or DYN (Shared object file), and `link_elf_obj.c` is
for REL (Relocatable file).

```
/* link_elf.c */
static int
link_elf_load_file(linker_class_t cls, const char* filename,
    linker_file_t* result)
{
	...
	if (hdr->e_type != ET_EXEC && hdr->e_type != ET_DYN) {
		error = ENOSYS;
		goto out;
	}
	...
}


/* link_elf_obj.c */
static int
link_elf_load_file(linker_class_t cls, const char *filename,
    linker_file_t *result)
{
	...
	if (hdr->e_type != ET_REL) {
		error = ENOSYS;
		goto out;
	}
	...
}
```

Running the following snippet:
```
# find /boot/kernel -type f -name "*.ko" -exec readelf -h {} \; | grep Type
```
shows that all the kernel modules' types are `REL (Relocatable file)`.
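
The type check that both loaders perform can be sketched in userland C. This is only an illustration under stated assumptions: `elf_e_type()` and `kld_loader_for()` are hypothetical helper names, not functions from the FreeBSD tree, and the mapping to source files simply restates the two `e_type` tests quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* e_type values from the ELF specification. */
enum { ELF_ET_REL = 1, ELF_ET_EXEC = 2, ELF_ET_DYN = 3 };

/*
 * Return e_type from the first bytes of an ELF file, or -1 if the buffer
 * is too short or is not ELF.  e_type is a 16-bit field that directly
 * follows the 16-byte e_ident array in both ELF32 and ELF64.
 */
int
elf_e_type(const unsigned char *buf, size_t len)
{
	assert(buf != NULL);
	if (len < 18 || memcmp(buf, "\177ELF", 4) != 0)
		return (-1);
	/* e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian. */
	if (buf[5] == 1)
		return (buf[16] | (buf[17] << 8));
	if (buf[5] == 2)
		return ((buf[16] << 8) | buf[17]);
	return (-1);
}

/* Hypothetical helper: name the loader source file that accepts a type. */
const char *
kld_loader_for(int e_type)
{
	switch (e_type) {
	case ELF_ET_EXEC:
	case ELF_ET_DYN:
		return ("link_elf.c");
	case ELF_ET_REL:
		return ("link_elf_obj.c");
	default:
		return ("none");
	}
}
```

Feeding the first 18 bytes of a `.ko` file to `elf_e_type()` and passing the result to `kld_loader_for()` would tell which of the two code paths a given module exercises.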

I guess that if some module such as if_bridge is linked as DYN type, then I can do runtime testing of the changes to `sys/kern/link_elf.c`.

I'm not familiar with ELF and linkers; is that (compiling a module and linking it as DYN type) possible?

Module file type (shared object vs. object file) depends on architecture.
For amd64 modules are objects, while kernel is shared library.
For arm64 (and all other arches, I believe) modules and kernels are shared
libraries.

I think you can link amd64 module as shared object, but this requires enough
hacking of the build infrastructure.  At least I am not aware of a simple
knob to switch the produced type.

I did some hacking on `sys/conf/kmod.mk` and finally produced DYN kernel modules.
The good news is that I could run some basic sysctl tests, but the bad news is that the module does not function correctly.

For example, in if_disc.c:

```
static void
vnet_disc_init(const void *unused __unused)
{
	/* Referencing V_disc_cloner immediately triggers a page fault panic */
	V_disc_cloner = if_clone_simple(discname, disc_clone_create,
	    disc_clone_destroy, 0);
}
VNET_SYSINIT(vnet_disc_init, SI_SUB_PSEUDO, SI_ORDER_ANY,
    vnet_disc_init, NULL);
```

I suspect the relocation is not done correctly for DYN ELF kmods on amd64.

My local patch to kmod.mk:

```
diff --git a/sys/conf/kmod.mk b/sys/conf/kmod.mk
index 134b150af1d9..1fc5386204a5 100644
--- a/sys/conf/kmod.mk
+++ b/sys/conf/kmod.mk
@@ -84,6 +84,7 @@ __KLD_SHARED=yes
 .else
 __KLD_SHARED=no
 .endif
+__KLD_SHARED=yes
 
 .if !empty(CFLAGS:M-O[23s]) && empty(CFLAGS:M-fno-strict-aliasing)
 CFLAGS+=       -fno-strict-aliasing
@@ -167,6 +168,7 @@ CFLAGS+=    -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer
     ${MACHINE_CPUARCH} == "powerpc"
 CFLAGS+=       -fPIC
 .endif
+CFLAGS+=       -fPIC
 
 # Temporary workaround for PR 196407, which contains the fascinating details.
 # Don't allow clang to use fpu instructions or registers in kernel modules.
```


As for https://reviews.freebsd.org/D39638, on other platforms such as arm, I think `link_elf_propagate_vnets()` should work if `parse_vnet()` works.

I'd appreciate it if someone could test it on platforms that have DYN type kernel modules.

Good news on this!

I've managed to test the change to `link_elf.c` on QEMU RISC-V. It works as expected and is solid!







Hi,

I don't have an answer for you either, but I have seen in the past that loading
kernel modules behaves a bit like libraries, in the following regard:

If two kernel modules define the same global symbol, then no warning is
given and the first loaded symbol definition (I think) is used to resolve
that symbol for all kernel modules, regardless of the prototype. Probably we
should not allow this. That's why building LINT is a good thing, to avoid
this issue.
No, the in-kernel linker does not behave this way.
Modules need to contain explicit reference to all modules they depend upon,
using the MODULE_DEPEND() macro.  Only symbols from the dependencies are
resolved.

All modules get an implicit reference to kernel.
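
The rule described above (symbols are resolved only from declared dependencies plus the kernel) can be modelled with a small userland sketch. Everything here is an assumption made for illustration: `struct toy_module`, `toy_defines()` and `toy_resolve()` are hypothetical names, the lookup order is a guess, and none of this is the in-kernel linker's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ITEMS 4

/* Toy stand-in for a loaded file; not the kernel's linker_file_t. */
struct toy_module {
	const char *name;
	const char *symbols[MAX_ITEMS];			/* NULL-terminated */
	const struct toy_module *deps[MAX_ITEMS];	/* NULL-terminated */
};

/* Return 1 if mod itself defines sym, 0 otherwise. */
int
toy_defines(const struct toy_module *mod, const char *sym)
{
	assert(mod != NULL && sym != NULL);
	for (size_t i = 0; i < MAX_ITEMS && mod->symbols[i] != NULL; i++)
		if (strcmp(mod->symbols[i], sym) == 0)
			return (1);
	return (0);
}

/*
 * Find the provider of sym as seen from mod: mod itself, then the kernel
 * (the implicit dependency), then explicitly declared dependencies only.
 */
const struct toy_module *
toy_resolve(const struct toy_module *mod, const struct toy_module *kernel,
    const char *sym)
{
	if (toy_defines(mod, sym))
		return (mod);
	if (toy_defines(kernel, sym))
		return (kernel);
	for (size_t i = 0; i < MAX_ITEMS && mod->deps[i] != NULL; i++)
		if (toy_defines(mod->deps[i], sym))
			return (mod->deps[i]);
	return (NULL);	/* symbols of unrelated modules are not visible */
}
```

In this model a module that does not list another module in `deps` never sees that module's symbols, even when they are global, which is the behaviour that MODULE_DEPEND() is said to enforce.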


Even if we don't have C++ support in the FreeBSD kernel, defining symbol
names the way C++ does for C could be nice for the kernel too, also with
regards to debugging systems.

Many times when I don't know what is going on, I do this:

#include <sys/kdb.h>

....

if (not too fast or my sysctl debug) {
	printf("My tracer\n");
	kdb_backtrace();
}

DTrace can also do this, but not during boot. Just track who is calling
those functions, and you'll probably find the answer to your question!
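
One possible reading of the "not too fast or my sysctl debug" condition, written as a userland sketch. The names `struct trace_gate` and `trace_gate_open()` are hypothetical, and the `force` flag merely stands in for a debug sysctl; this is an assumption about the intent, not code from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical state for one trace point. */
struct trace_gate {
	time_t last;	/* time of the last emitted trace; 0 = never */
	int min_gap;	/* minimum number of seconds between traces */
};

/* Return true if a trace may be emitted now; force bypasses the limit. */
bool
trace_gate_open(struct trace_gate *g, time_t now, bool force)
{
	assert(g != NULL);
	if (!force && g->last != 0 && now - g->last < g->min_gap)
		return (false);
	g->last = now;
	return (true);
}
```

In the kernel, the body guarded by such a check would be the printf() and kdb_backtrace() calls from the snippet above.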

--HPS


Best regards,
Zhenlei
