Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 10 Oct 2018 15:11:56 +0000
From:      bugzilla-noreply@freebsd.org
To:        toolchain@FreeBSD.org
Subject:   [Bug 230857] loading carp module panic i386 kernel (VIMAGE related)
Message-ID:  <bug-230857-29464-g0n14qQGMj@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-230857-29464@https.bugs.freebsd.org/bugzilla/>
References:  <bug-230857-29464@https.bugs.freebsd.org/bugzilla/>

next in thread | previous in thread | raw e-mail | index | archive | help
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D230857

Bjoern A. Zeeb <bz@FreeBSD.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dim@FreeBSD.org
           Keywords|                            |toolchain

--- Comment #5 from Bjoern A. Zeeb <bz@FreeBSD.org> ---
I'll start describing the problem from a reduced piece of code, which is no=
t as
big as carp, replicating carpstats, assuming VIMAGE is on:

--------
#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/systm.h>
#include <sys/types.h>
#include <sys/mbuf.h>
#include <sys/counter.h>
#include <net/vnet.h>

struct xstats {
        uint64_t        foo1;
        uint64_t        bar1;
        uint64_t        baz1;
        uint64_t        mad1;

        uint64_t        foo2;
        uint64_t        bar2;
        uint64_t        baz2;
        uint64_t        mad2;

        uint64_t        foo3;
        uint64_t        bar3;
        uint64_t        baz3;
        uint64_t        mad3;

        uint64_t        foo4;
        uint64_t        bar4;
        uint64_t        baz4;
        uint64_t        mad4;
};

VNET_PCPUSTAT_DEFINE(struct xstats, xstats);
VNET_PCPUSTAT_SYSINIT(xstats);
--------

This unrolls into:

 1 struct _hack;
 2 counter_u64_t   vnet_entry_xstats[sizeof(struct xstats) / sizeof(uint64_=
t)]
__attribute__((__section__("set_vnet"))) __attribute__((__used__));
 3
 4
 5 static void
 6 vnet_xstats_init(const void *unused)
 7 {
 8         do {
 9                 for (int i =3D 0; i < (sizeof((*(__typeof(vnet_entry_xst=
ats)
*) (((((__curthread())->td_vnet))->vnet_data_base) + (uintptr_t) &
vnet_entry_xstats))) / sizeof(counter_u64_t)); i++)
10                         ((*(__typeof(vnet_entry_xstats) *)
(((((__curthread())->td_vnet))->vnet_data_base) + (uintptr_t) &
vnet_entry_xstats)))[i] =3D counter_u64_alloc((0x0002));
11         } while (0);
12 }


In essance what looks so complicated is (on a per vnet base):
        void *array[16];
        for (i=3D0 ; i<16; i++)
                array[i] =3D alloc();


The above (with the vnet bits), on i386, is translated into:

00000340 <vnet_xstats_init>:
 340:   55                      push   %ebp
 341:   89 e5                   mov    %esp,%ebp
 343:   56                      push   %esi
 344:   50                      push   %eax
 345:   be c0 ff ff ff          mov    $0xffffffc0,%esi
 34a:   90                      nop
 34b:   90                      nop
 34c:   90                      nop
 34d:   90                      nop
 34e:   90                      nop
 34f:   90                      nop
 350:   c7 04 24 02 00 00 00    movl   $0x2,(%esp)
 357:   e8 fc ff ff ff          call   358 <vnet_xstats_init+0x18>
                        358: R_386_PC32 counter_u64_alloc
 35c:   64 8b 0d 00 00 00 00    mov    %fs:0x0,%ecx
 363:   8b 89 1c 03 00 00       mov    0x31c(%ecx),%ecx
 369:   8b 49 1c                mov    0x1c(%ecx),%ecx
 36c:   89 84 31 88 14 00 00    mov    %eax,0x1488(%ecx,%esi,1)
                        36f: R_386_RELATIVE     *ABS*
 373:   83 c6 04                add    $0x4,%esi
 376:   75 d8                   jne    350 <vnet_xstats_init+0x10>
 378:   83 c4 04                add    $0x4,%esp
 37b:   5e                      pop    %esi
 37c:   5d                      pop    %ebp
 37d:   c3                      ret


Now here's the problem:

__start_set_vnet is 0x1448
__stop_set_vnet is 0x1488


The problem is that the code generated goes like this:

%esi =3D -64
repeat:
%eax =3D alloc()
We do all the curthread->td_vnet->vnet_data_base in %ecx and then do
    mov    %eax,0x1488(%ecx,%esi,1)=20=20=20
    Which is:
    move the alloc() result into curthread->td_vnet->vnet_data_base + 0x148=
8 +
(1 * -64)
    Now 0x1488 - 64 gets us to the beginning of the array[] or array[0].

%esi +=3D 4
    So next iteration it'll be 0x1488 - 60 or array[1] ... and so on.
while %esi !=3D 0 goto repeat;


It's an easy way to to the for(i=3D0; i<16; i++) loop.

The problem is that 0x1488 was not relocated.

When we are going over the relocations and calling into elf_relocaddr() the
check for VIMAGE is:

      if (x >=3D ef->vnet_start && x < ef->vnet_stop) {

In our case we have an *ABS* R_386_RELATIVE of 0x1488, which is =3D=3D
ef->vnet_stop but not < ef->vnet_stop.


The real problem is that with non-simple-types the code generated with an
absolute relocation might be just outside the range.  We cannot adjust the
check as there might be a simple-type following in the next section which w=
ould
then be relocated.



For CARP this showed up because the VNET_PCPUSTAT_DEFINE() went into the VN=
ET
section the last and hence the problem showed up.  If there was any other, =
say
int V_Foo after it, we'd never have noticed.   We cannot fully control the
order in which symbols go into the section, or at least not to the extend w=
e'd
like to, so we cannot make sure there's always a char at the end.

The only solution arichardson and I agreed to would be to add 1 byte of pad=
ding
to the end of the section.


Using BYTE(1) in a linker script however would always create a set_vnet sec=
tion
in all kernel modules even if they do not have any virtualized objects.

Using a
https://sourceware.org/binutils/docs-2.31/ld/Output-Section-Discarding.html=
#Output-Section-Discarding
. =3D . + <0|sym>; should not create the section.

This leads to a multi-stage solution:

(1) add a symbol based on SIZEOF(set_vnet) which is either 0 or 1.   SIZEOF=
()
does not create a section if it does not exist.
(2) pass the *(set_vnet) through and then increment "." by the amount of the
symbol from (1); which if there is a section it will increase the section b=
y 1
byte and any non-simple-type objects resulting in absolute relocations on t=
he
boundary should also be relocated.  If there is no section . =3D . + 0; wil=
l not
create the empty section.

The problem is that the symbol from (1) relies on the section size to be kn=
own
which we don't seem to in the first pass.
The second problem is that the symbol from (1) is not immediately visible
(despite it should me; XXX sourceware reference for that)

So we really have to do multiple linker invocations with different linker
scripts.

With ld.bfd an empty set_vnet section will not show up, only the symbol from
(1) will be left behind and that's fine.
With ld.lld 6 it will also leave an empty set_vnet section behind;  emaste
points me to https://reviews.llvm.org/rL325887 where this had been under
discussion already with dim, him, and lld developers.   It's not beautiful,=
 but
should not harm either (XXX to be tested).


We need to exclude "firmware" from the dance and for now make it i386 speci=
fic.

The entire problem described above for VNET also applies to DPCPU (set_pcpu=
).

I'll attached a preliminary patch which seems to hack this together and onc=
e I
added a bit more description etc to the files, I'll put it into Phab as well
(if you have comments earlier let me know).



I hope this all kind-of makes any sense ;-)

--=20
You are receiving this mail because:
You are on the CC list for the bug.=



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?bug-230857-29464-g0n14qQGMj>