Date: Wed, 10 Oct 2018 15:11:56 +0000 From: bugzilla-noreply@freebsd.org To: toolchain@FreeBSD.org Subject: [Bug 230857] loading carp module panic i386 kernel (VIMAGE related) Message-ID: <bug-230857-29464-g0n14qQGMj@https.bugs.freebsd.org/bugzilla/> In-Reply-To: <bug-230857-29464@https.bugs.freebsd.org/bugzilla/> References: <bug-230857-29464@https.bugs.freebsd.org/bugzilla/>
next in thread | previous in thread | raw e-mail | index | archive | help
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D230857 Bjoern A. Zeeb <bz@FreeBSD.org> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |dim@FreeBSD.org Keywords| |toolchain --- Comment #5 from Bjoern A. Zeeb <bz@FreeBSD.org> --- I'll start describing the problem from a reduced piece of code, which is no= t as big as carp, replicating carpstats, assuming VIMAGE is on: -------- #include <sys/param.h> #include <sys/kernel.h> #include <sys/systm.h> #include <sys/types.h> #include <sys/mbuf.h> #include <sys/counter.h> #include <net/vnet.h> struct xstats { uint64_t foo1; uint64_t bar1; uint64_t baz1; uint64_t mad1; uint64_t foo2; uint64_t bar2; uint64_t baz2; uint64_t mad2; uint64_t foo3; uint64_t bar3; uint64_t baz3; uint64_t mad3; uint64_t foo4; uint64_t bar4; uint64_t baz4; uint64_t mad4; }; VNET_PCPUSTAT_DEFINE(struct xstats, xstats); VNET_PCPUSTAT_SYSINIT(xstats); -------- This unrolls into: 1 struct _hack; 2 counter_u64_t vnet_entry_xstats[sizeof(struct xstats) / sizeof(uint64_= t)] __attribute__((__section__("set_vnet"))) __attribute__((__used__)); 3 4 5 static void 6 vnet_xstats_init(const void *unused) 7 { 8 do { 9 for (int i =3D 0; i < (sizeof((*(__typeof(vnet_entry_xst= ats) *) (((((__curthread())->td_vnet))->vnet_data_base) + (uintptr_t) & vnet_entry_xstats))) / sizeof(counter_u64_t)); i++) 10 ((*(__typeof(vnet_entry_xstats) *) (((((__curthread())->td_vnet))->vnet_data_base) + (uintptr_t) & vnet_entry_xstats)))[i] =3D counter_u64_alloc((0x0002)); 11 } while (0); 12 } In essance what looks so complicated is (on a per vnet base): void *array[16]; for (i=3D0 ; i<16; i++) array[i] =3D alloc(); The above (with the vnet bits), on i386, is translated into: 00000340 <vnet_xstats_init>: 340: 55 push %ebp 341: 89 e5 mov %esp,%ebp 343: 56 push %esi 344: 50 push %eax 345: be c0 ff ff ff mov $0xffffffc0,%esi 34a: 90 nop 34b: 90 nop 34c: 90 nop 34d: 90 nop 34e: 90 nop 34f: 90 nop 350: c7 04 24 02 00 00 00 movl $0x2,(%esp) 357: e8 fc ff ff ff call 358 <vnet_xstats_init+0x18> 358: R_386_PC32 counter_u64_alloc 35c: 64 8b 0d 00 00 00 00 mov %fs:0x0,%ecx 363: 8b 89 1c 03 00 00 mov 0x31c(%ecx),%ecx 369: 8b 49 1c mov 0x1c(%ecx),%ecx 36c: 89 84 31 88 14 00 00 mov %eax,0x1488(%ecx,%esi,1) 36f: R_386_RELATIVE *ABS* 373: 83 c6 04 add $0x4,%esi 376: 75 d8 jne 350 <vnet_xstats_init+0x10> 378: 83 c4 04 add $0x4,%esp 37b: 5e pop %esi 37c: 5d pop %ebp 37d: c3 ret Now here's the problem: __start_set_vnet is 0x1448 __stop_set_vnet is 0x1488 The problem is that the code generated goes like this: %esi =3D -64 repeat: %eax =3D alloc() We do all the curthread->td_vnet->vnet_data_base in %ecx and then do mov %eax,0x1488(%ecx,%esi,1)=20=20=20 Which is: move the alloc() result into curthread->td_vnet->vnet_data_base + 0x148= 8 + (1 * -64) Now 0x1488 - 64 gets us to the beginning of the array[] or array[0]. %esi +=3D 4 So next iteration it'll be 0x1488 - 60 or array[1] ... and so on. while %esi !=3D 0 goto repeat; It's an easy way to to the for(i=3D0; i<16; i++) loop. The problem is that 0x1488 was not relocated. When we are going over the relocations and calling into elf_relocaddr() the check for VIMAGE is: if (x >=3D ef->vnet_start && x < ef->vnet_stop) { In our case we have an *ABS* R_386_RELATIVE of 0x1488, which is =3D=3D ef->vnet_stop but not < ef->vnet_stop. The real problem is that with non-simple-types the code generated with an absolute relocation might be just outside the range. We cannot adjust the check as there might be a simple-type following in the next section which w= ould then be relocated. For CARP this showed up because the VNET_PCPUSTAT_DEFINE() went into the VN= ET section the last and hence the problem showed up. If there was any other, = say int V_Foo after it, we'd never have noticed. We cannot fully control the order in which symbols go into the section, or at least not to the extend w= e'd like to, so we cannot make sure there's always a char at the end. The only solution arichardson and I agreed to would be to add 1 byte of pad= ding to the end of the section. Using BYTE(1) in a linker script however would always create a set_vnet sec= tion in all kernel modules even if they do not have any virtualized objects. Using a https://sourceware.org/binutils/docs-2.31/ld/Output-Section-Discarding.html= #Output-Section-Discarding . =3D . + <0|sym>; should not create the section. This leads to a multi-stage solution: (1) add a symbol based on SIZEOF(set_vnet) which is either 0 or 1. SIZEOF= () does not create a section if it does not exist. (2) pass the *(set_vnet) through and then increment "." by the amount of the symbol from (1); which if there is a section it will increase the section b= y 1 byte and any non-simple-type objects resulting in absolute relocations on t= he boundary should also be relocated. If there is no section . =3D . + 0; wil= l not create the empty section. The problem is that the symbol from (1) relies on the section size to be kn= own which we don't seem to in the first pass. The second problem is that the symbol from (1) is not immediately visible (despite it should me; XXX sourceware reference for that) So we really have to do multiple linker invocations with different linker scripts. With ld.bfd an empty set_vnet section will not show up, only the symbol from (1) will be left behind and that's fine. With ld.lld 6 it will also leave an empty set_vnet section behind; emaste points me to https://reviews.llvm.org/rL325887 where this had been under discussion already with dim, him, and lld developers. It's not beautiful,= but should not harm either (XXX to be tested). We need to exclude "firmware" from the dance and for now make it i386 speci= fic. The entire problem described above for VNET also applies to DPCPU (set_pcpu= ). I'll attached a preliminary patch which seems to hack this together and onc= e I added a bit more description etc to the files, I'll put it into Phab as well (if you have comments earlier let me know). I hope this all kind-of makes any sense ;-) --=20 You are receiving this mail because: You are on the CC list for the bug.=
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?bug-230857-29464-g0n14qQGMj>