Date: Wed, 4 Jul 2012 23:33:16 +0900
From: Taku YAMAMOTO <taku@tackymt.homeip.net>
To: freebsd-current@freebsd.org
Subject: FYI: SIGBUS with world built by clang
Message-ID: <20120704233316.70ec8654.taku@tackymt.homeip.net>
For people getting SIGBUS with a clang-built world plus gcc-built binaries:

In short, any library (and never forget about rtld-elf!) that can
potentially be called from arbitrary binaries should be compiled with
either -mstackrealign or -mstack-alignment=8!

The details are as follows.

I've observed that clang carelessly assumes the stack is aligned to a
16-byte boundary. For example, the following code:

    #include <stdarg.h>
    #include <stdio.h>

    int foo(const char *format,...)
    {
        int ret;
        va_list ap;
        FILE f = {};

        va_start(ap, format);
        ret = vfprintf(&f, format, ap);
        va_end(ap);
        return ret;
    }

turns into:

    pushl   %ebp
    movl    %esp, %ebp
    subl    $264, %esp              # imm = 0x108
    xorps   %xmm0, %xmm0
    movaps  %xmm0, -40(%ebp)
    movaps  %xmm0, -56(%ebp)
    (snip; lots of movaps insns follow)

which results in SIGBUS if the caller did not keep the stack 16-byte
aligned at the call, that is, if %esp + 4 on entry is not on a 16-byte
boundary.
(Note: movaps expects its memory operand to be aligned to 16 bytes!)

This problem becomes visible when such functions get called from binaries
compiled with other compilers which don't care about stack alignment.

If the above code is compiled by clang with -mstackrealign:

    pushl   %ebp
    movl    %esp, %ebp
    andl    $-16, %esp
    subl    $272, %esp              # imm = 0x110
    xorps   %xmm0, %xmm0
    movaps  %xmm0, 224(%esp)
    movaps  %xmm0, 208(%esp)
    (snip; lots of movaps insns follow)

it aligns the stack before allocating local variables, thus no problem.

If the above code is compiled by clang with -mstack-alignment=8:

    pushl   %ebp
    movl    %esp, %ebp
    pushl   %esi
    subl    $252, %esp
    leal    -240(%ebp), %esi
    movl    %esi, (%esp)
    movl    $232, 8(%esp)
    movl    $0, 4(%esp)
    calll   memset
    (snip)

it calls memset instead of a bunch of movaps to clear the storage, thus no
problem, of course.
# I don't know why clang doesn't utilize rep stosl, though.

Pros and cons:

-mstackrealign
    Pros:
        no function calls to memset
        probably faster because of SSE
    Cons:
        use of SSE means the need to save FP registers
        potentially more stack consumption

-mstack-alignment=#
    Pros:
        normal and predictable stack consumption
        doesn't spill SSE registers; no extra overhead on context switching
    Cons:
        depends on memset

-- 
-|-__   YAMAMOTO, Taku
 | __ <     <taku@tackymt.homeip.net>

      - A chicken is an egg's way of producing more eggs. -