Date:      Wed, 6 Nov 2013 04:33:35 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        dt71@gmx.com
Cc:        FreeBSD Hackers <freebsd-hackers@freebsd.org>, jasone@freebsd.org
Subject:   Re: alignment of thread-local storage
Message-ID:  <20131106023335.GZ59496@kib.kiev.ua>
In-Reply-To: <52799E85.4050002@gmx.com>
References:  <52799E85.4050002@gmx.com>

On Wed, Nov 06, 2013 at 02:42:29AM +0100, dt71@gmx.com wrote:
> Starting with revision 191847 of Clang/LLVM, a bus error tends to happen in realloc() under special circumstances.
>
As I understand it, the clang revision referenced is newer than what we
have in src/contrib, right?

>
> To reproduce:
>
> (1) Compile the following program and link it with the cURL library (or see (3b)):
> 	#include <sys/types.h>
> 	#include <pwd.h>
> 	int main(void) { getpwuid(0); }
> (2) Compile libc (I use -CURRENT), but when compiling jemalloc.c, specifically use Clang, revision >=191847, and use -march=prescott (or similar) and at least -O1.
> (3) Run the program with the libc just built. The program will hopefully stop with a bus error.
> (3b) If choosing not to link with any library, then run the program through gdb(1). The program will hopefully also hit the bus error.
>
Please provide the prebuilt binaries which are necessary to reproduce the
problem, i.e. libc, the main program and any other DSO.

> In other words:
> # echo 'CPUTYPE=prescott' >> /etc/make.conf
> # echo 'CFLAGS=-g -O1' >> /etc/make.conf
> # cd /usr/src/lib/libc && make
> # cd
> # cat > x.c <<-EOF
> 	#include <sys/types.h>
> 	#include <pwd.h>
> 	int main(void) { getpwuid(0); }
> 	EOF
> # clang x.c -L /usr/local/lib -lcurl
> # env LD_LIBRARY_PATH=/usr/src/lib/libc ./a.out
>
> The last 2 command lines can also be:
> # clang x.c
> # env LD_LIBRARY_PATH=/usr/src/lib/libc gdb ./a.out
> (gdb) run
>
>
> The backtrace is:
>
> 0x281d4235 in __realloc (ptr=0x282db7d0, size=<optimized out>)
>      at jemalloc_jemalloc.c:1249
> 1249			ta->allocated += usize;
> (gdb) bt
> #0  0x281d4235 in __realloc (ptr=0x282db7d0, size=<optimized out>)
>      at jemalloc_jemalloc.c:1249
> #1  0x2826c119 in yygrowstack (data=0x282f5144) at nsparser.c:411
> #2  0x2826b7e6 in _nsyyparse () at nsparser.c:470
> #3  0x28276d34 in nss_configure () at /usr/src/lib/libc/net/nsdispatch.c:372
> #4  0x28276301 in _nsdispatch (retval=0xbfbfdbe4, disp_tab=0x282dafb4,
>      database=0x282d4982 "passwd", method_name=0x282d49b1 "getpwuid_r",
>      defaults=0x282da594) at /usr/src/lib/libc/net/nsdispatch.c:645
> #5  0x28254e9d in getpwuid_r (uid=0, pwd=0x282f4f50, buffer=0x28c0c400 "",
>      bufsize=1024, result=0xbfbfdbe4) at /usr/src/lib/libc/gen/getpwent.c:609
> #6  0x28255208 in wrap_getpwuid_r (key=..., pwd=0x282f4f50,
>      buffer=0x28c0c400 "", bufsize=1024, res=0xbfbfdbe4)
>      at /usr/src/lib/libc/gen/getpwent.c:686
> #7  0x28254fda in getpw (fn=0x282551b0 <wrap_getpwuid_r>, key=...)
>      at /usr/src/lib/libc/gen/getpwent.c:654
> #8  0x282551a3 in getpwuid (uid=0) at /usr/src/lib/libc/gen/getpwent.c:714
> #9  0x0804860a in main ()
>
>
> The current understanding (sort of) of the problem is:
>
> - The __jemalloc_thread_allocated_tls variable is updated using a processor instruction that requires alignment (paddq), that is, as of Clang/LLVM r191847. The variable is defined as:
> __thread thread_allocated_t __attribute__((tls_model("initial-exec"))) thread_allocated_tls = {0,0};
>
> - The variable turns out to be insufficiently aligned, having only 4-byte alignment.
>
> <zygoloid> the thread_allocated_tls object *should* be 16-byte aligned
> <zygoloid> is it?
> <zygoloid> (if not, that's the bug; the generated code looks correct and good)
> <zygoloid> in the IR we have @thread_allocated_tls = global ..., align 16
>
> <o11c> how is storage for TLS variables allocated in the first place?
> <o11c> compilers like to pretend that they exist magically, but they do not
>
> <zygoloid> yeah, Clang emits the TLS variable as a 16-byte aligned symbol
>
> <o11c> zygoloid: how are TLS variable actually allocated though?
> <o11c> it can't be done once at load time like for globals
> <o11c> so I'm guessing it must be malloc()ed or something
> <o11c> if so, suspect your malloc
>
>
> So, what could be the bottom line of this? That is, which one is WRONG -- FreeBSD or Clang (or both)?

I do not see anything in the jemalloc sources which indicates that
thread_allocated must be 16-byte aligned. The natural alignment for the
structure is 8 bytes.
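
To make the alignment question concrete, here is a minimal sketch that
could be used to check what alignment a TLS variable actually gets at run
time. It is not taken from jemalloc: the struct is a hypothetical stand-in
(two 64-bit counters, assumed from the "ta->allocated" line and the {0,0}
initializer quoted above), and all names in it are made up for
illustration. It assumes a GCC- or Clang-compatible compiler (__thread,
__attribute__((aligned)), C11 _Alignof).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for thread_allocated_t: two 64-bit counters. */
typedef struct {
	uint64_t allocated;
	uint64_t deallocated;
} thread_allocated_sketch_t;

/* TLS variable with no explicit alignment: only the natural one applies. */
static __thread thread_allocated_sketch_t tls_natural = {0, 0};

/* Same type, but with an explicit request for 16-byte alignment. */
static __thread thread_allocated_sketch_t tls_aligned16
    __attribute__((aligned(16))) = {0, 0};

/* Returns 1 if p is a multiple of align (which must be a power of two). */
int
is_aligned(const void *p, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (((uintptr_t)p & (align - 1)) == 0);
}

/* Alignment the compiler assigns to the type itself (ABI dependent). */
size_t
natural_alignment(void)
{
	return (_Alignof(thread_allocated_sketch_t));
}

/* Run-time checks on the calling thread's copies of the variables. */
int
tls_natural_is_aligned(size_t align)
{
	return (is_aligned(&tls_natural, align));
}

int
tls_aligned16_is_aligned(size_t align)
{
	return (is_aligned(&tls_aligned16, align));
}
```

Calling tls_natural_is_aligned(16) and tls_aligned16_is_aligned(16) from
the main thread and from a freshly created thread would show whether the
runtime honors the alignment recorded for the TLS segment. If the compiler
assumes 16 bytes while the first call returns 0, an aligned SSE access to
that variable would be expected to fault.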
