Date: Sun, 7 Feb 2010 14:06:18 -0500
From: Ryan Stone <rysto32@gmail.com>
To: freebsd-ports@freebsd.org
Cc: stas@FreeBSD.org
Subject: TLS(and by extension all threading) completely broken in Valgrind on i386/amd64
Message-ID: <bc2d971002071106s53356f7p30696c9abc5f2795@mail.gmail.com>

I've been trying out valgrind on some threaded FreeBSD applications but they've been deadlocking at startup. I've identified that the root cause is that FreeBSD's thread local storage is not being emulated properly by valgrind. The problem on amd64 is obvious: valgrind gives an invalid opcode error when the program tries to execute any instruction that accesses the gs register. On i386 the problem is much more subtle.

I've attached two test applications that demonstrate the problem. In pthread_self.c, I create one thread which periodically prints pthread_self(), and then 10 seconds later I create a second thread. After the second thread is created, the first thread believes that it is the second thread. Here's an example invocation:

==883== Memcheck, a memory error detector
==883== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==883== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==883== Command: ./pthread_self
==883==
0x18c180
0x18c180
0x18c180
0x18c180
0x18c180
0x18c180
0x18c180
0x18c180
0x18c180
1st: 0x18c180
2nd: 0x18d390
0x18d390
0x18d390
0x18d390
0x18d390
0x18d390
0x18d390
0x18d390
0x18d390
0x18d390
0x18d390

Note that the first thread correctly prints that its pthread_t is 0x18c180 before the second thread is created, but after the second thread is created both threads report that they are 0x18d390! As far as I can tell, all threads use the thread local storage of the last thread created.

This completely breaks libthr's mutexes, as mutex.c demonstrates. In that test app, the main thread acquires a mutex and then creates a new thread, then it tries to unlock the mutex. The unlock fails with EPERM, which is returned by pthread_mutex_unlock when a thread tries to unlock a mutex that it does not own.

This behaviour is likely the cause of all of the "false positives" from helgrind. Helgrind is correctly noting that the libthr internals are using the same memory in different threads, because the threads think that they are touching thread-local memory.

I've found the point in the thr_new syscall wrapper where valgrind notes the TLS area, but I can't figure out how it uses the information, so I'm stuck trying to figure out why valgrind is getting this wrong. Does anyone have any ideas?

I'm not subscribed to this list so please CC me on any replies.