Date: Thu, 16 Jun 2011 03:31:38 -0400
From: Nathaniel W Filardo <nwf@cs.jhu.edu>
To: freebsd-current@freebsd.org, freebsd-sparc64@freebsd.org
Subject: TLS bug?
Message-ID: <20110616073138.GL31996@gradx.cs.jhu.edu>

I have a few applications (specifically bonnie++ and mysql, both from
ports) that trip the following assertion in
lib/libc/stdlib/malloc.c:/^_malloc_thread_cleanup:
> assert(tcache != (void *)(uintptr_t)1);
I have patched malloc.c thus:
> --- a/lib/libc/stdlib/malloc.c
> +++ b/lib/libc/stdlib/malloc.c
> @@ -1108,7 +1108,7 @@ static __thread arena_t *arenas_map TLS_MODEL;
>
> #ifdef MALLOC_TCACHE
> /* Map of thread-specific caches. */
> -static __thread tcache_t *tcache_tls TLS_MODEL;
> +__thread tcache_t *tcache_tls TLS_MODEL;
>
> /*
> * Number of cache slots for each bin in the thread cache, or 0 if tcache is
> @@ -6184,10 +6184,17 @@ _malloc_thread_cleanup(void)
> #ifdef MALLOC_TCACHE
> tcache_t *tcache = tcache_tls;
>
> + fprintf(stderr, "_m_t_c for %d:%lu with %p\n",
> + getpid(),
> + (unsigned long) _pthread_self(),
> + tcache);
> +
> if (tcache != NULL) {
> - assert(tcache != (void *)(uintptr_t)1);
> - tcache_destroy(tcache);
> - tcache_tls = (void *)(uintptr_t)1;
> + /* assert(tcache != (void *)(uintptr_t)1); */
> + if((uintptr_t)tcache != (uintptr_t)1) {
> + tcache_destroy(tcache);
> + tcache_tls = (void *)(uintptr_t)1;
> + }
and libthr/thread/thr_create.c thus:
> --- a/lib/libthr/thread/thr_create.c
> +++ b/lib/libthr/thread/thr_create.c
> @@ -243,6 +243,8 @@ create_stack(struct pthread_attr *pattr)
> return (ret);
> }
>
> +extern __thread void *tcache_tls;
> +
> static void
> thread_start(struct pthread *curthread)
> {
> @@ -280,6 +282,11 @@ thread_start(struct pthread *curthread)
> curthread->attr.stacksize_attr;
> #endif
>
> + fprintf(stderr, "t_s for %d:%lu with %p\n",
> + getpid(),
> + (unsigned long) _pthread_self(),
> + tcache_tls);
> +
> /* Run the current thread's start routine with argument: */
> _pthread_exit(curthread->start_routine(curthread->arg));
>
to attempt to debug this issue. With those changes in place, bonnie++'s
execution looks like this:
>[...]
> Writing a byte at a time...done
> Writing intelligently...done
> Rewriting...done
> Reading a byte at a time...done
> Reading intelligently...done
> t_s for 79654:1086343168 with 0x0
> t_s for 79654:1086345216 with 0x0
> t_s for 79654:1086346240 with 0x0
> t_s for 79654:1086347264 with 0x0
> t_s for 79654:1086344192 with 0x0
> start 'em...done...done...done...done..._m_t_c for 79654:1086344192 with 0x41404400
> _m_t_c for 79654:1086346240 with 0x40d2c400
> _m_t_c for 79654:1086343168 with 0x41404200
> _m_t_c for 79654:1086345216 with 0x41804200
> done...
> _m_t_c for 79654:1086347264 with 0x41004200
> Create files in sequential order...done.
> Stat files in sequential order...done.
> Delete files in sequential order...done.
> Create files in random order...done.
> Stat files in random order...done.
> Delete files in random order...done.
> 1.96,1.96,hydra.priv.oc.ietfng.org,1,1308217772,10M,,7,81,2644,7,3577,14,34,93,+++++,+++,773.7,61,16,,,,,2325,74,13016,99,2342,86,3019,91,11888,99,2184,89,16397ms,1237ms,671ms,2009ms,177us,1305ms,489ms,1029us,270ms,140ms,53730us,250ms
> Writing a byte at a time...done
> Writing intelligently...done
> Rewriting...done
> Reading a byte at a time...done
> Reading intelligently...done
> t_s for 79654:1086343168 with 0x1
> t_s for 79654:1086346240 with 0x1
> t_s for 79654:1086345216 with 0x1
> t_s for 79654:1086347264 with 0x1
> t_s for 79654:1086344192 with 0x1
> start 'em...done...done...done...done...done...
> _m_t_c for 79654:1086347264 with 0x1
> _m_t_c for 79654:1086344192 with 0x1
> _m_t_c for 79654:1086343168 with 0x1
>[...]
So it seems that the TLS area eventually gets set up incorrectly: in the
second batch of threads, tcache_tls starts out as 1 rather than 0. Since 1
is the value _malloc_thread_cleanup stores after destroying a tcache, no
tcache is ever allocated for those threads, and when they get around to
exiting, the assert trips.
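One way to check this hypothesis without involving malloc at all might be a small standalone test along the following lines. This is only a sketch: fake_tcache_tls, TCACHE_DESTROYED, worker and count_dirty_tls are hypothetical names invented for illustration, not anything in libc or libthr. Each thread records whether its TLS slot was non-NULL at startup, then stores the same sentinel value that _malloc_thread_cleanup leaves behind.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for malloc's tcache_tls; should start as NULL. */
static __thread void *fake_tcache_tls;

/* Same sentinel value as in the assertion quoted above. */
#define TCACHE_DESTROYED ((void *)(uintptr_t)1)

static void *
worker(void *arg)
{
	int *dirty = arg;

	/* A freshly created thread should see a zeroed TLS slot. */
	*dirty = (fake_tcache_tls != NULL);
	/* Mimic the cleanup path leaving the sentinel behind. */
	fake_tcache_tls = TCACHE_DESTROYED;
	return (NULL);
}

/*
 * Run `rounds` batches of `nthreads` threads (capped at 16 per batch)
 * and return how many of them started with a non-NULL TLS slot.
 */
int
count_dirty_tls(int rounds, int nthreads)
{
	enum { MAX_THREADS = 16 };
	pthread_t tids[MAX_THREADS];
	int dirty[MAX_THREADS];
	int total = 0;
	int r, i, rc;

	if (nthreads > MAX_THREADS)
		nthreads = MAX_THREADS;
	for (r = 0; r < rounds; r++) {
		for (i = 0; i < nthreads; i++) {
			dirty[i] = 0;
			rc = pthread_create(&tids[i], NULL, worker, &dirty[i]);
			assert(rc == 0);
			(void)rc;
		}
		/* Join the whole batch so later rounds may reuse thread resources. */
		for (i = 0; i < nthreads; i++) {
			rc = pthread_join(tids[i], NULL);
			assert(rc == 0);
			(void)rc;
			total += dirty[i];
		}
	}
	return (total);
}
```

Calling count_dirty_tls(2, 5) from a small main() and printing the result would roughly mirror the two bonnie++ rounds above. A nonzero result would suggest that a new thread inherited a stale TLS block; whether this actually reproduces the problem here is untested.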
Unfortunately, setting a breakpoint on __libc_allocate_tls seems to do bad
things to the kernel (inducing a SIR without any panic message). I am
somewhat at a loss; help?
Thanks in advance!
--nwf;
