Date:      Thu, 19 Jun 2003 15:36:08 -0700
From:      Marcel Moolenaar <marcel@xcllnt.net>
To:        Julian Elischer <julian@elischer.org>
Cc:        threads@freebsd.org
Subject:   Re: Implementing TLS: step 1
Message-ID:  <20030619223608.GB1273@dhcp01.pn.xcllnt.net>
In-Reply-To: <Pine.BSF.4.21.0306191323250.41210-100000@InterJet.elischer.org>
References:  <20030619202013.GA833@dhcp01.pn.xcllnt.net> <Pine.BSF.4.21.0306191323250.41210-100000@InterJet.elischer.org>

[sorry, I overlooked the other comments]

> > I have gcc33 installed and looked at the access sequences for TLS
> > on both i386 and ia64. Then I looked at libthr to see what was
> > needed and the first and obvious observation is that we need a
> > way to figure out if the binary has a TLS template and use it if
> > it does. If not, we probably need some minimal glue to have the
> > TLS pointer point to something meaningful. Note again, we don't
> > have RTLD involved. We're talking static linking now.
> 
> Call me stupid but can you draw a picture of what you mean?
> (it's worth a thousand words you know :-)

Not easily. Let me try with words again. Let me know whether
it's any clearer. If not, I'll see if there's a graphical
representation on the net.

When code contains thread-local variables (defined with the
__thread modifier), the compiler reserves the space for them in
the .tdata section (for initialized data) or the .tbss section
(for uninitialized data or data initialized to zero). This is
exactly how the compiler reserves space for global data (using
the .data and .bss sections), except of course that the intent
of TLS is that each thread has its own instance.
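
To make the "own instance" point concrete, here is a minimal sketch,
assuming a compiler that accepts __thread and a working pthread
library. The names counter, scratch, worker and tls_demo are made up
for illustration and do not come from libthr.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Initialized thread-local data: space is reserved in .tdata. */
static __thread int counter = 42;
/* Zero-initialized thread-local data: space is reserved in .tbss. */
static __thread int scratch;

/* Each thread modifies its own copies and reports what it ended up with. */
static void *worker(void *arg)
{
    int *seen = arg;

    assert(seen != NULL);
    counter += 1;
    scratch += 1;
    seen[0] = counter;
    seen[1] = scratch;
    return NULL;
}

/* Run two threads back to back; return 0 if each started from fresh copies. */
int tls_demo(void)
{
    int seen[2][2];
    pthread_t t;

    for (int n = 0; n < 2; n++) {
        if (pthread_create(&t, NULL, worker, seen[n]) != 0)
            return -1;
        if (pthread_join(t, NULL) != 0)
            return -1;
    }
    /* The calling thread's own copies were never touched by the workers. */
    return (seen[0][0] == 43 && seen[0][1] == 1 &&
            seen[1][0] == 43 && seen[1][1] == 1 &&
            counter == 42 && scratch == 0) ? 0 : 1;
}
```

Built with something like "cc -pthread", tls_demo() should return 0:
the second thread does not see the first thread's increments, and
neither does the caller.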

The linker combines the .tdata and .tbss sections in the same way
it combines the .data and .bss sections. The end result is an
executable (or library) that contains both global data and TLS.
The global data is normally loaded by the kernel at program load
because there's one instance per process. For each thread, the
thread library has to create the TLS instance by copying the TLS
image present in the executable (or constructed by the rtld).
Hence the term "template".
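
The copy step could be sketched roughly as follows. This is not
libthr code: struct tls_template and tls_instantiate are hypothetical
names, and the sketch assumes the template is described by an
initialized image plus a total size, with the remainder (the .tbss
part) zero-filled. Alignment requirements are ignored.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical description of a TLS template; field names are illustrative. */
struct tls_template {
    const void *init_image; /* initialized part (the .tdata contents) */
    size_t      init_size;  /* number of bytes to copy from init_image */
    size_t      total_size; /* init_size plus the zero-filled .tbss part */
};

/* Create one thread's TLS instance: copy the image, zero the rest. */
void *tls_instantiate(const struct tls_template *t)
{
    char *block;

    assert(t != NULL && t->init_size <= t->total_size);
    /* Allocate at least one byte so an empty template still yields a pointer. */
    block = malloc(t->total_size != 0 ? t->total_size : 1);
    if (block == NULL)
        return NULL;
    if (t->init_size != 0)
        memcpy(block, t->init_image, t->init_size);
    memset(block + t->init_size, 0, t->total_size - t->init_size);
    return block;
}
```

Calling tls_instantiate() once per thread gives every thread a private
block that starts out identical to the template; the caller owns the
block and releases it with free() when the thread exits.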

The compiler generates access sequences according to the runtime
specification, which in general means that all offsets into the
TLS are relative to some TLS base address. On ia64 the thread
pointer points to the TLS and serves as the TLS base address. On
other architectures there may be an indirection. This means that
on ia64 we have to allocate something for the thread pointer to
point to even when there is no TLS. On other architectures this
may not be the case.
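
In C terms the two models amount to roughly the following. This is
only a sketch of the address arithmetic: the function names are made
up, and real access sequences are emitted by the compiler rather than
called as functions.

```c
#include <assert.h>
#include <stddef.h>

/* Direct model (as on ia64): the thread pointer itself is the TLS base. */
void *tls_addr_direct(void *tp, size_t offset)
{
    /* The thread pointer must point at something, even without TLS. */
    assert(tp != NULL);
    return (char *)tp + offset;
}

/* Indirect model: the word the thread pointer refers to holds the TLS base. */
void *tls_addr_indirect(void *tp, size_t offset)
{
    char *base;

    assert(tp != NULL);
    base = *(char **)tp;   /* one extra load to fetch the base */
    return base + offset;
}
```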

A typical access sequence on i386 is:

00000000 <x>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   65 a1 00 00 00 00       mov    %gs:0x0,%eax
   9:   8b 80 00 00 00 00       mov    0x0(%eax),%eax
   f:   c9                      leave
  10:   c3                      ret

The address of the TLS is stored at %gs:0x0, and there's a
relocation associated with the load through %eax:

	RELOCATION RECORDS FOR [.text]:
	OFFSET   TYPE              VALUE
	0000000b R_386_TLS_LE      i
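
The C source behind this listing is not shown. Judging by the symbol
names x and i in the disassembly and the relocation record, it was
presumably something close to the following (an assumption, not the
original file):

```c
/* One thread-local int and a function that reads it. */
__thread int i;

int x(void)
{
    return i;
}
```

Compiling this as non-PIC code with a TLS-capable gcc and inspecting
the object with "objdump -d -r" should give a comparable sequence,
though the exact instructions depend on compiler version and flags.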

On ia64 the access sequence for this same C code is:

0000000000000000 <x>:
   0:   0b 10 00 1a 00 21       [MMI]       mov r2=r13;;
   6:   e0 00 08 00 48 00                   addl r14=0,r2
   c:   00 00 04 00                         nop.i 0x0;;
  10:   1d 40 00 1c 10 10       [MFB]       ld4 r8=[r14]
  16:   00 00 00 02 00 80                   nop.f 0x0
  1c:   08 00 84 00                         br.ret.sptk.many b0;;

The thread pointer is r13 on ia64. Access to TLS is without
indirection. The relocation is attached to the addl instruction:

	RELOCATION RECORDS FOR [.text]:
	OFFSET           TYPE              VALUE
	0000000000000001 TPREL22           i

> > 1. The kernel already iterates over the program headers and can
> >    pass the address and size of the TLS template to the process
> >    (or RTLD) by means of the auxargs (ie have AT_TLS_ADDR and
> >    AT_TLS_SIZE). If no template exists AT_TLS_* will be zero.
> >    This prevents coding object file dependencies on thread and
> >    allows the RTLD to modify the args even in the event that the
> >    program itself does not have TLS, but libraries in the startup
> >    set do.
> 
> I need to go out to the car and get my copy of the TLS proposal....
> this supports exec-time linking but does it support run-time (i.e after
> exec has begun) linking?

Yes. The rtld will dynamically construct the TLS template from
the images in the ELF files in the startup set and pass this in
AT_TLS_* by overriding the values (at least that was the idea).
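
On the consuming side, picking the template out of the auxargs could
look roughly like this. AT_TLS_ADDR and AT_TLS_SIZE are only proposed
here, so the tag values below are invented placeholders, and struct
auxent, struct tls_info and find_tls_template are hypothetical names
used to keep the sketch self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder tag values; the proposed AT_TLS_* tags have no assigned numbers. */
enum {
    DEMO_AT_NULL     = 0,    /* terminates the vector */
    DEMO_AT_TLS_ADDR = 1001, /* address of the TLS template */
    DEMO_AT_TLS_SIZE = 1002  /* size of the TLS template in bytes */
};

/* Simplified stand-in for an auxiliary vector entry. */
struct auxent {
    long      tag;
    uintptr_t val;
};

struct tls_info {
    const void *addr;
    size_t      size;
};

/* Scan up to the terminator; absent or zero tags mean "no template". */
struct tls_info find_tls_template(const struct auxent *aux)
{
    struct tls_info info = { NULL, 0 };

    assert(aux != NULL);
    for (; aux->tag != DEMO_AT_NULL; aux++) {
        if (aux->tag == DEMO_AT_TLS_ADDR)
            info.addr = (const void *)aux->val;
        else if (aux->tag == DEMO_AT_TLS_SIZE)
            info.size = (size_t)aux->val;
    }
    return info;
}
```

Because the thread library reads only these two values, it does not
care whether the kernel filled them in from the program headers or
the rtld overrode them with a template it built itself.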

-- 
 Marcel Moolenaar	  USPA: A-39004		 marcel@xcllnt.net