Date: Thu, 19 Jun 2003 23:30:59 -0700
From: Terry Lambert
To: Marcel Moolenaar
cc: threads@freebsd.org
cc: Julian Elischer
Subject: Re: Implementing TLS: step 1
Message-ID: <3EF2AA23.89CCD315@mindspring.com>
References: <20030619202013.GA833@dhcp01.pn.xcllnt.net> <20030619223608.GB1273@dhcp01.pn.xcllnt.net>
List-Id: Threading on FreeBSD

[ ... hacked up Marcel's text in the post a bit ... ]

Marcel Moolenaar wrote:
> > Call me stupid but can you draw a picture of what you mean?
> > (it's worth a thousand words you know :-)
>
> Not easily. Let me try with words again. Let me know if it's
> more clear or not. If not, I'll see if there's a graphical
> representation on the net.

[ ... ]

> The linker combines the .tdata and .tbss sections in the same way
> it combines the .data and .bss sections. The end result is an
> executable (or library) that contains both global data and TLS.

----------------------------------------
Program image file on disk
----------------------------------------
Ordinary        TLS enabled

,-------.       ,-------.
| code  |       | code  |
`-------'       `-------'
,-------.       ,-------.
| data  |       | data  |
`-------'       `-------'
,-------.       ,-------.
| bss   |       | bss   |
`-------'       `-------'
                ,-------.
                | tbss  |
                `-------'
                ,-------.
                | tdata |
                `-------'
----------------------------------------

> When code contains thread local variables (by way of defining them
> with the __thread modifier), the compiler will reserve the space
> for them in the .tdata section (for initialized data) or the .tbss
> section (for uninitialized data or data initialized to zero). This
> is exactly like how the compiler reserves space for global data
> (using .data and .bss sections), except of course that the intent
> of the TLS is that each thread has its own instance.

------------------------------------------------------------
How the "__thread" C language extension works
------------------------------------------------------------
declaration                     where it ends up

int foo1;               --->    bss
int foo2 = 37;          --->    data
__thread int foo3;      --->    tbss
__thread int foo4 = 37; --->    tdata
------------------------------------------------------------

> The global data is normally loaded by the kernel at program load
> because there's one instance per process. For each thread, the
> thread library has to create the TLS instance by copying the TLS
> image present in the executable (or constructed by the rtld).
> Hence the use of template.

------------------------------------------------------------
How things look when loaded into memory
------------------------------------------------------------
Thread #1               Disk Image              Thread #N

                        ,-------.
  code----reference---->| code  |<----reference---code
                        `-------'
                        ,-------.
  data----reference---->| data  |<----reference---data
                        `-------'
                        ,-------.
   bss----reference---->| bss   |<----reference----bss
                        `-------'
,-------.               ,-------.               ,-------.
| tdata |<-----copy-----| tdata |-----copy----->| tdata |  (templated)
`-------'               `-------'               `-------'
,-------.               ,-------.               ,-------.
| tbss  |<-----copy-----| tbss  |-----copy----->| tbss  |  (templated)
`-------'               `-------'               `-------'
------------------------------------------------------------

> The compiler generates access sequences according to the runtime
> specification which in general means that all offsets to the TLS
> are based on some TLS base address. On ia64 the thread pointer
> points to the TLS and serves as the TLS base address. On other
> architectures there may be an indirection. This means that on ia64
> the lack of TLS still requires us to allocate something for the
> thread pointer to point to. On other architectures this may not be
> the case.

Implementation defined access mechanisms are outside the scope of
this discussion, since they have not yet been selected. But the
above means that the compiler "magically" knows to generate code
that references "__thread" attributed data through the locally
defined access mechanism, whatever that may be.

Note(1): I have no idea how this applies to things like function
pointers with this attribute pointing to functions without it; I
assume it will "do the right thing", and make separate data
elements for the pointers, as directed, *AND* generate code to
make the calls relative to the TLS for the active thread, which
could make the implementation very complicated.

Note(2): For external global references, one would assume that
there are scoping issues, i.e.
that the external declaration with the "__thread" qualifier language
extension *MUST* be in scope at the time. Otherwise, at best, the
symbol decorations will not match; at worst, anyone who references
such a variable with no declaration in scope, or with a declaration
in scope that lacks the "__thread" qualifier, would get the first
thread's instance... or even worse, the template instance.

> > I need to go out to the car and get my copy of the TLS proposal....
> > this supports exec-time linking but does it support run-time (i.e after
> > exec has begun) linking?
>
> Yes. The rtld will dynamically construct the TLS template from the
> images in the ELF files in the startup set and pass this in
> AT_TLS_* by overriding the values (at least that was the idea).

This is where I personally have a problem with lazy initialization
of per thread TLS. Specifically, when a thread exits, you have to
know what you have and have not instanced, on a per dynamic object,
per thread basis, as a minimum granularity, in order to be able to
clean it up, without trying to clean up things you have not yet
instanced in that particular thread.

This strikes me as ruling out the %gs "single instruction"
shortcuts, which means that code generation for a dynamically
linked object module would need to know, _a priori_, what kind of
references it needed to be generating, OR *all* references would
have to be via function and pointer indirection... meaning that the
"single instruction" optimization is an illusion that can never
happen in reality.

-- Terry
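
[ Illustration of the "__thread" table and the per-thread template
copy discussed above. This is only a minimal sketch, assuming GCC or
Clang (for the "__thread" extension) and POSIX threads. foo2 and foo4
are taken from the table; struct result, worker() and tls_demo() are
hypothetical names invented for the example, and nothing here says
how any particular thread library implements the copy. ]

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

int foo2 = 37;            /* .data:  one instance per process            */
__thread int foo4 = 37;   /* .tdata: one instance per thread, each one   */
                          /* initialized from the template image         */

/* Hypothetical record of what the worker thread observed on entry. */
struct result {
    int seen_tls;
    int seen_global;
};

static void *worker(void *arg)
{
    struct result *r = arg;

    r->seen_tls = foo4;     /* this thread's own copy of foo4 */
    r->seen_global = foo2;  /* the single shared foo2 */
    foo4 += 1;              /* modifies only this thread's copy */
    return NULL;
}

/*
 * Hypothetical helper: returns 1 if a new thread starts with the
 * template value of foo4 (37) even though the caller changed its own
 * copy, while the change to the ordinary global foo2 is visible.
 */
int tls_demo(void)
{
    pthread_t t;
    struct result r = { 0, 0 };
    int rc;

    foo4 = 100;   /* changes only the calling thread's instance */
    foo2 = 100;   /* changes the one process-wide instance */

    rc = pthread_create(&t, NULL, worker, &r);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    (void)rc;

    /* The worker's increment must not have touched our foo4. */
    return r.seen_tls == 37 && r.seen_global == 100 && foo4 == 100;
}
```

[ Calling tls_demo() from main() should return 1 on a system with
working TLS support; depending on the platform, the program may need
to be built with -pthread. ]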