Date:      Mon, 16 Jun 2003 15:00:23 -0700
From:      Gareth Hughes <gareth@nvidia.com>
To:        'Julian Elischer' <julian@elischer.org>, Andy Ritger <ARitger@nvidia.com>
Cc:        Daniel Eischen <eischen@pcnet.com>
Subject:   RE: NVIDIA and TLS
Message-ID:  <2D32959E172B8F4D9B02F68266BE421401A6D7D1@mail-sc-3.nvidia.com>

On Mon, 16 Jun 2003, Julian Elischer wrote:
> 
> It wouldn't take much for us to give you an "application-specific
> pointer" in the memory block that is pointed to by %gs.
> 
> This would allow you to find your data with a single dereference
> for the N:N threads library.
> 
> %gs ----->[threadsystem-thread-specific-data]
>           [ stuff                           ]
>           [ ap-specific pointer             ]---->[ your data ]
>                                                   [           ]
> 
> BUT
> that would only work for the 1:1 threading library

What's wrong with:

%gs ----->[threadsystem-thread-specific-data     ]
          [ stuff                                ]
          [libGL stuff (fixed size, known offset)]

Or, better yet, to make sure no problems arise when you change the internals
of your data structures:

%gs ----->[libGL stuff (16 words, say)      ]
          [threadsystem-thread-specific-data]
          [ stuff                           ]

You reserve the first 16 words of your thread data structure for us, and
we're done.  At least when this library is being used.
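
A minimal sketch of that second layout in C, for illustration only. The
struct name, its private fields, and the helper functions are hypothetical
and are not FreeBSD's actual thread structures; the base pointer passed in
stands in for whatever %gs points at.

```c
#include <stddef.h>
#include <stdint.h>

/* "16 words, say": the size of the reserved area is an assumption. */
#define LIBGL_RESERVED_WORDS 16

/* Hypothetical per-thread block; NOT the real FreeBSD layout. */
struct thread_block {
    /* Reserved area at offset 0, so its location never moves. */
    uintptr_t libgl_reserved[LIBGL_RESERVED_WORDS];
    /* Thread library internals follow and are free to change. */
    void *thread_private;
    int   thread_state;
};

/* Byte offset of a reserved slot from the start of the block. */
static size_t libgl_slot_offset(unsigned slot)
{
    return offsetof(struct thread_block, libgl_reserved)
         + slot * sizeof(uintptr_t);
}

/* Address of a reserved slot, given the block's base address. */
static uintptr_t *libgl_slot(struct thread_block *base, unsigned slot)
{
    return &base->libgl_reserved[slot];
}
```

Because the reserved words come first, their offsets depend only on the
word size, not on anything the thread library later adds or reorders.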

> The M:N library (known as KSE) already uses an extra level of
> indirection. It has a "virtual CPU" (KSE) that is pointed to by
> %gs, which in turn knows what thread it is currently
> working for.
> 
> 
> %gs ----->[threadsystem-virtual-cpu-specific-data]
>           [ stuff                                ]        
>           [ current-thread-pointer               ]------\
>                                                         |
>    /----------------------------------------------------/
>    |
>    \----->[threadsystem-thread-specific-data]
>           [ stuff                           ]
>           [ ap-specific pointer             ]---->[ your data ]
>                                                   [           ]
> 
> You couldn't do both with the same code.
> 
> We'd have to (in both libraries) supply you with an entry point for a
> function that returns the pointer to your data.
> It could be optimised to still be pretty fast, but it's certainly
> non-standard.

We really want to avoid function calls.  What you're describing here is
essentially the ELF TLS mechanism (one flavour, at least).  If you haven't
done so already, you should check out Ulrich Drepper's document.  Keeping
this kind of thing in mind when you're developing the thread libraries is a
good idea -- the ELF TLS mechanisms were designed to be both portable and
fast.  Both Linux and Solaris support __thread variables, with Linux
supporting them on perhaps half a dozen architectures (last time I checked).
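
For reference, a minimal sketch of a __thread variable in C, assuming a
compiler that accepts the GCC-style __thread keyword (GCC and Clang do).
The variable and function names are hypothetical, chosen for illustration
only. The accessors are there just to keep the sketch self-contained;
in real use the variable would be read and written directly.

```c
#include <stddef.h>

/* Each thread gets its own copy of this variable (hypothetical name). */
static __thread void *current_gl_context = NULL;

/* Store a pointer in the calling thread's copy. */
static void set_current_context(void *ctx)
{
    current_gl_context = ctx;
}

/* Read the calling thread's copy. */
static void *get_current_context(void)
{
    return current_gl_context;
}
```

The access itself is an ordinary variable reference in the source; how it
is lowered (and whether any call is involved) depends on the platform and
on the TLS access model the compiler and linker select.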

>                       This is the problem that you see when you decide
> to do these things in a non-standard manner, however.
> (By which I mean that you decided to "roll your own", but in doing so you
> have made yourself non-portable.)

I'm not sure what you mean by this.  Do you mean if you provide us with a
magic function to get our thread-local data, we'd be non-portable?  Or that
because we use the ELF TLS stuff we're non-portable already?

-- 
Gareth Hughes (gareth@nvidia.com)
OpenGL Developer, NVIDIA Corporation


