From owner-freebsd-threads@FreeBSD.ORG  Mon Jun 16 15:01:00 2003
Return-Path: <owner-freebsd-threads@FreeBSD.ORG>
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 0208B37B401
	for <threads@freebsd.org>; Mon, 16 Jun 2003 15:01:00 -0700 (PDT)
Received: from hqemgate00.nvidia.com (hqemgate00.nvidia.com [216.228.112.144])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 8A15B43F3F
	for <threads@freebsd.org>; Mon, 16 Jun 2003 15:00:58 -0700 (PDT)
	(envelope-from gareth@nvidia.com)
Received: from mail-sc-0.nvidia.com (Not Verified[172.16.217.105])
	id <BA00300212>; Mon, 16 Jun 2003 15:03:45 -0700
Received: by mail-sc-0.nvidia.com with Internet Mail Service (5.5.2653.19)
	id <MJR3VYXM>; Mon, 16 Jun 2003 15:00:24 -0700
Message-ID: <2D32959E172B8F4D9B02F68266BE421401A6D7D1@mail-sc-3.nvidia.com>
From: Gareth Hughes <gareth@nvidia.com>
To: 'Julian Elischer' <julian@elischer.org>,
	Andy Ritger <ARitger@nvidia.com>
Date: Mon, 16 Jun 2003 15:00:23 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain
cc: threads@freebsd.org
cc: zander@mail.minion.de
cc: Daniel Eischen <eischen@pcnet.com>
Subject: RE: NVIDIA and TLS
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD <freebsd-threads.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-threads>,
	<mailto:freebsd-threads-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-threads>
List-Post: <mailto:freebsd-threads@freebsd.org>
List-Help: <mailto:freebsd-threads-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-threads>,
	<mailto:freebsd-threads-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 16 Jun 2003 22:01:00 -0000

On Mon, 16 Jun 2003, Julian Elischer wrote:
> 
> It wouldn't take much for us to give you an "application-specific
> pointer" in the memory block that is pointed to by %gs.
> 
> This would allow you to find your data with a single dereference
> for the N:N threads library.
> 
> %gs ----->[threadsystem-thread-specific-data]
>           [ stuff                           ]
>           [ app-specific pointer            ]---->[ your data ]
>                                                   [           ]
> 
> BUT
> that would only work for the 1:1 threading library

What's wrong with:

%gs ----->[threadsystem-thread-specific-data     ]
          [ stuff                                ]
          [libGL stuff (fixed size, known offset)]

Or, better yet, to make sure no problems arise when you change the internals
of your data structures:

%gs ----->[libGL stuff (16 words, say)      ]
          [threadsystem-thread-specific-data]
          [ stuff                           ]

You reserve the first 16 words of your thread data structure for us, and
we're done.  At least when this library is being used.
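
As a rough sketch of the second layout (not anyone's actual implementation):
the struct, field, and function names below are all made up for illustration,
and an ordinary pointer stands in for the %gs base so that the example builds
anywhere.  The only point is that the reserved block sits at offset 0, so the
thread library can rearrange everything after it without affecting us.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LIBGL_RESERVED_WORDS 16          /* "16 words, say" */

/* Hypothetical per-thread block; all names here are illustrative only. */
struct thread_block {
    uintptr_t libgl_reserved[LIBGL_RESERVED_WORDS]; /* fixed, always first */
    /* Thread library internals follow and may change freely. */
    int   thread_id;
    void *private_stuff;
};

/* The reserved area must stay at offset 0 for the scheme to hold. */
static_assert(offsetof(struct thread_block, libgl_reserved) == 0,
              "libGL area must be the first member");

/* Assumption: stands in for the address that %gs would point at. */
static struct thread_block *fake_gs_base;

/* libGL only needs the base address; it never sees the struct layout. */
static uintptr_t *libgl_slots(void)
{
    return (uintptr_t *)fake_gs_base;
}
```

With a real segment register, libgl_slots()[n] would collapse to a single
%gs-relative load at a constant offset, with no function call involved.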

> The M:N library (known as KSE) already uses an extra level of
> indirection. It has a "virtual CPU" (KSE) that is pointed to by
> %gs, and that in turn knows which thread it is currently
> working for.
> 
> 
> %gs ----->[threadsystem-virtual-cpu-specific-data]
>           [ stuff                                ]        
>           [ current-thread-pointer               ]------\
>                                                         |
>    /----------------------------------------------------/
>    |
>    \----->[threadsystem-thread-specific-data]
>           [ stuff                           ]
>           [ app-specific pointer            ]---->[ your data ]
>                                                   [           ]
> 
> you couldn't do both with the same code..
> 
> We'd have to (in both libraries) supply you with an entrypoint for a
> function to return the pointer to your address.
> It could be optimised to still be pretty fast, but it's certainly
> non-standard..
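
To make the difference between the two quoted layouts concrete, here is a
minimal sketch.  Every name in it is hypothetical, and plain pointers stand
in for whatever %gs would point at under each library.  It shows why one
inline access sequence cannot serve both: the 1:1 path is one hop to the
application pointer, the M:N path is two.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layouts, loosely following the two diagrams above. */
struct thread_data {
    void *app_pointer;                  /* "app-specific pointer" */
};

struct kse_data {
    struct thread_data *current_thread; /* "current-thread-pointer" */
};

/* Assumption: these stand in for the %gs base under each library. */
static struct thread_data *gs_1to1;     /* 1:1: %gs -> thread data  */
static struct kse_data    *gs_mton;     /* M:N: %gs -> virtual CPU  */

/* 1:1 library: a single dereference reaches the application pointer. */
static void *app_data_1to1(void)
{
    assert(gs_1to1 != NULL);
    return gs_1to1->app_pointer;
}

/* M:N library: one extra hop through the current-thread pointer. */
static void *app_data_mton(void)
{
    assert(gs_mton != NULL && gs_mton->current_thread != NULL);
    return gs_mton->current_thread->app_pointer;
}
```

An accessor function exported by each library would hide this difference
from the caller, at the cost of a call on every lookup.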

We really want to avoid function calls.  What you're describing here is
essentially the ELF TLS mechanism (one flavour, at least).  If you haven't
done so already, you should check out Ulrich Drepper's document, "ELF
Handling For Thread-Local Storage".  Keeping this kind of thing in mind when
you're developing the thread libraries is a good idea: the ELF TLS
mechanisms were designed to be both portable and fast.  Both Linux and
Solaris support __thread variables, with Linux supporting them on perhaps
half a dozen architectures (last time I checked).
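
For anyone who hasn't used them, a minimal sketch of a __thread variable
follows.  It assumes a compiler and thread library that support the GCC
__thread extension and POSIX threads; the variable and function names are
invented for the example.  Each thread gets its own copy of the variable,
and reading it is an ordinary variable access rather than a call into the
thread library.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical per-thread value; each thread sees its own copy. */
static __thread int current_context_id;

/* Runs in the new thread: writes its own copy and reports what it sees. */
static void *worker(void *arg)
{
    current_context_id = (int)(intptr_t)arg;
    return (void *)(intptr_t)current_context_id;
}

/* Starts a thread that stores `value` in its copy, then returns the value
 * that thread observed.  The caller's copy is left untouched. */
static int run_in_thread(int value)
{
    pthread_t t;
    void *seen = NULL;
    int rc;

    rc = pthread_create(&t, NULL, worker, (void *)(intptr_t)value);
    assert(rc == 0);
    rc = pthread_join(t, &seen);
    assert(rc == 0);
    return (int)(intptr_t)seen;
}
```

Calling run_in_thread(42) from a thread whose own current_context_id is 1
returns 42 and leaves the caller's value at 1.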

>                       This is the problem that you see when you decide
> to do these things in a non-standard manner, however.
> (By which I mean that you decided to "roll your own", but in doing so
> you have made yourself non-portable.)

I'm not sure what you mean by this.  Do you mean if you provide us with a
magic function to get our thread-local data, we'd be non-portable?  Or that
because we use the ELF TLS stuff we're non-portable already?

-- 
Gareth Hughes (gareth@nvidia.com)
OpenGL Developer, NVIDIA Corporation