From owner-freebsd-threads@FreeBSD.ORG  Mon Jun 16 17:30:14 2003
Date: Mon, 16 Jun 2003 17:02:28 -0700 (PDT)
From: Julian Elischer <julian@elischer.org>
To: Gareth Hughes <gareth@nvidia.com>
In-Reply-To: <2D32959E172B8F4D9B02F68266BE421401A6D7E2@mail-sc-3.nvidia.com>
Message-ID: <Pine.BSF.4.21.0306161655400.19977-100000@InterJet.elischer.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: threads@freebsd.org
cc: zander@mail.minion.de
cc: 'Daniel Eischen' <eischen@pcnet.com>
cc: Andy Ritger <ARitger@nvidia.com>
Subject: RE: NVIDIA and TLS


On Mon, 16 Jun 2003, Gareth Hughes wrote:

> On Mon, 16 Jun 2003, Julian Elischer wrote:
> > 
> > I think that the problem is that the access method for TLS is dependent
> > on which library is used.
> > 
> > [snip]
> > 
> > The trouble is that each of these would require a different mechanism to
> > reach TLS and the compiler cannot know ahead of time which one to use.
> >
> > [snip]
> > 
> > I may be wrong but I don't think it is a standard yet..
> > especially for the reason that we see here..
> > It requires that the compiler know what threading library is in use.
> > 
> > We could certainly implement efficient TLS code generation for each
> > library, but which one would be compiled in when you compile a .o file
> > that may be used with any library?
> 
> Please read Ulrich Drepper's document, if you haven't done so already.
> You'll see that the general case of __thread variable access involves
> a function call to look up the variable's address.  There are
> optimizations to this access model that allow one of the other three
> models to be used (ranging from a function call the first time a
> __thread variable is accessed, down to a single instruction per
> access).  FreeBSD could trivially implement __thread variables with
> the General Dynamic model (involving a function call per access).
> Our driver uses the Local Exec model (single instruction per access)
> because GNU libc has an optimization on x86 that allows shared
> libraries to use this model, which is normally reserved for
> applications.  The key thing is that they're still __thread variables;
> the access model depends on the compile time options used and what's
> available at runtime.  Please, I urge you to read Drepper's document
> carefully.
> 

I have read most of it already.

What I'm saying is that we can, and probably should, implement TLS using
the General Dynamic model to provide TLS at a "sane" speed (e.g. 5
instructions), but I don't think we can implement the "1 instruction"
version that you are asking for without breaking the binary compatibility
that we currently have between our 3 pthread libraries. We can currently
switch between 3 very different threads libraries without recompiling
the app or any other libraries involved. In fact, we have a config file
for the loader that specifies which one to use at run time **per
application**. Without using an entry point (or maybe self-modifying
code) (*EEK!*) I don't see how we can do it and keep that *very useful*
functionality.
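
To make the entry-point idea concrete, here is a rough sketch in C. It is
only an illustration under stated assumptions, not how any of our
libraries actually work: lib_a_tls, lib_b_tls, tls_entry and run_demo are
hypothetical names, "library A" and "library B" are stand-ins for two
threads libraries that keep per-thread data in different ways, and
__thread is the GCC/Clang extension being discussed. The "application"
code (worker) only ever goes through the entry point, so it does not need
recompiling when a different library is bound at run time.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical "library A": per-thread data reached through a pthread key. */
static pthread_key_t key_a;
static pthread_once_t key_a_once = PTHREAD_ONCE_INIT;

static void key_a_init(void)
{
    /* free() is the destructor run on each thread's slot at thread exit. */
    pthread_key_create(&key_a, free);
}

static int *lib_a_tls(void)
{
    pthread_once(&key_a_once, key_a_init);
    int *p = pthread_getspecific(key_a);
    if (p == NULL) {                  /* first access in this thread */
        p = calloc(1, sizeof *p);
        if (p == NULL)
            abort();
        pthread_setspecific(key_a, p);
    }
    return p;
}

/* Hypothetical "library B": per-thread data via compiler-supported TLS. */
static __thread int slot_b;

static int *lib_b_tls(void)
{
    return &slot_b;
}

/* The entry point. It is bound at run time, standing in for the loader
 * choosing a threads library per application. */
typedef int *(*tls_entry_fn)(void);
static tls_entry_fn tls_entry;

/* "Application" code: knows only the entry point, not the library. */
static void *worker(void *arg)
{
    int n = *(int *)arg;
    for (int i = 0; i < n; i++)
        ++*tls_entry();               /* one function call per access */
    return (void *)(intptr_t)*tls_entry();
}

/* Bind an implementation, run two threads, and return 1 if each thread
 * saw only its own counter (3 and 5 increments respectively). */
static int run_demo(tls_entry_fn entry)
{
    pthread_t t1, t2;
    int n1 = 3, n2 = 5;
    void *r1, *r2;
    int rc;

    tls_entry = entry;
    rc = pthread_create(&t1, NULL, worker, &n1);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, worker, &n2);
    assert(rc == 0);
    pthread_join(t1, &r1);
    pthread_join(t2, &r2);
    return (intptr_t)r1 == n1 && (intptr_t)r2 == n2;
}
```

Calling run_demo(lib_a_tls) and then run_demo(lib_b_tls) exercises the
same worker code against both stand-in libraries (link with -lpthread).
The cost is the indirect call on every access, which is exactly what the
"1 instruction" model avoids by baking one library's layout into the
compiled code.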