Message-ID: <3EF02B40.A4BD1EF@mindspring.com>
Date: Wed, 18 Jun 2003 02:05:04 -0700
From: Terry Lambert
To: Marcel Moolenaar
cc: David Xu
cc: Julian Elischer
cc: threads@freebsd.org
References: <20030617223910.GB57040@ns1.xcllnt.net> <20030618003556.GA2440@dhcp01.pn.xcllnt.net>
Subject: Re: Nvidia, TLS and __thread keyword -- an observation
List-Id: Threading on FreeBSD

Marcel Moolenaar wrote:
> On Wed, Jun 18, 2003 at 07:48:09AM +0800, David Xu wrote:
> > I believe this will add overhead to thread creating and destroying,
> > How fast an RTLD can be in this case ?
>
> In the dynamic TLS model you would like to delay the creation of
> the TLS space. Normally __tls_get_addr() gets used for this.
> In the static TLS model you allocate the TLS when you allocate the
> thread control structure.

Lazy binding in this context doesn't make a lot of sense. In the
case of a dynamically linked binary, every thread will need the
context sooner or later, unless you are running a heterogeneous
workload; even then, the locking required at access time would be
prohibitive compared to binding it once at object load time.

For modules (e.g. Apache CGIs that are dlopen'ed and loaded in), the
same is true: you will run them eventually, when your thread ends up
with a request, unless you do special dancing around creating
"uncontaminated" vs. "contaminated" thread pools, and reluctantly
migrate from the former to the latter -- i.e., make a special effort
to keep threads uncontaminated by never having had a particular
dynamic object's TLS attached to them.

Garbage collection would be ugly, the locking overhead of trying to
keep the TLS sections dynamically bound would be just as great, and
you would have much more overhead on thread teardown. Forget
thread_join()! The code the poor application would need to keep
track of this would far exceed the overhead of the application just
passing around a context to the reentrant functions.

> Thus, there's virtually no cost. However TLS accesses for the
> dynamic TLS model are expensive. I have some ideas about that.
> With some kernel support you can even create dynamic TLS with
> static TLS code sequences...

I think you can only do this if you do *not* lazy-bind, and the
references are at fixed offsets. This means orphaning sections of
TLS data any time you unload a dynamic module that formerly
referenced them.

Much better to make thread attach/detach as explicit as process
attach/detach already is: alongside the .init/.fini sections that
exist for the process, create corresponding ones for the threads.

-- Terry
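
The explicit thread attach/detach idea can be approximated in user
space. The following is only a minimal sketch under stated
assumptions, not how any RTLD or threads library actually does it:
the names register_thread_hooks(), create_thread(), mod_attach() and
mod_detach() are hypothetical, the hook table stands in for
per-thread .init/.fini sections, and __thread is the GCC/Clang
keyword (build with -pthread). Hooks are registered once, before any
thread exists, so nothing is bound lazily and no lock is taken on the
TLS access path.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-module hooks: a per-thread analogue of .init/.fini. */
typedef void (*thread_hook_fn)(void);
struct thread_hooks { thread_hook_fn attach, detach; };

#define MAX_MODULES 8
static struct thread_hooks modules[MAX_MODULES];
static int module_count;

/* Assumed to be called at object load time, before threads are created. */
static int register_thread_hooks(thread_hook_fn attach, thread_hook_fn detach)
{
    if (module_count == MAX_MODULES)
        return -1;
    modules[module_count].attach = attach;
    modules[module_count].detach = detach;
    module_count++;
    return 0;
}

struct start_args { void *(*fn)(void *); void *arg; };

/* Runs every attach hook, the real start routine, then detach in reverse. */
static void *trampoline(void *p)
{
    struct start_args a = *(struct start_args *)p;
    free(p);
    for (int i = 0; i < module_count; i++)
        if (modules[i].attach) modules[i].attach();
    void *ret = a.fn(a.arg);
    for (int i = module_count - 1; i >= 0; i--)
        if (modules[i].detach) modules[i].detach();
    return ret;
}

static int create_thread(pthread_t *t, void *(*fn)(void *), void *arg)
{
    struct start_args *a = malloc(sizeof *a);
    if (a == NULL)
        return -1;
    a->fn = fn;
    a->arg = arg;
    int rc = pthread_create(t, NULL, trampoline, a);
    if (rc != 0)
        free(a);
    return rc;
}

/* Example module with per-thread state set up eagerly at thread attach. */
static __thread int *scratch;
static pthread_mutex_t stats_lock = PTHREAD_MUTEX_INITIALIZER;
static int attached, detached;

static void mod_attach(void)
{
    scratch = calloc(1, sizeof *scratch);
    pthread_mutex_lock(&stats_lock);
    attached++;
    pthread_mutex_unlock(&stats_lock);
}

static void mod_detach(void)
{
    free(scratch);
    scratch = NULL;
    pthread_mutex_lock(&stats_lock);
    detached++;
    pthread_mutex_unlock(&stats_lock);
}

static void *worker(void *arg)
{
    assert(scratch != NULL);          /* already attached, no lazy lookup */
    *scratch = (int)(intptr_t)arg;
    return (void *)(intptr_t)(*scratch * 2);
}

/* Returns 0 if every thread saw its own TLS and all hooks ran once each. */
static int run_demo(void)
{
    enum { N = 4 };
    pthread_t t[N];
    module_count = attached = detached = 0;
    if (register_thread_hooks(mod_attach, mod_detach) != 0)
        return -1;
    for (int i = 0; i < N; i++)
        if (create_thread(&t[i], worker, (void *)(intptr_t)i) != 0)
            return -1;
    for (int i = 0; i < N; i++) {
        void *ret;
        if (pthread_join(t[i], &ret) != 0 || (intptr_t)ret != 2 * i)
            return -1;
    }
    return (attached == N && detached == N) ? 0 : -1;
}
```

Teardown in this sketch is one detach call per registered module per
thread, with no garbage collection and no bookkeeping of which
threads a module has "contaminated".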