Message-ID: <3EEEB671.2172F5D8@mindspring.com>
Date: Mon, 16 Jun 2003 23:34:25 -0700
From: Terry Lambert
To: Alexander Kabaev
References: <2D32959E172B8F4D9B02F68266BE421401A6D7E6@mail-sc-3.nvidia.com> <20030616211835.10d0fba4.kabaev@mail.ru>
cc: zander@mail.minion.de
cc: Gareth Hughes
cc: 'Julian Elischer'
cc: threads@freebsd.org
cc: 'Daniel Eischen'
cc: Andy Ritger
Subject: Re: NVIDIA and TLS
List-Id: Threading on FreeBSD

Alexander Kabaev wrote:
> Gareth Hughes wrote:
> > If FreeBSD support ELF TLS and __thread variables in ANY form, our
> > driver will use this support.  If the best you can do is a function
> > call per access, so be it.
> > It doesn't sound like there are any other
> > options, given the fact that you ship with three different thread
> > libraries.
>
> Three different libraries can attempt to coordinate and reserve exactly
> the same TLS segment size.

Gareth's point is that a function call is needed to abstract that,
unless all approaches use the same mechanism.

Right now, he's said that he can't use %fs, because he has a version
of WINE that uses OpenGL to implement its video, and the threads
people have said that he can't use %gs because it points to a per-KSE
value, not a per-thread value, and there are potentially many threads
associated with each KSE, so it's no good for TLS, only for KSELS.

I've suggested that he use the compiler flags to reserve another
*different* register (not %fs and not %gs) for his purpose; the
compiler has explicit support for this.  If he is willing to burn a
register in order to get this functionality, it should be one that's
normally allocated for user programs to burn anyway.

Julian has pointed out that the TLS data should probably be cached
following a return from a context switch as a result of the user
space threads scheduler, if there is one, scheduling the thread to
run.  So far, it seems that Gareth has mistaken what Julian meant by
this: that the reload happen high up in the OpenGL code, using an
explicit call to do the register switch, so that it's available lower
down.  This would work with the %fs approach, without breaking WINE.

In addition, it seems that Gareth's Linux approach is assuming that a
thread which is involuntarily context switched will be run again when
the next quantum becomes available.  This is an invalid assumption
for libkse (though it's currently hacked, locally, by most people to
keep non-strict POSIX applications happy), and could be an invalid
assumption in Linux and libthr in FreeBSD, should the scheduler code
change in the future to recalculate priority prior to giving the
quantum to a particular thread.
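To make the "reload high up, use it lower down" idea concrete, here
is a minimal sketch in portable C.  It is an illustration only, not
NVIDIA's driver code and not any FreeBSD thread library's internals:
struct gl_ctx, ctx_lookup(), emit_vertex() and draw_batch() are all
hypothetical names, and the per-thread pointer is handed down as an
ordinary function argument rather than through a reserved register.
The function-call-per-access path is pthread_getspecific(); the
entry point pays for it once and the inner loop reuses the result.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread driver state; the layout is an assumption. */
struct gl_ctx {
    int draw_calls;
};

static pthread_key_t ctx_key;
static pthread_once_t ctx_once = PTHREAD_ONCE_INIT;

/* Runs exactly once per process, however many threads race to it. */
static void ctx_key_init(void)
{
    int rc = pthread_key_create(&ctx_key, free);

    assert(rc == 0);
    (void)rc;
}

/* The "function call per access" path: works with any thread library. */
static struct gl_ctx *ctx_lookup(void)
{
    struct gl_ctx *ctx;

    pthread_once(&ctx_once, ctx_key_init);
    ctx = pthread_getspecific(ctx_key);
    if (ctx == NULL) {
        /* First use on this thread: allocate zeroed state. */
        ctx = calloc(1, sizeof(*ctx));
        if (ctx == NULL || pthread_setspecific(ctx_key, ctx) != 0)
            abort();
    }
    return ctx;
}

/* Low-level helper: uses the pointer it was handed, no lookup here. */
static void emit_vertex(struct gl_ctx *ctx)
{
    ctx->draw_calls++;
}

/* High-level entry point: one lookup, then reuse it further down. */
int draw_batch(int nvertices)
{
    struct gl_ctx *ctx = ctx_lookup();
    int i;

    for (i = 0; i < nvertices; i++)
        emit_vertex(ctx);
    return ctx->draw_calls;
}
```

Calling draw_batch(3) and then draw_batch(2) from the same thread
returns a running total for that thread only; a second thread starts
from its own zeroed state.  The cached pointer stays valid here
because pthread-specific data belongs to the thread no matter which
KSE ends up running it.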
Add to this that the model that Gareth is suggesting as a de facto
standard is one advocated by a company that's suing IBM over the
Linux OS containing uniform code in 60 lines of its scheduler, which
might very well be related to this publication, for all we know.
Together with the assumptions about implementation implicit in the
standard, that leaves a significant issue that needs to be resolved
here, and I'm not just talking about the legal implications of
implementing to a Caldera/SCO specification that they may feel
contains their intellectual property.

At a minimum, we are likely talking about a "first call" optimization
in order to set a global offset as valid relative to a register
common to all three threads libraries in FreeBSD, and potentially two
of them, if we *must* go indirect through %gs to get the TLS from the
KSELS in %gs.

Does that jibe with what everyone else understands of this mailing
list thread so far?

-- Terry