Message-ID: <3EEEA6A9.5F8D60C3@mindspring.com>
Date: Mon, 16 Jun 2003 22:27:05 -0700
From: Terry Lambert
To: Gareth Hughes
cc: threads@freebsd.org
cc: zander@mail.minion.de
cc: Daniel Eischen
cc: Andy Ritger
Subject: Re: NVIDIA and TLS

Gareth Hughes wrote:
> On Mon, 16 Jun 2003, Andy Ritger wrote:
> > So from an OpenGL point of view, here are several alternatives that
> > I see for at least the near term:
> >
> >  - make NVIDIA's OpenGL implementation not thread-safe (just
> >    use global data rather than thread-local data)
> >
> >  - accept the performance hit of using pthread_getspecific()
> >    on FreeBSD.
> >    From talking to other OpenGL engineers,
> >    conservative estimates of the performance impact on
> >    applications like viewperf range from 10% - 15%.  I'd like
> >    to quantify that, but certainly there will be a performance
> >    penalty.
>
> And these are *very* conservative estimates -- you're essentially adding a
> function call into a path that is, on average, less than ten instructions
> per OpenGL API call, where the number of API calls per frame is upward of 3
> million (3 calls per vertex, over a million vertices for some Viewperf
> benchmarks).  The API was designed this way for a reason, and fast
> thread-local storage is a fundamental part of a high performance
> implementation.

What do you do on systems where you can't grab the %gs register
and use it for whatever you want, because it's in use for something
else?

The pthread_getspecific() could probably be made into an inline,
at the very least, and could potentially be made to lazy-bind %gs
to the evil use to which it's currently being put.

-- Terry
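
For reference, the two access paths being compared in this thread can be sketched roughly as follows. This is only an illustration, not NVIDIA's or FreeBSD's code: ctx_t, set_current_ctx(), current_ctx_slow() and current_ctx_fast() are hypothetical names, and the fast path assumes a toolchain that supports the GCC/Clang `__thread` extension, which is exactly what may not be available on the platform under discussion.

```c
/* Sketch only: every name below is hypothetical, not a real driver API. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct ctx { int id; } ctx_t;   /* stand-in for a per-thread GL context */

static pthread_key_t  ctx_key;
static pthread_once_t ctx_once = PTHREAD_ONCE_INIT;

/* Create the key exactly once; a real library would more likely do this
 * at load time so the lookup path does not pay for pthread_once(). */
static void ctx_key_init(void)
{
    int rc = pthread_key_create(&ctx_key, NULL);
    assert(rc == 0);
    (void)rc;
}

/* Portable path: at least one library call on every lookup. */
static ctx_t *current_ctx_slow(void)
{
    pthread_once(&ctx_once, ctx_key_init);
    return pthread_getspecific(ctx_key);
}

/* Compiler-supported TLS (assumed available): on x86 this typically
 * compiles to a single segment-relative load, with no function call. */
static __thread ctx_t *tls_ctx;

static inline ctx_t *current_ctx_fast(void)
{
    return tls_ctx;
}

/* Keep both slots in sync so either lookup returns the same context. */
static void set_current_ctx(ctx_t *c)
{
    pthread_once(&ctx_once, ctx_key_init);
    int rc = pthread_setspecific(ctx_key, c);
    assert(rc == 0);
    (void)rc;
    tls_ctx = c;
}
```

An API entry point would call one of the two lookup functions before dispatching. The cost difference between them is the function call that Gareth describes as significant in a path of fewer than ten instructions.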