From: Gareth Hughes
To: Andy Ritger, Daniel Eischen
cc: threads@freebsd.org
cc: zander@mail.minion.de
Date: Mon, 16 Jun 2003 14:41:01 -0700
Subject: RE: NVIDIA and TLS
Message-ID: <2D32959E172B8F4D9B02F68266BE421401A6D7CD@mail-sc-3.nvidia.com>
List-Id: Threading on FreeBSD

On Mon, 16 Jun 2003, Andy Ritger wrote:
>
> So from an OpenGL point of view, here are several alternatives that
> I see for at least the near term:
>
> - make NVIDIA's OpenGL implementation not thread-safe (just
>   use global data rather than thread-local data)
>
> - accept the performance hit of using pthread_getspecific()
>   on FreeBSD. From talking to other OpenGL engineers,
>   conservative estimates of the performance impact on
>   applications like viewperf range from 10% - 15%. I'd like
>   to quantify that, but certainly there will be a performance
>   penalty.

And these are *very* conservative estimates -- you're essentially
adding a function call into a path that is, on average, less than
ten instructions per OpenGL API call, where the number of API calls
per frame is upward of 3 million (3 calls per vertex, over a million
vertices for some Viewperf benchmarks). The API was designed this
way for a reason, and fast thread-local storage is a fundamental
part of a high performance implementation.

-- Gareth Hughes (gareth@nvidia.com)
OpenGL Developer, NVIDIA Corporation
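
To make the two lookup styles discussed above concrete, here is a minimal sketch in C. It is not NVIDIA's code: struct gl_context, bind_context(), the two vertex_via_*() functions and calls_per_frame() are hypothetical names made up for illustration, and the fast path assumes a toolchain and platform that support the C11 _Thread_local keyword (or the older GCC __thread extension). The first entry point fetches the current context through pthread_getspecific(), which is a call into the threads library on every API call; the second reads a thread-local variable directly. The last helper just restates the arithmetic from the message (3 calls per vertex, a million vertices).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-thread state; stands in for a driver's current context. */
struct gl_context {
    int vertex_count;
};

static pthread_key_t ctx_key;                     /* POSIX thread-specific data key */
static pthread_once_t ctx_key_once = PTHREAD_ONCE_INIT;
static _Thread_local struct gl_context *ctx_tls;  /* compiler-supported TLS (assumed available) */

/* Create the key exactly once for the whole process. */
static void make_ctx_key(void)
{
    int rc = pthread_key_create(&ctx_key, NULL);
    assert(rc == 0);
    (void)rc;
}

/* Bind a context to the calling thread in both styles. Returns 0 on success. */
static int bind_context(struct gl_context *ctx)
{
    pthread_once(&ctx_key_once, make_ctx_key);
    ctx_tls = ctx;
    return pthread_setspecific(ctx_key, ctx);
}

/* Slow path: one extra library call per API entry point. */
static void vertex_via_getspecific(void)
{
    struct gl_context *ctx = pthread_getspecific(ctx_key);
    ctx->vertex_count++;
}

/* Fast path: the context pointer is read straight from thread-local storage. */
static void vertex_via_tls(void)
{
    ctx_tls->vertex_count++;
}

/* API calls per frame for a given vertex count and calls per vertex. */
static long calls_per_frame(long vertices, long calls_per_vertex)
{
    return vertices * calls_per_vertex;
}
```

After bind_context(&ctx) on a thread, both entry points update the same context, so the difference is only in how the pointer is found. calls_per_frame(1000000L, 3L) gives the 3 million figure quoted above, which is how many times per frame the lookup cost is paid.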