Message-ID: <3E96DFB7.9E5AC29A@mindspring.com>
Date: Fri, 11 Apr 2003 08:31:03 -0700
From: Terry Lambert
To: David Xu
cc: freebsd-threads@freebsd.org
References: <20030411053722.782152A7EA@canning.wemm.org> <012401c2ffef$e50657e0$f001a8c0@davidw2k>
Subject: Re: patch for %gs saving

David Xu wrote:
> Yes, I know loading a descriptor is slow, but in real world, such optimization
> will be lost in real work noise.
> And setcontext syscall is broken by this
> optimization, userland will fail to set his context in atomic operation,
> set pcb->gs = userland_gs does not work, so the cost is obviously, I can optimize
> my patch to only save and restore %gs at trap and interrupt time, when entering
> kernel, I can always zero %gs because kernel does not use it, I think this can
> reduce clock cycles at context switch, this might be better than current code
> which loading %gs at every context switch.

If it's a threaded program, you are probably going to have to
reload %gs on a context switch. If it's not, you can lazy bind
it: reload it only when you go from one threaded process to a
different threaded process, and leave it alone otherwise. You
still check its value at the point where you would reload it for
a threaded process, in case some unthreaded process has decided
to use it. Either way, it boils down to a
compare-if-different-reload.

In the UTS, you can also lazy bind the reload: if a thread goes
to sleep and the UTS ends up rescheduling that same thread, %gs
will still have the value it had before. With the kernel support,
this is safe to do.

The one thing this could result in is a double reload. If the
process is in the UTS and another process has changed %gs, then
when the process gets its quantum the kernel reloads %gs and
returns to user space, and the UTS reloads %gs again to schedule
another user space thread.

The only way I can see to avoid this latency is for the kernel to
know that the process is in the UTS, at a point before the UTS
would reload %gs. If it knows this, it can avoid the dual reload
by *not* reloading %gs in that case, even if it has changed. The
UTS then becomes compare-if-different-reload as well. You could
probably do this with a flag in a mailbox associated with the UTS.

The "extra" cost is a compare in both places, which (probably)
avoids a statistically significant number of %gs reloads.
-- Terry