From: Chuck Paterson
To: Matthew Dillon
Cc: arch@freebsd.org
Subject: Re: Short summary
Date: Wed, 24 May 2000 20:18:17 -0600
Message-Id: <200005250218.UAA16278@berserker.bsdi.com>

}     Chuck, there was extensive debate and testing on both Linux and
}     FreeBSD with regards to locked instructions in an SMP environment.
}     It was determined that there is an optimization one can make which
}     improves lock performance on SMP systems.
}
}     The gist of the optimization is that if you use a lock prefix when
}     locking, you do *not* need a lock prefix when unlocking.  Write
}     ordering is guaranteed on Intel (586 or above).

This won't work with the BSD/OS locks. The reason is that we use the
same word to detect that someone is waiting for the lock to be
released. This works with spin locks, kind of (more just ahead),
because you don't need to do anything if someone else wants the lock;
you just go ahead and release it. With non-spin locks, when you release
a contested lock you need to go put another process on the run queue.
The "more just ahead" is actually addressed by you below.

}
}     Also, for recursive locks for the case where you ALREADY hold the lock,
}     you do not need a lock prefix when incrementing or decrementing the
}     count.
}

The BSD/OS mutexes generally use the locked operation and take a miss
on the mutex if it is already held, even by the same process.
There is a flag on mtx_enter/mtx_exit to indicate that recursion is
likely and that the code should check for it before doing the locked
operation. By default BSD/OS mutexes are always optimized for the
non-contested, non-recursed case. This means that everything is just a
cmpxchg, and if that wins you're done.

}     the cost of adding a slight delay before contending CPUs see the
}     change.  Since there is no lock contention 99.999% of the time, the
}     delay is completely absorbed and you realize an increase in performance
}     across the board.
}
}     The recursion optimization makes recursive locks practical in an SMP
}     setting.  There is virtually *NO* overhead after you've obtained the
}     initial lock.
}
}                                               -Matt
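
A rough sketch of the scheme described in the reply above may help show
why the release needs a locked operation once the lock word also
carries the "someone is waiting" indication. This is not BSD/OS source.
Every name here (sketch_mtx, sketch_enter, sketch_exit,
sketch_mark_contested, MTX_CONTESTED, SKETCH_RLIKELY) is hypothetical,
C11 atomics stand in for a lock-prefixed cmpxchg, owner ids are assumed
to have their low bit clear, and the slow path that sleeps or puts a
process on the run queue is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding: the lock word holds the owner id, and the low
 * bit is set by a waiter to say "someone is waiting for this lock". */
#define MTX_UNOWNED    ((uintptr_t)0)
#define MTX_CONTESTED  ((uintptr_t)1)
#define SKETCH_RLIKELY 0x1   /* hypothetical "recursion is likely" hint */

struct sketch_mtx {
    _Atomic uintptr_t lock;  /* owner id, possibly OR'ed with MTX_CONTESTED */
    unsigned recurse;        /* recursion depth, touched only by the owner */
};

/* Acquire. Returns false on a miss; a real mutex would then take a slow
 * path (not shown). Without the hint, a recursive attempt by the owner
 * also misses, because only the compare-and-swap is tried. */
static bool sketch_enter(struct sketch_mtx *m, uintptr_t self, int flags)
{
    if (flags & SKETCH_RLIKELY) {
        uintptr_t cur = atomic_load_explicit(&m->lock, memory_order_relaxed);
        if ((cur & ~MTX_CONTESTED) == self) {
            m->recurse++;    /* already the owner: no locked operation */
            return true;
        }
    }
    uintptr_t expected = MTX_UNOWNED;
    /* Default fast path: one compare-and-swap; if it wins we are done. */
    return atomic_compare_exchange_strong(&m->lock, &expected, self);
}

/* Called by a waiter that found the lock held. */
static void sketch_mark_contested(struct sketch_mtx *m)
{
    atomic_fetch_or(&m->lock, MTX_CONTESTED);
}

/* Release. Returns true when a waiter must be made runnable. A plain
 * store of MTX_UNOWNED here could overwrite a contested bit set a moment
 * earlier, and the waiter would never be woken. */
static bool sketch_exit(struct sketch_mtx *m, uintptr_t self, int flags)
{
    if ((flags & SKETCH_RLIKELY) && m->recurse > 0) {
        m->recurse--;        /* unwinding a recursive hold */
        return false;
    }
    uintptr_t expected = self;
    if (atomic_compare_exchange_strong(&m->lock, &expected, MTX_UNOWNED))
        return false;        /* uncontested release */
    /* The word was self|MTX_CONTESTED: release, then wake a waiter. */
    atomic_store(&m->lock, MTX_UNOWNED);
    return true;
}
```

With a spin lock the return value of the release could be ignored,
since a spinning waiter notices the change by itself. The sleeping case
is the one where missing the contested bit loses a wakeup.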