Date: Wed, 24 May 2000 20:22:43 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Chuck Paterson <cp@bsdi.com>
Cc: arch@freebsd.org
Subject: Re: Short summary
Message-ID: <200005250322.UAA78579@apollo.backplane.com>
References: <200005250218.UAA16278@berserker.bsdi.com>

:} The gist of the optimization is that if you use a lock prefix when
:} locking, you do *not* need a lock prefix when unlocking.  Write
:} ordering is guaranteed on Intel (586 or above).
:
:	This won't work with the BSD/OS locks. The reason is that
:	we use the same word to detect that someone is waiting
:	for the lock to be released. This works with spin
:	locks, kind of (more just ahead), because you don't
:	need to do anything if someone else wants the lock;
:	you just go ahead and release it. With non-spin
:	locks, when you release a contested lock you need
:	to go put another process on the run queue.

    Ouch, having the contending cpu actually do a locked write to the
    lock (i.e. cache line) held by another cpu is really, really slow.
    Both processors will eat the full overhead of the hardware cache
    coherency protocol.  It's about 3 times as expensive as a contended
    lock without the ping-pong writing and about twice as expensive as
    a non-contending lock, and recursive locks using this model will be
    about 5x as expensive even in the best case.  If there is any way
    to avoid this, I would avoid it.

:	The "more just ahead" is addressed by you ahead, actually.
:}
:} Also, for recursive locks for the case where you ALREADY hold the lock,
:} you do not need a lock prefix when incrementing or decrementing the
:} count.
:}
:
:	The BSD/OS mutexes generally use the locked operation and
:	take a miss on the mutex if it is already held, even
:	by the same process. There is a flag on
:	mtx_enter/mtx_exit saying that recursion is likely and
:	that the code should check this before doing the
:	locked operation.
:
:	By default BSD/OS mutexes are always optimized for the
:	non-contested, non-recursed case. This means that
:	everything is just a cmpxchg, and if that wins you're
:	done.
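    To make the quoted optimization concrete, here is a rough sketch of
    a recursive spin lock in C11 atomics.  It is only an illustration
    under assumptions: the names (struct rspin, rspin_enter, rspin_exit)
    and the caller-supplied owner id are made up for this example and
    are not the BSD/OS or FreeBSD mutex interface.  On x86 the
    compare-exchange should compile to a lock cmpxchg and the release
    store to an ordinary mov.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical recursive spin lock; all names here are illustrative only. */
struct rspin {
    _Atomic uintptr_t owner;  /* 0 when free, otherwise the holder's id */
    unsigned depth;           /* recursion count, touched only by the holder */
};

/* 'self' is an assumed nonzero id unique to the calling cpu or thread. */
static void rspin_enter(struct rspin *l, uintptr_t self)
{
    /* Already ours: plain load and plain increment, no lock prefix. */
    if (atomic_load_explicit(&l->owner, memory_order_relaxed) == self) {
        l->depth++;
        return;
    }
    for (;;) {
        uintptr_t expected = 0;
        /* The only locked instruction on the acquire path. */
        if (atomic_compare_exchange_weak_explicit(&l->owner, &expected, self,
                memory_order_acquire, memory_order_relaxed)) {
            l->depth = 1;
            return;
        }
        /* Spin read-only so we do not write to the holder's cache line. */
        while (atomic_load_explicit(&l->owner, memory_order_relaxed) != 0)
            ;
    }
}

static void rspin_exit(struct rspin *l)
{
    assert(l->depth > 0);     /* caller must hold the lock */
    if (--l->depth == 0) {
        /* Release store: relies on ordered writes, needs no lock prefix. */
        atomic_store_explicit(&l->owner, 0, memory_order_release);
    }
}
```

    Note that the plain release store only works because nothing else
    is encoded in the lock word.  A scheme that keeps a "someone is
    waiting" indication in the same word, as described in the quoted
    text, cannot simply overwrite it on release.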
    If you can get rid of the contending-cpu-writes-to-the-lock case,
    your best case recursion code will be about 5 times faster in the
    recursion case and your best case non-contending non-recursive lock
    case will be about twice as fast.

					-Matt
					Matthew Dillon
					<dillon@backplane.com>


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message