Date: Wed, 24 May 2000 20:22:43 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Chuck Paterson <cp@bsdi.com>
Cc: arch@freebsd.org
Subject: Re: Short summary
Message-ID: <200005250322.UAA78579@apollo.backplane.com>
References: <200005250218.UAA16278@berserker.bsdi.com>
:} The gist of the optimization is that if you use a lock prefix when
:} locking, you do *not* need a lock prefix when unlocking. Write
:} ordering is guaranteed on Intel (586 or above).
:
: This won't work with the BSD/OS locks. The reason is that
: we use the same word to detect that someone is waiting
: for the lock to be released. This works with spin
: locks, kind of (more just ahead), because you don't
: need to do anything if someone else wants the lock;
: you just go ahead and release it. With non-spin
: locks, when you release a contested lock you need
: to go put another process on the run queue.
Ouch, having the contending cpu actually do a locked write
to the lock (i.e. cache line) held by another cpu is really,
really slow. Both processors will eat the full overhead of
the hardware cache coherency protocol. It's about 3 times
as expensive as a contended lock without the ping-pong writing
and about twice as expensive as a non-contending lock,
and recursive locks using this model will be about 5x as expensive
even in the best case.
If there is any way to avoid this, I would avoid it.
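
The optimization quoted at the top (locked operation to acquire, ordinary
write to release) can be sketched in C11 atomics. This is only a minimal
illustration, not the BSD/OS or FreeBSD mutex code: the names sketch_spin_t,
sketch_spin_try, sketch_spin_lock and sketch_spin_unlock are hypothetical, and
it assumes a pure spin lock whose lock word carries no waiter information, so
the release never has to inspect the word. Waiters spin on plain loads so that
they do not write to the cache line the holder owns.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spin lock for illustration only. */
typedef struct {
    atomic_int held;            /* 0 = free, 1 = held */
} sketch_spin_t;

/* One attempt to take the lock; returns true on success. */
static bool sketch_spin_try(sketch_spin_t *l)
{
    /* Atomic read-modify-write. On x86 this becomes xchg with a memory
     * operand, which is implicitly locked. */
    return atomic_exchange_explicit(&l->held, 1, memory_order_acquire) == 0;
}

static void sketch_spin_lock(sketch_spin_t *l)
{
    while (!sketch_spin_try(l)) {
        /* Wait with plain loads only, so the waiter does not keep writing
         * to the lock word while another cpu holds it. */
        while (atomic_load_explicit(&l->held, memory_order_relaxed) != 0)
            ;
    }
}

static void sketch_spin_unlock(sketch_spin_t *l)
{
    /* Debug check: only a held lock may be released. */
    assert(atomic_load_explicit(&l->held, memory_order_relaxed) == 1);
    /* Release store. On x86 compilers typically emit an ordinary mov
     * here, with no lock prefix. */
    atomic_store_explicit(&l->held, 0, memory_order_release);
}
```

As Chuck points out above, this release is only safe when nobody else needs to
record anything in the lock word. A lock that keeps a "someone is waiting"
indication in the same word cannot use the plain store as written.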
: The "more just ahead" is addressed by you ahead, actually.
:}
:} Also, for recursive locks for the case where you ALREADY hold the lock,
:} you do not need a lock prefix when incrementing or decrementing the
:} count.
:}
:
: The BSD/OS mutexes generally use the locked operation and
: take a miss on the mutex if it is already held, even
: by the same process. There is a flag on
: mtx_enter/mtx_exit to indicate that recursion is likely and
: that the code should check for this before doing the
: locked operation.
:
: By default BSD/OS mutexes are always optimized for the
: non-contested, non-recursed case. This means that
: everything is just a cmpxchg and if that wins you're
: done.
If you can get rid of the contending-cpu-writes-to-the-lock
case, your best-case recursion code will be about 5 times
faster and your best-case non-contending, non-recursive
lock will be about twice as fast.
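
The recursion point quoted earlier (no lock prefix is needed on the count when
you ALREADY hold the lock) can be sketched the same way. Again this is a
hedged illustration rather than the actual mtx_enter/mtx_exit code: the names
sketch_rmtx_t, sketch_rmtx_try and sketch_rmtx_exit are hypothetical, the
caller is assumed to pass a nonzero id that is unique to it, and only the
owner ever touches the depth field. The first acquisition is a single
compare-and-swap (lock cmpxchg on x86); re-entry by the owner is an ordinary
load and increment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical recursive mutex for illustration only. */
typedef struct {
    _Atomic uintptr_t owner;    /* 0 = free, otherwise the owner's id */
    unsigned depth;             /* recursion count, touched only by the owner */
} sketch_rmtx_t;

/* One attempt to enter; self must be nonzero and unique to the caller. */
static bool sketch_rmtx_try(sketch_rmtx_t *m, uintptr_t self)
{
    /* Recursive case: only the caller itself can have stored self here,
     * so a plain load and an unlocked increment are enough. */
    if (atomic_load_explicit(&m->owner, memory_order_relaxed) == self) {
        m->depth++;
        return true;
    }

    /* First acquisition: one locked compare-and-swap from 0 to self. */
    uintptr_t expected = 0;
    if (atomic_compare_exchange_strong_explicit(&m->owner, &expected, self,
            memory_order_acquire, memory_order_relaxed)) {
        m->depth = 1;
        return true;
    }
    return false;               /* held by someone else */
}

static void sketch_rmtx_exit(sketch_rmtx_t *m, uintptr_t self)
{
    /* Debug check: only the owner may exit. */
    assert(atomic_load_explicit(&m->owner, memory_order_relaxed) == self);
    /* Unlocked decrement; the final exit is a plain release store. */
    if (--m->depth == 0)
        atomic_store_explicit(&m->owner, 0, memory_order_release);
}
```

A contending caller that fails sketch_rmtx_try never writes to the lock word,
which is the case the reply above argues should be avoided. How a blocked
caller is then put to sleep and woken up is deliberately left out.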
-Matt
Matthew Dillon
<dillon@backplane.com>
