From owner-freebsd-arch  Wed May 24 19:18:32 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from berserker.bsdi.com (berserker.twistedbit.com [199.79.183.1])
	by hub.freebsd.org (Postfix) with ESMTP id E7A4637B9AF
	for <arch@freebsd.org>; Wed, 24 May 2000 19:18:27 -0700 (PDT)
	(envelope-from cp@berserker.bsdi.com)
Received: from berserker.bsdi.com (cp@[127.0.0.1])
	by berserker.bsdi.com (8.9.3/8.9.3) with ESMTP id UAA16278;
	Wed, 24 May 2000 20:18:17 -0600 (MDT)
Message-Id: <200005250218.UAA16278@berserker.bsdi.com>
To: Matthew Dillon <dillon@apollo.backplane.com>
Cc: arch@freebsd.org
Subject: Re: Short summary 
From: Chuck Paterson <cp@bsdi.com>
Date: Wed, 24 May 2000 20:18:17 -0600
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


}    Chuck, there was extensive debate and testing on both Linux and
}    FreeBSD with regards to locked instructions in an SMP environment.
}    It was determined that there is an optimization one can make which
}    improves lock performance on SMP systems.
}


}    The gist of the optimization is that if you use a lock prefix when
}    locking, you do *not* need a lock prefix when unlocking.  Write
}    ordering is guaranteed on Intel (586 or above).

	This won't work with the BSD/OS locks. The reason is that
	we use the same word to detect that someone is waiting
	for the lock to be released. This works with spin
	locks, sort of (more just ahead), because you don't
	need to do anything if someone else wants the lock;
	you just go ahead and release it. With non-spin
	locks, when you release a contested lock you need
	to go put another process on the run queue.

	The "more just ahead" is actually addressed by you further on.
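
	A minimal sketch of why the release has to be a locked
	operation when the waiter indication lives in the lock word.
	This is not the BSD/OS code: the layout (owner id with bit 0
	used as a waiters flag), the lk_* names, and the wakeups
	counter standing in for the run queue are all assumptions
	made for illustration, written with C11 atomics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout: the word holds the owner id, and bit 0 means
 * "someone is waiting".  Owner ids must therefore have bit 0 clear. */
#define LK_UNOWNED ((uintptr_t)0)
#define LK_WAITERS ((uintptr_t)1)

struct sleep_lock {
    _Atomic uintptr_t word;
    int wakeups;    /* stands in for "put another process on the run queue" */
};

/* Uncontested acquire: one cmpxchg from "unowned" to our id. */
static int lk_try_enter(struct sleep_lock *lk, uintptr_t self)
{
    uintptr_t expected = LK_UNOWNED;
    return atomic_compare_exchange_strong(&lk->word, &expected, self);
}

/* A contender that lost the race flags the lock as contested before it
 * sleeps.  Returns 1 if the flag is set, 0 if the lock was released in
 * the meantime and the caller should simply retry the acquire. */
static int lk_mark_waiting(struct sleep_lock *lk)
{
    uintptr_t cur = atomic_load(&lk->word);
    while (cur != LK_UNOWNED) {
        /* On failure, cur is reloaded with the current word. */
        if (atomic_compare_exchange_weak(&lk->word, &cur, cur | LK_WAITERS))
            return 1;
    }
    return 0;
}

/* Release.  An unlocked "test the flag, then store 0" would leave a
 * window in which a contender sets LK_WAITERS after the test; the store
 * would then wipe the flag and the wakeup would be lost.  The cmpxchg
 * closes that window: it succeeds only if nobody set the flag. */
static void lk_exit(struct sleep_lock *lk, uintptr_t self)
{
    uintptr_t expected = self;
    if (atomic_compare_exchange_strong(&lk->word, &expected, LK_UNOWNED))
        return;                         /* uncontested, nothing to wake */
    assert(expected == (self | LK_WAITERS));
    atomic_store(&lk->word, LK_UNOWNED);
    lk->wakeups++;                      /* contested: wake a waiter */
}
```

	For a spin lock the second half of lk_exit has nothing to
	do, which is why the unlocked release is workable there.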
}
}    Also, for recursive locks for the case where you ALREADY hold the lock,
}    you do not need a lock prefix when incrementing or decrementing the
}    count.
}

	The BSD/OS mutexes generally use the locked operation and
	take a miss on the mutex if it is already held, even
	by the same process. There is a flag to
	mtx_enter/mtx_exit indicating that recursion is likely and
	that the code should check for this before doing the
	locked operation.

	By default BSD/OS mutexes are always optimized for the
	non-contested, non-recursed case. This means that
	everything is just a cmpxchg, and if that wins you're
	done.
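
	Roughly, the enter path described above might look like
	the following. This is an illustrative sketch only: the
	sk_* names, the SK_RECURSE_LIKELY flag, the structure
	layout, and the slow_entries counter are hypothetical and
	are not the real mtx_enter/mtx_exit interface. Waiter
	handling is left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SK_UNOWNED        ((uintptr_t)0)
#define SK_RECURSE_LIKELY 0x1   /* hypothetical "recursion is likely" hint */

struct sk_mtx {
    _Atomic uintptr_t owner;    /* 0 when free, else the holder's id */
    unsigned recurse;           /* extra acquisitions by the holder */
    unsigned slow_entries;      /* demo only: counts cmpxchg misses */
};

/* Slow path, reached only after the cmpxchg has missed. */
static int sk_mtx_enter_slow(struct sk_mtx *m, uintptr_t self)
{
    m->slow_entries++;
    if (atomic_load_explicit(&m->owner, memory_order_relaxed) == self) {
        m->recurse++;           /* recursion discovered after the miss */
        return 1;
    }
    return 0;   /* held by someone else; a real mutex would spin or sleep */
}

static int sk_mtx_enter(struct sk_mtx *m, uintptr_t self, int flags)
{
    uintptr_t expected = SK_UNOWNED;

    /* With the hint, test for recursion with a plain load before paying
     * for the locked operation.  Only the holder can observe its own id
     * in the word, so the count needs no locked instruction. */
    if ((flags & SK_RECURSE_LIKELY) &&
        atomic_load_explicit(&m->owner, memory_order_relaxed) == self) {
        m->recurse++;
        return 1;
    }

    /* Default: a single cmpxchg, and if that wins we are done. */
    if (atomic_compare_exchange_strong(&m->owner, &expected, self))
        return 1;

    return sk_mtx_enter_slow(m, self);
}

static void sk_mtx_exit(struct sk_mtx *m, uintptr_t self)
{
    assert(atomic_load_explicit(&m->owner, memory_order_relaxed) == self);
    if (m->recurse > 0) {
        m->recurse--;           /* still held by us */
        return;
    }
    atomic_store(&m->owner, SK_UNOWNED);
}
```

	Without the hint a recursive enter costs one failed
	cmpxchg before the recursion is noticed; with it, the
	non-recursed case pays for an extra load and compare.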

}    the cost of adding a slight delay before contending cpu's see the 
}    change.  Since there is no lock contention 99.999% of the time, the
}    delay is completely absorbed and you realize an increase in performance
}    across the board.
}

}    The recursion optimization makes recursive locks practical in an SMP 
}    setting.  There is virtually *NO* overhead after you've obtained the
}    initial lock.
}
}						-Matt
}

