From: Stephan Uphoff
To: Robert Watson
Cc: src-committers@FreeBSD.org, John Baldwin, Alan Cox, cvs-src@FreeBSD.org,
    Mike Silbersack, cvs-all@FreeBSD.org
Subject: Re: cvs commit: src/sys/i386/i386 pmap.c
Date: Sat, 06 Nov 2004 17:24:12 -0500
Message-Id: <1099779852.8097.68.camel@palm.tree.com>

On Fri, 2004-11-05 at 07:01, Robert Watson wrote:
> On Fri, 29 Oct 2004, Mike Silbersack wrote:
>
> > I think we really need some sort of light-weight critical_enter that
> > simply assures you that you won't get rescheduled to another CPU, but
> > gives no guarantees beyond that.
>
> > Er, wait - I guess I'm forgetting something, there exists the potential
> > for the interrupt that preempted whatever was calling arc4random to also
> > call arc4random, thereby breaking things...
>
> I've been looking at related issues for the last couple of days and must
> have missed this thread while at EuroBSDCon. Alan Cox pointed me at it,
> so here I am. :-)
>
> Right now, the cost of acquiring and dropping an uncontended sleep mutex
> on a UP kernel is very low -- about 21 cycles on my PIII and 40 on my P4,
> including some efficiency problems in my measurement which probably add a
> non-trivial overhead. Compare this with the SMP versions on the PIII (90
> cycles) and P4 (260 cycles!). Critical sections on the SMP PIII are about
> the same cost as the SMP mutex, but on the P4 a critical section is less
> than half the cost. Getting to a model where critical sections were as
> cheap as UP sleep mutexes, or where we could use a similar combination of
> primitives (such as UP mutexes with pinning) would be very useful.
> Otherwise, optimizing through use of critical sections will improve SMP
> but potentially damage performance on UP. There's been a fair amount of
> discussion of such approaches, including the implementation briefly
> present in the FreeBSD. I know John Baldwin and Justin Gibbs both have
> theories and plans in this area.
>
> If we do create a UP mutex primitive for use on SMP, I would suggest we
> actually expand the contents of the UP mutex structure slightly to include
> a cpu number that can be asserted, along with pinning, when an operation
> is attempted and INVARIANTS is present. One of the great strengths of the
> mutex/lock model is a strong assertion capability, both for the purposes
> of documentation and testing, so we should make sure that carries into any
> new synchronization primitives.
>
> Small table of synchronization primitives below; in each case, the count
> is in cycles and reflects the cost of acquiring and dropping the primitive
> (lock+unlock, enter+exit). The P4 is a 3ghz box, and the PIII is an
> 800mhz box. Note that the synchronization primitives requiring atomic
> operations are substantially pessimized on the P4 vs the PIII.
>
> A discussion with John Baldwin and Scott Long yesterday revealed that the
> UP spin mutex is currently pessimized from a critical section to a
> critical section plus mutex internals due to a need for mtx_owned() on
> spin locks. I'm not convinced that explains the entire performance
> irregularity I see for P4 spin mutexes on UP, however. Note that 39 (P4
> UP sleep mutex) + 120 (P4 UP critical section) is not 274 (P4 UP spin
> mutex) by a fair amount. Figuring out what's going on there would be a
> good idea, although it could well be a property of my measurement
> environment. I'm currently using this to do measurements:
>
> //depot/user/rwatson/percpu/sys/test/test_synch_timing.c
>
> In all of the below, the standard deviation is very small if you're
> careful about not bumping into hard clock or other interrupts during
> testing, especially when it comes to spin mutexes and critical sections.
>
> Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
> robert@fledge.watson.org      Principal Research Scientist, McAfee Research
>
>            sleep mutex     crit section    spin mutex
>            UP      SMP     UP      SMP     UP      SMP
> PIII       21      90      83      81      112     141
> P4         39      260     120     119     274     342

Nice catch!

On a UP kernel, releasing a spin mutex involves an xchgl operation, while
releasing an uncontested sleep mutex uses cmpxchgl. Since xchgl with a
memory operand does an implicit LOCK (and cmpxchgl does NOT unless it is
explicitly prefixed), this could explain why the spin mutex needs a lot
more cycles. This should be easy to fix, since the xchgl is not needed on
a UP system.

Right now I am sick and don't trust my own code, so I won't write a patch
for the next few days ... hopefully someone else can get to it first.

Stephan
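
To make the xchgl vs. cmpxchgl point above concrete, here is a minimal
sketch of the two release styles. It is not the FreeBSD mutex code: the
names MTX_UNOWNED, MTX_OWNED, release_xchg(), release_cmpxchg_up() and
release_demo() are hypothetical and exist only for this illustration. The
inline assembly assumes GCC-style syntax on i386/amd64; on any other
target the sketch falls back to plain C with the same results. The
unlocked cmpxchgl variant is only reasonable on a UP system, where a
single instruction cannot be interleaved with another CPU's access.

```c
#include <assert.h>
#include <stdint.h>

#define MTX_UNOWNED 0u	/* hypothetical "free" value for this sketch */
#define MTX_OWNED   1u	/* hypothetical "held" value for this sketch */

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))

/*
 * Release by swapping in MTX_UNOWNED with xchgl.  With a memory operand
 * the CPU locks the access implicitly, even without a LOCK prefix.
 * Returns the previous lock value.
 */
static inline uint32_t
release_xchg(volatile uint32_t *lock)
{
	uint32_t val = MTX_UNOWNED;

	__asm__ __volatile__("xchgl %0, %1"
	    : "+r" (val), "+m" (*lock)
	    :
	    : "memory");
	return (val);
}

/*
 * Release with cmpxchgl and no LOCK prefix (UP only).  Stores
 * MTX_UNOWNED if *lock == expected; returns 1 on success, 0 otherwise.
 */
static inline int
release_cmpxchg_up(volatile uint32_t *lock, uint32_t expected)
{
	uint8_t ok;

	__asm__ __volatile__("cmpxchgl %3, %1\n\tsete %0"
	    : "=q" (ok), "+m" (*lock), "+a" (expected)
	    : "r" (MTX_UNOWNED)
	    : "memory", "cc");
	return (ok);
}

#else	/* portable fallback: same results, no particular instruction */

static inline uint32_t
release_xchg(volatile uint32_t *lock)
{
	uint32_t old = *lock;

	*lock = MTX_UNOWNED;
	return (old);
}

static inline int
release_cmpxchg_up(volatile uint32_t *lock, uint32_t expected)
{
	if (*lock != expected)
		return (0);
	*lock = MTX_UNOWNED;
	return (1);
}

#endif

/* Exercise both release paths on locks that start out held. */
static void
release_demo(void)
{
	volatile uint32_t spin_lock = MTX_OWNED;
	volatile uint32_t sleep_lock = MTX_OWNED;

	assert(release_xchg(&spin_lock) == MTX_OWNED);
	assert(spin_lock == MTX_UNOWNED);

	assert(release_cmpxchg_up(&sleep_lock, MTX_OWNED) == 1);
	assert(sleep_lock == MTX_UNOWNED);
}
```

Both functions leave the lock word in the same state; the difference is
only that the xchgl path always pays for a locked access, which is the
cost suspected above for the UP spin mutex numbers.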