From owner-cvs-all@FreeBSD.ORG Fri Nov 5 12:02:11 2004
Date: Fri, 5 Nov 2004 12:01:14 +0000 (GMT)
From: Robert Watson
To: Mike Silbersack
cc: Alan Cox
cc: cvs-src@FreeBSD.org
cc: src-committers@FreeBSD.org
cc: cvs-all@FreeBSD.org
cc: John Baldwin
In-Reply-To: <20041029174131.A6530@odysseus.silby.com>
Subject: Re: cvs commit: src/sys/i386/i386 pmap.c
List-Id: CVS commit messages for the entire tree

On Fri, 29 Oct 2004, Mike Silbersack wrote:

> I think we really need some sort of light-weight critical_enter that
> simply assures you that you won't get rescheduled to another CPU, but
> gives no guarantees beyond that.

> Er, wait - I guess I'm forgetting something, there exists the potential
> for the interrupt that preempted whatever was calling arc4random to also
> call arc4random, thereby breaking things...

I've been looking at related issues for the last couple of days and must have missed this thread while at EuroBSDCon.
Alan Cox pointed me at it, so here I am. :-)

Right now, the cost of acquiring and dropping an uncontended sleep mutex on a UP kernel is very low -- about 21 cycles on my PIII and 40 on my P4, including some inefficiencies in my measurement which probably add a non-trivial overhead. Compare this with the SMP versions on the PIII (90 cycles) and P4 (260 cycles!). Critical sections on the SMP PIII are about the same cost as the SMP mutex, but on the P4 a critical section is less than half the cost. Getting to a model where critical sections were as cheap as UP sleep mutexes, or where we could use a similar combination of primitives (such as UP mutexes with pinning), would be very useful. Otherwise, optimizing through use of critical sections will improve SMP but potentially damage performance on UP. There's been a fair amount of discussion of such approaches, including the implementation briefly present in FreeBSD. I know John Baldwin and Justin Gibbs both have theories and plans in this area.

If we do create a UP mutex primitive for use on SMP, I would suggest we actually expand the contents of the UP mutex structure slightly to include a CPU number that can be asserted, along with pinning, whenever an operation is attempted and INVARIANTS is compiled in. One of the great strengths of the mutex/lock model is a strong assertion capability, both for the purposes of documentation and testing, so we should make sure that carries into any new synchronization primitives.

Small table of synchronization primitives below; in each case, the count is in cycles and reflects the cost of acquiring and dropping the primitive (lock+unlock, enter+exit). The P4 is a 3GHz box, and the PIII is an 800MHz box. Note that the synchronization primitives requiring atomic operations are substantially pessimized on the P4 vs the PIII.
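To make the suggestion above about carrying a CPU number in the UP mutex structure a little more concrete, here is a minimal userland sketch. Every name in it (struct up_mtx, struct fake_thread, up_mtx_lock and so on) is hypothetical and invented for illustration; none of it is FreeBSD code, and it models pinning with a simple counter rather than any real scheduler interface. It also ignores contention handling and interrupt preemption entirely, and it returns false where an INVARIANTS kernel would presumably panic, only so that the checks can be exercised.

```c
#include <assert.h>
#include <stdbool.h>

/* All names below are hypothetical; this is not FreeBSD code. */
struct fake_thread {
	int td_cpu;	/* CPU the thread is currently running on */
	int td_pinned;	/* pin nesting count; > 0 means no migration */
};

struct up_mtx {
	bool um_held;	/* plain flag, no atomics: one CPU only */
	int  um_cpu;	/* CPU the lock is bound to, for assertions */
};

static void
up_mtx_init(struct up_mtx *m, int cpu)
{
	m->um_held = false;
	m->um_cpu = cpu;
}

/* The condition an INVARIANTS build would assert on every operation. */
static bool
up_mtx_usable(const struct up_mtx *m, const struct fake_thread *td)
{
	return (td->td_pinned > 0 && td->td_cpu == m->um_cpu);
}

/* Returns false instead of panicking so the sketch stays testable. */
static bool
up_mtx_lock(struct up_mtx *m, struct fake_thread *td)
{
	td->td_pinned++;	/* pin first, then inspect the lock */
	if (!up_mtx_usable(m, td) || m->um_held) {
		td->td_pinned--;
		return (false);
	}
	m->um_held = true;
	return (true);
}

static void
up_mtx_unlock(struct up_mtx *m, struct fake_thread *td)
{
	assert(m->um_held && up_mtx_usable(m, td));
	m->um_held = false;
	td->td_pinned--;	/* drop the pin taken in up_mtx_lock() */
}
```

The point of the extra um_cpu field is only that a thread on the wrong CPU, or one that is not pinned, gets caught at the lock operation rather than silently corrupting per-CPU state.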
A discussion with John Baldwin and Scott Long yesterday revealed that the UP spin mutex is currently pessimized from a critical section to a critical section plus mutex internals due to a need for mtx_owned() on spin locks. I'm not convinced that explains the entire performance irregularity I see for P4 spin mutexes on UP, however. Note that 39 (P4 UP sleep mutex) + 120 (P4 UP critical section) is not 274 (P4 UP spin mutex) by a fair amount. Figuring out what's going on there would be a good idea, although it could well be a property of my measurement environment. I'm currently using this to do measurements:

  //depot/user/rwatson/percpu/sys/test/test_synch_timing.c

In all of the below, the standard deviation is very small if you're careful about not bumping into hard clock or other interrupts during testing, especially when it comes to spin mutexes and critical sections.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Principal Research Scientist, McAfee Research

        sleep mutex     crit section    spin mutex
        UP      SMP     UP      SMP     UP      SMP
PIII    21      90      83      81      112     141
P4      39      260     120     119     274     342
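The arithmetic behind the 39 + 120 vs 274 remark can be checked mechanically from the table. The sketch below only encodes the table's numbers and derives two quantities from them; the type and function names (struct cost, spin_gap, smp_over_up_pct) are made up for this illustration and have nothing to do with the measurement code in the depot.

```c
#include <assert.h>

/* Row and column indices for the table; names are illustrative only. */
enum { PIII, P4, NMACHINE };
enum { SLEEP_MTX, CRIT, SPIN_MTX, NPRIM };

struct cost {
	int up;		/* cycles for acquire+drop on a UP kernel */
	int smp;	/* cycles for acquire+drop on an SMP kernel */
};

/* Cycle counts copied from the table in the message. */
static const struct cost costs[NMACHINE][NPRIM] = {
	[PIII] = { { 21,  90 }, {  83,  81 }, { 112, 141 } },
	[P4]   = { { 39, 260 }, { 120, 119 }, { 274, 342 } },
};

/*
 * UP spin mutex cycles not accounted for by adding the UP sleep
 * mutex and the UP critical section together.
 */
static int
spin_gap(int machine)
{
	const struct cost *c = costs[machine];

	return (c[SPIN_MTX].up - (c[SLEEP_MTX].up + c[CRIT].up));
}

/* SMP cost as an integer percentage of the UP cost (truncated). */
static int
smp_over_up_pct(int machine, int prim)
{
	const struct cost *c = &costs[machine][prim];

	return (c->smp * 100 / c->up);
}
```

With these numbers, spin_gap(P4) comes out at 115 cycles, while spin_gap(PIII) is only 8, so on the PIII the "critical section plus mutex internals" explanation roughly adds up and on the P4 it does not.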