From: John Baldwin
To: freebsd-current@FreeBSD.org
cc: Bruce M Simpson
cc: Gerrit Nagelhout
cc: Andrew Gallatin
Date: Thu, 6 May 2004 14:17:51 -0400
Subject: Re: 4.7 vs 5.2.1 SMP/UP bridging performance
Message-Id: <200405061417.51886.jhb@FreeBSD.org>
In-Reply-To: <20040507031253.Y21938@gamplex.bde.org>
References: <20040506150754.GC27139@empiric.dek.spc.org> <20040507031253.Y21938@gamplex.bde.org>

On Thursday 06 May 2004 01:18 pm, Bruce Evans wrote:
> On Thu, 6 May 2004, Bruce M Simpson wrote:
> > On Thu, May 06, 2004 at 10:15:44AM -0400, Andrew Gallatin wrote:
> > > For what it's
worth, using those operations yields these results
> > > on my 2.53GHz P4 (for UP)
> > >
> > > Mutex (atomic_store_rel_int) cycles per iteration: 208
> > > Mutex (sfence) cycles per iteration: 85
> > > Mutex (lfence) cycles per iteration: 63
> > > Mutex (mfence) cycles per iteration: 169
> > > Mutex (none) cycles per iteration: 18
> > >
> > > lfence looks like a winner..
> >
> > Please be aware, though, that the different FENCE instructions are acting
> > as fences against different things. The NASM documentation has a good
> > quick reference for what each of the instructions does, but the definitive
> > reference is Intel's IA-32 programmer's reference manuals.
>
> They are also documented in amd64 manuals.
>
> Don't they all act as fences only on the same CPU, so they are no help
> for SMP? They are still almost twice as slow as full locks on Athlons,
> so hopefully they do more.

They are a traditional membar, like membar on Sparc or acq/rel on ia64.
Membars only have to apply to the current CPU, but you have to use them
in conjunction with a memory address used to implement a lock. Thus,
when you acquire a lock, you want to use an lfence to ensure that the
CPU won't perform later loads ahead of the lfence (assuming lfence is
like ia64 acq and sfence is like ia64 rel). This ensures that you don't
read any of the locked values until you have the lock. On release you
would use an sfence to ensure that all earlier stores are performed
before the store that actually releases the lock.

The fence doesn't push out the pending writes to the other CPUs.
However, it does mean that another CPU won't see that the lock is
released unless it can also see all the other stores before the sfence.
Thus, you can actually have a CPU spin waiting for a lock that is
already unlocked. I've seen this on my test Alpha (DS20), where CPU0
unlocked sched_lock, CPU1 logged a KTR trace saying it was starting to
spin on sched_lock, and a short time later CPU1 logged that it had
gotten sched_lock.
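
The acquire/release pairing described above can be sketched in portable C.
This is only an illustration under stated assumptions: it uses C11
<stdatomic.h> memory orders in place of raw lfence/sfence or FreeBSD's own
atomic primitives, and struct spinlock, spin_init, spin_try_acquire,
spin_acquire, spin_release and locked_increment are hypothetical names, not
the kernel's mutex code. On x86 a compiler will typically implement the
acquire as a locked compare-and-exchange and the release as a plain store,
so the fence instructions themselves need not appear.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spin lock for illustration; not FreeBSD's mutex code. */
struct spinlock {
	atomic_int locked;	/* 0 = free, 1 = held */
};

static void
spin_init(struct spinlock *l)
{
	atomic_init(&l->locked, 0);
}

/*
 * Try to take the lock once.  Acquire ordering on success means that
 * loads and stores issued after this call cannot be performed before
 * the lock word is seen as ours, so protected data is not read early.
 */
static bool
spin_try_acquire(struct spinlock *l)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&l->locked,
	    &expected, 1, memory_order_acquire, memory_order_relaxed);
}

/* Spin until the lock is obtained. */
static void
spin_acquire(struct spinlock *l)
{
	while (!spin_try_acquire(l))
		;	/* busy-wait; a real lock would back off here */
}

/*
 * Release ordering: every store made while holding the lock is ordered
 * before the store that clears the lock word.  Another CPU that sees
 * the lock as free and acquires it therefore also sees those stores.
 * Nothing here forces the clearing store to become visible promptly.
 */
static void
spin_release(struct spinlock *l)
{
	assert(atomic_load_explicit(&l->locked, memory_order_relaxed) == 1);
	atomic_store_explicit(&l->locked, 0, memory_order_release);
}

/* Example critical section: update a counter protected by the lock. */
static void
locked_increment(struct spinlock *l, long *counter)
{
	spin_acquire(l);
	(*counter)++;
	spin_release(l);
}
```

A caller would run spin_init() once and then wrap each access to the
protected data in spin_acquire()/spin_release(), as locked_increment() does.
The release only orders the stores; it does not make them visible any
sooner, which is consistent with a waiter briefly spinning on a lock that
has already been released.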

I'm not sure whether the *fence instructions are quite that weak. They
might be, though. Note that each generation of ia32 processors seems to
have a weaker memory model than the previous generation.

-- 
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org