From: John Baldwin
To: freebsd-current@FreeBSD.org
Cc: Tim Robbins
Subject: Re: Atomic operations on i386/amd64
Date: Thu, 5 Aug 2004 17:59:53 -0400
Message-Id: <200408051759.53079.jhb@FreeBSD.org>
In-Reply-To: <20040805050422.GA41201@cat.robbins.dropbear.id.au>

On Thursday 05 August 2004 01:04 am, Tim Robbins wrote:
> Is there any particular reason why atomic_load_acq_*() and
> atomic_store_rel_*() are implemented with CMPXCHG and XCHG instead of
> MOV on i386/amd64 UP?

Actually, using mov instead of lock xchg for store_rel reduced performance in
some benchmarks Scott ran on an SMP machine, I'm guessing because a released
lock takes longer to become available to other CPUs.  I'm still waiting for
benchmark results on UP to see if the change should be made under #ifndef SMP
or some such.

> Also, could we use MFENCE/LFENCE/SFENCE in combination with MOV on
> SMP systems instead of LOCK CMPXCHG / (implied LOCK) XCHG?

MFENCE and LFENCE only exist on the P4.  SFENCE only exists on the PIII and
later, so using it would lose the ability to run on PIIs and earlier.  Also,
if you use more than SFENCE you lose PIIIs as well.  Note that amd64 could
probably be changed, though, since they might all have fences, in which case
that might be something to benchmark on both UP and SMP to see what kind of
difference it makes.

-- 
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org
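
For readers of the archive, here is a minimal sketch of the two store variants
being compared in this thread. It assumes GCC or Clang inline assembly on
i386/amd64, with a fallback to the compiler's __atomic builtins elsewhere. The
names store_rel_xchg, store_rel_mov and check_stores are hypothetical, chosen
for illustration; this is not the FreeBSD atomic_store_rel_*() code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not FreeBSD's <machine/atomic.h>. */
#if defined(__i386__) || defined(__x86_64__)

/* Store via XCHG. With a memory operand the CPU asserts LOCK implicitly. */
static inline void
store_rel_xchg(volatile uint32_t *p, uint32_t v)
{
	__asm__ __volatile__("xchgl %1, %0"
	    : "+m" (*p), "+r" (v)
	    :
	    : "memory");
}

/*
 * Store via plain MOV. The "memory" clobber only prevents the compiler
 * from moving other memory accesses across the store; no bus lock is taken.
 */
static inline void
store_rel_mov(volatile uint32_t *p, uint32_t v)
{
	__asm__ __volatile__("movl %1, %0"
	    : "=m" (*p)
	    : "r" (v)
	    : "memory");
}

#else

/* Assumed fallback for other architectures: let the compiler pick. */
static inline void
store_rel_xchg(volatile uint32_t *p, uint32_t v)
{
	(void)__atomic_exchange_n(p, v, __ATOMIC_SEQ_CST);
}

static inline void
store_rel_mov(volatile uint32_t *p, uint32_t v)
{
	__atomic_store_n(p, v, __ATOMIC_RELEASE);
}

#endif

/* Single-threaded sanity check: both variants leave the value in memory. */
static void
check_stores(void)
{
	volatile uint32_t flag = 0;

	store_rel_xchg(&flag, 1);
	assert(flag == 1);
	store_rel_mov(&flag, 2);
	assert(flag == 2);
}
```

Both variants store the same value; they differ only in cost and in how the
store is ordered against other memory traffic, which is why the choice has to
be settled by benchmarks like the ones mentioned above rather than by a
single-threaded check like this one.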