From: Marius Strobl <marius@FreeBSD.org>
Date: Sat, 28 Jan 2012 23:24:03 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org,
    svn-src-stable@freebsd.org, svn-src-stable-9@freebsd.org
Message-Id: <201201282324.q0SNO3Qa003822@svn.freebsd.org>
Subject: svn commit: r230673 - stable/9/sys/sparc64/include

Author: marius
Date: Sat Jan 28 23:24:03 2012
New Revision: 230673
URL: http://svn.freebsd.org/changeset/base/230673

Log:
  MFC: r225889, r228222

  In total store order, which we use for running the kernel and all of
  the userland, atomic operations behave as if they were followed by a
  CPU memory barrier, so there's no need to include one in the acquire
  variants of atomic(9); it's sufficient to just include compiler
  memory barriers to satisfy the requirements of atomic(9).  Removing
  the CPU memory barriers results in a small performance improvement,
  specifically one sufficient to compensate for the performance loss
  seen in the worldstone benchmark when using SCHED_ULE instead of
  SCHED_4BSD.  This change is inspired by Linux doing the equivalent
  thing, even more radically, some time ago.

  Thanks go to Peter Jeremy for additional testing.

Modified:
  stable/9/sys/sparc64/include/atomic.h

Directory Properties:
  stable/9/sys/   (props changed)
  stable/9/sys/amd64/include/xen/   (props changed)
  stable/9/sys/boot/   (props changed)
  stable/9/sys/boot/i386/efi/   (props changed)
  stable/9/sys/boot/ia64/efi/   (props changed)
  stable/9/sys/boot/ia64/ski/   (props changed)
  stable/9/sys/boot/powerpc/boot1.chrp/   (props changed)
  stable/9/sys/boot/powerpc/ofw/   (props changed)
  stable/9/sys/cddl/contrib/opensolaris/   (props changed)
  stable/9/sys/conf/   (props changed)
  stable/9/sys/contrib/dev/acpica/   (props changed)
  stable/9/sys/contrib/octeon-sdk/   (props changed)
  stable/9/sys/contrib/pf/   (props changed)
  stable/9/sys/contrib/x86emu/   (props changed)
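For readers without the sparc64 headers at hand, a minimal sketch of what
the change amounts to follows.  It is not part of the commit; membar() is
modeled on sys/sparc64/include/cpufunc.h, and the mask constants assume
the SPARC V9 mmask bit encoding:

	/* Mask bits, assumed from the SPARC V9 mmask encoding. */
	#define	LoadLoad	0x01
	#define	StoreLoad	0x02
	#define	LoadStore	0x04
	#define	StoreStore	0x08

	/* CPU barrier: emits a real membar instruction. */
	#define	membar(mask) \
		__asm __volatile("membar %0" : : "n" (mask) : "memory")

	/* Before: a CPU barrier after the atomic operation. */
	membar(LoadLoad | LoadStore);

	/*
	 * After: an empty asm with a "memory" clobber.  No instruction
	 * is emitted; it only stops the compiler from reordering memory
	 * accesses across it.  Under TSO the hardware already behaves
	 * as if the atomic were followed by a membar with a mask of
	 * #LoadLoad | #LoadStore | #StoreStore, so this is all that is
	 * needed for acquire semantics.
	 */
	__asm __volatile("" : : : "memory");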
Modified: stable/9/sys/sparc64/include/atomic.h
==============================================================================
--- stable/9/sys/sparc64/include/atomic.h	Sat Jan 28 23:18:02 2012	(r230672)
+++ stable/9/sys/sparc64/include/atomic.h	Sat Jan 28 23:24:03 2012	(r230673)
@@ -74,12 +74,16 @@
  *
  * the return value of cas is used to avoid the extra reload.
  *
- * The memory barriers provided by the acq and rel variants are intended
- * to be sufficient for use of relaxed memory ordering.  Due to the
- * suggested assembly syntax of the membar operands containing a #
- * character, they cannot be used in macros.  The cmask and mmask bits
+ * We only include a memory barrier in the rel variants as in total store
+ * order, which we use for running the kernel and all of the userland, atomic
+ * loads and stores behave as if they were followed by a membar with a mask
+ * of #LoadLoad | #LoadStore | #StoreStore.  In order to be also sufficient
+ * for use of relaxed memory ordering, the atomic_cas() in the acq variants
+ * additionally would have to be followed by a membar #LoadLoad | #LoadStore.
+ * Due to the suggested assembly syntax of the membar operands containing a
+ * # character, they cannot be used in macros.  The cmask and mmask bits thus
  * are hard coded in machine/cpufunc.h and used here through macros.
- * Hopefully sun will choose not to change the bit numbers.
+ * Hopefully the bit numbers won't change in the future.
  */

 #define	itype(sz)	uint ## sz ## _t

@@ -93,7 +97,7 @@
 #define	atomic_cas_acq(p, e, s, sz) ({					\
 	itype(sz) v;							\
 	v = atomic_cas(p, e, s, sz);					\
-	membar(LoadLoad | LoadStore);					\
+	__asm __volatile("" : : : "memory");				\
 	v;								\
 })

@@ -118,7 +122,7 @@
 #define	atomic_op_acq(p, op, v, sz) ({					\
 	itype(sz) t;							\
 	t = atomic_op(p, op, v, sz);					\
-	membar(LoadLoad | LoadStore);					\
+	__asm __volatile("" : : : "memory");				\
 	t;								\
 })

@@ -135,7 +139,7 @@
 #define	atomic_load_acq(p, sz) ({					\
 	itype(sz) v;							\
 	v = atomic_load(p, sz);						\
-	membar(LoadLoad | LoadStore);					\
+	__asm __volatile("" : : : "memory");				\
 	v;								\
 })
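As a usage note, again not part of the commit: the acq/rel variants
defined above back the public atomic(9) interface that lock
implementations rely on.  A hedged sketch of a simple spin lock follows,
where atomic_cmpset_acq_int() and atomic_store_rel_int() are the stock
atomic(9) functions while lock_acquire() and lock_release() are
hypothetical helpers for illustration:

	#include <sys/types.h>
	#include <machine/atomic.h>

	static volatile u_int lock;	/* 0 == free, 1 == held */

	static void
	lock_acquire(void)
	{

		/*
		 * The acq variant keeps the loads and stores of the
		 * critical section from being hoisted above the lock
		 * acquisition.  With this change that costs only a
		 * compiler barrier on sparc64, since under TSO the CAS
		 * already orders against the accesses that follow it.
		 */
		while (atomic_cmpset_acq_int(&lock, 0, 1) == 0)
			;	/* spin until the lock is free */
	}

	static void
	lock_release(void)
	{

		/*
		 * The rel variant keeps the critical section's loads and
		 * stores from drifting past the unlocking store; per the
		 * comment in the diff above, the rel variants still carry
		 * a CPU memory barrier on sparc64.
		 */
		atomic_store_rel_int(&lock, 0);
	}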