From: Alan Cox
Date: Fri, 2 Aug 2013 08:41:58 -0700
Subject: Re: svn commit: r253877 - in projects/atomic64/sys: amd64/include i386/include
To: Bruce Evans
Cc: svn-src-projects@FreeBSD.org, src-committers@FreeBSD.org, Jung-uk Kim

On Aug 2, 2013, at 6:51 AM, Bruce Evans wrote:

> On Fri, 2 Aug 2013, Jung-uk Kim wrote:
>
>> Log:
>>  Reimplement atomic operations on PDEs and PTEs in pmap.h.  This change
>>  significantly reduces duplicate code.  Also, it may improve and even correct
>>  some questionable implementations.
>
> Do they all (or any) need to be atomic with respect to multiple CPUs?
> It's hard to see how concurrent accesses to page tables can work
> without higher-level locking than is provided by atomic ops.
>

Some do, so that we do not lose a PG_M ("dirty") bit being set
concurrently by another processor.  However, none of these accesses
need to be labeled as acquires or releases.

>> Modified: projects/atomic64/sys/amd64/include/pmap.h
>> ==============================================================================
>> --- projects/atomic64/sys/amd64/include/pmap.h  Fri Aug  2 00:08:00 2013  (r253876)
>> +++ projects/atomic64/sys/amd64/include/pmap.h  Fri Aug  2 00:20:04 2013  (r253877)
>> @@ -185,41 +185,13 @@ extern u_int64_t KPML4phys; /* physical
>>  pt_entry_t *vtopte(vm_offset_t);
>>  #define vtophys(va)     pmap_kextract(((vm_offset_t) (va)))
>>
>> -static __inline pt_entry_t
>> -pte_load(pt_entry_t *ptep)
>> -{
>> -        pt_entry_t r;
>> -
>> -        r = *ptep;
>> -        return (r);
>> -}
>
> This function wasn't atomic with respect to multiple CPUs.  Except on
> i386 with PAE, but then it changes a 64-bit object on a 32-bit CPU,
> so it needs some locking just to be atomic with respect to a single CPU.
>
>> -static __inline pt_entry_t
>> -pte_load_store(pt_entry_t *ptep, pt_entry_t pte)
>> -{
>> -        pt_entry_t r;
>> -
>> -        __asm __volatile(
>> -            "xchgq %0,%1"
>> -            : "=m" (*ptep),
>> -              "=r" (r)
>> -            : "1" (pte),
>> -              "m" (*ptep));
>> -        return (r);
>> -}
>
> This was the main one that was atomic with respect to multiple CPUs on
> both amd64 and i386.  This seems to be accidental -- xchg to memory gives
> a lock prefix and slowness whether you want it or not.
>
>> -
>> -#define pte_load_clear(pte)     atomic_readandclear_long(pte)
>> -
>> -static __inline void
>> -pte_store(pt_entry_t *ptep, pt_entry_t pte)
>> -{
>> +#define pte_load(ptep)                  atomic_load_acq_long(ptep)
>> +#define pte_load_store(ptep, pte)       atomic_swap_long(ptep, pte)
>> +#define pte_load_clear(pte)             atomic_swap_long(pte, 0)
>> +#define pte_store(ptep, pte)            atomic_store_rel_long(ptep, pte)
>> +#define pte_clear(ptep)                 atomic_store_rel_long(ptep, 0)
>>
>> -        *ptep = pte;
>> -}
>
> pte_store() was also not atomic with respect to multiple CPUs.  So almost
> everything was not atomic with respect to multiple CPUs, except for PAE
> on i386.
>
> Bruce
>
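
To make the PG_M point concrete, here is a minimal user-space sketch in
C11.  It is not the pmap code: the pt_entry_t typedef, the helper names
(load_clear_racy, load_clear_swap, set_dirty) and the callback that
stands in for another processor are all assumptions made for
illustration.  It contrasts a separate load and store, which can drop a
dirty bit set in between, with a single atomic swap that uses relaxed
ordering, that is, atomic but labeled as neither an acquire nor a
release.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins; the real definitions live in the kernel headers. */
typedef uint64_t pt_entry_t;
#define PG_V 0x001ULL   /* entry is valid */
#define PG_M 0x040ULL   /* entry was modified ("dirty") */

/* Stands in for another processor setting the dirty bit on a write. */
static void
set_dirty(_Atomic pt_entry_t *ptep)
{
	atomic_fetch_or_explicit(ptep, PG_M, memory_order_relaxed);
}

/*
 * Racy read-then-clear: two separate accesses.  The optional callback
 * simulates a concurrent update landing between them; that update is
 * overwritten by the store and never shows up in the returned value.
 */
static pt_entry_t
load_clear_racy(_Atomic pt_entry_t *ptep,
    void (*between)(_Atomic pt_entry_t *))
{
	pt_entry_t old = atomic_load_explicit(ptep, memory_order_relaxed);

	if (between != NULL)
		between(ptep);
	atomic_store_explicit(ptep, 0, memory_order_relaxed);
	return (old);
}

/*
 * Atomic read-and-clear: one indivisible swap.  Any bit set before the
 * swap is returned to the caller.  Relaxed ordering requests atomicity
 * only, with no acquire or release semantics.
 */
static pt_entry_t
load_clear_swap(_Atomic pt_entry_t *ptep)
{
	return (atomic_exchange_explicit(ptep, 0, memory_order_relaxed));
}
```

With the entry starting as PG_V, calling load_clear_racy(&pte, set_dirty)
returns a value without PG_M even though the bit was set before the
entry was cleared, whereas load_clear_swap() on an entry that already
carries PG_M hands the bit back to the caller.  The C11 memory_order
arguments are used here only to express the distinction; how a given
compiler lowers the relaxed swap on x86 is not something this sketch
relies on.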