From owner-svn-src-projects@FreeBSD.ORG  Fri Aug  2 15:42:18 2013
Subject: Re: svn commit: r253877 - in projects/atomic64/sys: amd64/include
 i386/include
Mime-Version: 1.0 (Apple Message framework v1085)
Content-Type: text/plain; charset=us-ascii
From: Alan Cox <alc@rice.edu>
In-Reply-To: <20130802233616.D1711@besplex.bde.org>
Date: Fri, 2 Aug 2013 08:41:58 -0700
Content-Transfer-Encoding: quoted-printable
Message-Id: <ADCD3A87-EE8B-4B88-83EB-9CA3D00DA159@rice.edu>
References: <201308020020.r720K5Gu099845@svn.freebsd.org>
 <20130802233616.D1711@besplex.bde.org>
To: Bruce Evans <brde@optusnet.com.au>
X-Mailer: Apple Mail (2.1085)
Cc: svn-src-projects@FreeBSD.org, src-committers@FreeBSD.org,
 Jung-uk Kim <jkim@FreeBSD.org>


On Aug 2, 2013, at 6:51 AM, Bruce Evans wrote:

> On Fri, 2 Aug 2013, Jung-uk Kim wrote:
>
>> Log:
>> Reimplement atomic operations on PDEs and PTEs in pmap.h.  This change
>> significantly reduces duplicate code.  Also, it may improve and even correct
>> some questionable implementations.
>
> Do they all (or any) need to be atomic with respect to multiple CPUs?
> It's hard to see how concurrent accesses to page tables can work
> without higher-level locking than is provided by atomic ops.
>

Some do, so that we do not lose a PG_M ("dirty") bit being set
concurrently by another processor.  However, none of these accesses need
to be labeled as acquires or releases.
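
To illustrate the lost-update concern, here is a minimal sketch in portable
C11 (<stdatomic.h>), not the FreeBSD atomic(9) API that the commit uses.  The
names pte_clear_racy(), hw_set_dirty(), pte_load_clear_relaxed() and
dirty_bit_demo() are hypothetical, and the "other processor" is simulated by
an ordinary function call so that the outcome is deterministic.  It only
shows why the read and the clear have to be one atomic step, and that
relaxed ordering is sufficient for that purpose.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define	PG_V	0x001ULL	/* valid bit of an x86 PTE */
#define	PG_M	0x040ULL	/* modified ("dirty") bit of an x86 PTE */

typedef _Atomic uint64_t pte_t;	/* illustrative stand-in for pt_entry_t */

/* Stands in for another CPU setting PG_M on a write through the mapping. */
static void
hw_set_dirty(pte_t *ptep)
{

	atomic_fetch_or_explicit(ptep, PG_M, memory_order_relaxed);
}

/*
 * Separate load and store.  An update that lands between the two steps
 * is overwritten and never seen by the caller.
 */
static uint64_t
pte_clear_racy(pte_t *ptep)
{
	uint64_t old;

	old = atomic_load_explicit(ptep, memory_order_relaxed);
	hw_set_dirty(ptep);		/* simulated concurrent update */
	atomic_store_explicit(ptep, 0, memory_order_relaxed);
	return (old);
}

/*
 * Single atomic exchange.  Any PG_M set before the exchange is returned
 * to the caller; no acquire or release ordering is requested.
 */
static uint64_t
pte_load_clear_relaxed(pte_t *ptep)
{

	return (atomic_exchange_explicit(ptep, 0, memory_order_relaxed));
}

/* Returns 1 if the racy path loses PG_M and the exchange path keeps it. */
static int
dirty_bit_demo(void)
{
	pte_t a = PG_V, b = PG_V;
	uint64_t racy, safe;

	racy = pte_clear_racy(&a);
	hw_set_dirty(&b);
	safe = pte_load_clear_relaxed(&b);
	return ((racy & PG_M) == 0 && (safe & PG_M) != 0);
}
```

Calling dirty_bit_demo() exercises both paths: the racy variant returns the
old entry without PG_M even though the bit was set before the entry was
zeroed, while the exchange variant reports it.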

>> Modified: projects/atomic64/sys/amd64/include/pmap.h
>> ==============================================================================
>> --- projects/atomic64/sys/amd64/include/pmap.h	Fri Aug  2 00:08:00 2013	(r253876)
>> +++ projects/atomic64/sys/amd64/include/pmap.h	Fri Aug  2 00:20:04 2013	(r253877)
>> @@ -185,41 +185,13 @@ extern u_int64_t KPML4phys;	/* physical
>> pt_entry_t *vtopte(vm_offset_t);
>> #define	vtophys(va)	pmap_kextract(((vm_offset_t) (va)))
>>
>> -static __inline pt_entry_t
>> -pte_load(pt_entry_t *ptep)
>> -{
>> -	pt_entry_t r;
>> -
>> -	r = *ptep;
>> -	return (r);
>> -}
>
> This function wasn't atomic with respect to multiple CPUs.  Except on
> i386 with PAE, but then it changes a 64-bit object on a 32-bit CPU,
> so it needs some locking just to be atomic with respect to a single CPU.
>
>> -static __inline pt_entry_t
>> -pte_load_store(pt_entry_t *ptep, pt_entry_t pte)
>> -{
>> -	pt_entry_t r;
>> -
>> -	__asm __volatile(
>> -	    "xchgq %0,%1"
>> -	    : "=m" (*ptep),
>> -	      "=r" (r)
>> -	    : "1" (pte),
>> -	      "m" (*ptep));
>> -	return (r);
>> -}
>
> This was the main one that was atomic with respect to multiple CPUs on
> both amd64 and i386.  This seems to be accidental -- xchg to memory gives
> a lock prefix and slowness whether you want it or not.
>
>> -
>> -#define	pte_load_clear(pte)	atomic_readandclear_long(pte)
>> -
>> -static __inline void
>> -pte_store(pt_entry_t *ptep, pt_entry_t pte)
>> -{
>> +#define	pte_load(ptep)			atomic_load_acq_long(ptep)
>> +#define	pte_load_store(ptep, pte)	atomic_swap_long(ptep, pte)
>> +#define	pte_load_clear(pte)		atomic_swap_long(pte, 0)
>> +#define	pte_store(ptep, pte)		atomic_store_rel_long(ptep, pte)
>> +#define	pte_clear(ptep)			atomic_store_rel_long(ptep, 0)
>>
>> -	*ptep =3D pte;
>> -}
>
> pte_store() was also not atomic with respect to multiple CPUs.  So almost
> everything was not atomic with respect to multiple CPUs, except for PAE
> on i386.
>
> Bruce
>