Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 2 Aug 2013 08:41:58 -0700
From:      Alan Cox <alc@rice.edu>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        svn-src-projects@FreeBSD.org, src-committers@FreeBSD.org, Jung-uk Kim <jkim@FreeBSD.org>
Subject:   Re: svn commit: r253877 - in projects/atomic64/sys: amd64/include i386/include
Message-ID:  <ADCD3A87-EE8B-4B88-83EB-9CA3D00DA159@rice.edu>
In-Reply-To: <20130802233616.D1711@besplex.bde.org>
References:  <201308020020.r720K5Gu099845@svn.freebsd.org> <20130802233616.D1711@besplex.bde.org>

next in thread | previous in thread | raw e-mail | index | archive | help

On Aug 2, 2013, at 6:51 AM, Bruce Evans wrote:

> On Fri, 2 Aug 2013, Jung-uk Kim wrote:
>=20
>> Log:
>> Reimplement atomic operations on PDEs and PTEs in pmap.h.  This =
change
>> significantly reduces duplicate code.  Also, it may improve and even =
correct
>> some questionable implementations.
>=20
> Do they all (or any) need to be atomic with respect to multiple CPUs?
> It's hard to see how concurrent accesses to page tables can work worh
> without higher-level locking than is provided by atomic ops.
>=20

Some do, so that we do not lose a PG_M ("dirty") bit being set =
concurrently by another processor.  However. none of these accesses need =
to be labeled as acquires or releases.
=20
>> Modified: projects/atomic64/sys/amd64/include/pmap.h
>> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D
>> --- projects/atomic64/sys/amd64/include/pmap.h	Fri Aug  2 =
00:08:00 2013	(r253876)
>> +++ projects/atomic64/sys/amd64/include/pmap.h	Fri Aug  2 =
00:20:04 2013	(r253877)
>> @@ -185,41 +185,13 @@ extern u_int64_t KPML4phys;	/* physical
>> pt_entry_t *vtopte(vm_offset_t);
>> #define	vtophys(va)	pmap_kextract(((vm_offset_t) (va)))
>>=20
>> -static __inline pt_entry_t
>> -pte_load(pt_entry_t *ptep)
>> -{
>> -	pt_entry_t r;
>> -
>> -	r =3D *ptep;
>> -	return (r);
>> -}
>=20
> This function wasn't atomic with respect to multiple CPUs.  Except on
> i386 with PAE, but then it changes a 64-bit object on a 32-bit CPU,
> so it needs some locking just to be atomic with respect to a single =
CPU.
>=20
>> -static __inline pt_entry_t
>> -pte_load_store(pt_entry_t *ptep, pt_entry_t pte)
>> -{
>> -	pt_entry_t r;
>> -
>> -	__asm __volatile(
>> -	    "xchgq %0,%1"
>> -	    : "=3Dm" (*ptep),
>> -	      "=3Dr" (r)
>> -	    : "1" (pte),
>> -	      "m" (*ptep));
>> -	return (r);
>> -}
>=20
> This was the main one that was atomic with respect to multiple CPUs on
> both amd64 and i386.  This seems to be accidental -- xchg to memory =
gives
> a lock prefix and slowness whether you want it or not.
>=20
>> -
>> -#define	pte_load_clear(pte)	atomic_readandclear_long(pte)
>> -
>> -static __inline void
>> -pte_store(pt_entry_t *ptep, pt_entry_t pte)
>> -{
>> +#define	pte_load(ptep)			=
atomic_load_acq_long(ptep)
>> +#define	pte_load_store(ptep, pte)	atomic_swap_long(ptep, =
pte)
>> +#define	pte_load_clear(pte)		atomic_swap_long(pte, 0)
>> +#define	pte_store(ptep, pte)		=
atomic_store_rel_long(ptep, pte)
>> +#define	pte_clear(ptep)			=
atomic_store_rel_long(ptep, 0)
>>=20
>> -	*ptep =3D pte;
>> -}
>=20
> pte_store() was also not atomic with respect to multiple CPUs.  So =
almost
> everything was not atomic with respect to multiple CPUs, except for =
PAE
> on i386.
>=20
> Bruce
>=20




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?ADCD3A87-EE8B-4B88-83EB-9CA3D00DA159>