Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 14 Feb 2018 00:31:45 +0000 (UTC)
From:      Konstantin Belousov <kib@FreeBSD.org>
To:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   svn commit: r329254 - head/sys/vm
Message-ID:  <201802140031.w1E0Vj4w009349@repo.freebsd.org>

next in thread | raw e-mail | index | archive | help
Author: kib
Date: Wed Feb 14 00:31:45 2018
New Revision: 329254
URL: https://svnweb.freebsd.org/changeset/base/329254

Log:
  Ensure memory consistency on COW.
  
  From the submitter description:
  The process is forked transitioning a map entry to COW
  Thread A writes to a page on the map entry, faults, updates the pmap to
    writable at a new phys addr, and starts TLB invalidations...
  Thread B acquires a lock, writes to a location on the new phys addr, and
    releases the lock
  Thread C acquires the lock, reads from the location on the old phys addr...
  Thread A ...continues the TLB invalidations which are completed
  Thread C ...reads from the location on the new phys addr, and releases
    the lock
  
  In this example Thread B and C [lock, use and unlock] properly and
  neither own the lock at the same time.  Thread A was writing somewhere
  else on the page and so never had/needed the lock. Thread C sees a
  location that is only ever read|modified under a lock change beneath
  it while it is the lock owner.
  
  To fix this, perform the two-stage update of the copied PTE.  First,
  the PTE is updated with the address of the new physical page with
  copied content, but in read-only mode.  The pmap locking and the page
  busy state during PTE update and TLB invalidation IPIs ensure that any
  writer to the page cannot upgrade the PTE to the writable state until
  all CPUs updated their TLB to not cache old mapping.  Then, after the
  busy state of the page is lifted, the faults for write can proceed and
  do not violate the consistency of the reads.
  
  The change is done in vm_fault because most architectures do need IPIs
  to invalidate remote TLBs.  More, I think that hardware guarantees of
  atomicity of the remote TLB invalidation are not enough to prevent the
  inconsistent reads of non-atomic reads, like multi-word accesses
  protected by a lock.  So instead of modifying each pmap invalidation
  code, I did it there.
  
  Discovered and analyzed by: Elliott.Rabe@dell.com
  Reviewed by:	markj
  PR:	225584 (appeared to have the same cause)
  Tested by:	Elliott.Rabe@dell.com, emaste, Mike Tancsa <mike@sentex.net>, truckman
  Sponsored by:	The FreeBSD Foundation
  MFC after:	1 week
  Differential revision:	https://reviews.freebsd.org/D14347

Modified:
  head/sys/vm/vm_fault.c

Modified: head/sys/vm/vm_fault.c
==============================================================================
--- head/sys/vm/vm_fault.c	Wed Feb 14 00:31:37 2018	(r329253)
+++ head/sys/vm/vm_fault.c	Wed Feb 14 00:31:45 2018	(r329254)
@@ -1135,6 +1135,10 @@ readrest:
 				 */
 				pmap_copy_page(fs.m, fs.first_m);
 				fs.first_m->valid = VM_PAGE_BITS_ALL;
+				if ((fault_flags & VM_FAULT_WIRE) == 0) {
+					prot &= ~VM_PROT_WRITE;
+					fault_type &= ~VM_PROT_WRITE;
+				}
 				if (wired && (fault_flags &
 				    VM_FAULT_WIRE) == 0) {
 					vm_page_lock(fs.first_m);



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201802140031.w1E0Vj4w009349>