From owner-svn-src-all@freebsd.org Wed Feb 14 00:31:46 2018 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8D784F0FE2A; Wed, 14 Feb 2018 00:31:46 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 210727ACAE; Wed, 14 Feb 2018 00:31:46 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id E277A2EE; Wed, 14 Feb 2018 00:31:45 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w1E0VjfR009350; Wed, 14 Feb 2018 00:31:45 GMT (envelope-from kib@FreeBSD.org) Received: (from kib@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w1E0Vj4w009349; Wed, 14 Feb 2018 00:31:45 GMT (envelope-from kib@FreeBSD.org) Message-Id: <201802140031.w1E0Vj4w009349@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kib set sender to kib@FreeBSD.org using -f From: Konstantin Belousov Date: Wed, 14 Feb 2018 00:31:45 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r329254 - head/sys/vm X-SVN-Group: head X-SVN-Commit-Author: kib X-SVN-Commit-Paths: head/sys/vm X-SVN-Commit-Revision: 329254 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Feb 2018 00:31:46 -0000 Author: kib Date: Wed Feb 14 00:31:45 2018 New Revision: 329254 URL: https://svnweb.freebsd.org/changeset/base/329254 Log: Ensure memory consistency on COW. From the submitter description: The process is forked transitioning a map entry to COW Thread A writes to a page on the map entry, faults, updates the pmap to writable at a new phys addr, and starts TLB invalidations... Thread B acquires a lock, writes to a location on the new phys addr, and releases the lock Thread C acquires the lock, reads from the location on the old phys addr... Thread A ...continues the TLB invalidations which are completed Thread C ...reads from the location on the new phys addr, and releases the lock In this example Thread B and C [lock, use and unlock] properly and neither own the lock at the same time. Thread A was writing somewhere else on the page and so never had/needed the lock. Thread C sees a location that is only ever read|modified under a lock change beneath it while it is the lock owner. To fix this, perform the two-stage update of the copied PTE. First, the PTE is updated with the address of the new physical page with copied content, but in read-only mode. The pmap locking and the page busy state during PTE update and TLB invalidation IPIs ensure that any writer to the page cannot upgrade the PTE to the writable state until all CPUs updated their TLB to not cache old mapping. Then, after the busy state of the page is lifted, the faults for write can proceed and do not violate the consistency of the reads. The change is done in vm_fault because most architectures do need IPIs to invalidate remote TLBs. More, I think that hardware guarantees of atomicity of the remote TLB invalidation are not enough to prevent the inconsistent reads of non-atomic reads, like multi-word accesses protected by a lock. So instead of modifying each pmap invalidation code, I did it there. Discovered and analyzed by: Elliott.Rabe@dell.com Reviewed by: markj PR: 225584 (appeared to have the same cause) Tested by: Elliott.Rabe@dell.com, emaste, Mike Tancsa , truckman Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D14347 Modified: head/sys/vm/vm_fault.c Modified: head/sys/vm/vm_fault.c ============================================================================== --- head/sys/vm/vm_fault.c Wed Feb 14 00:31:37 2018 (r329253) +++ head/sys/vm/vm_fault.c Wed Feb 14 00:31:45 2018 (r329254) @@ -1135,6 +1135,10 @@ readrest: */ pmap_copy_page(fs.m, fs.first_m); fs.first_m->valid = VM_PAGE_BITS_ALL; + if ((fault_flags & VM_FAULT_WIRE) == 0) { + prot &= ~VM_PROT_WRITE; + fault_type &= ~VM_PROT_WRITE; + } if (wired && (fault_flags & VM_FAULT_WIRE) == 0) { vm_page_lock(fs.first_m);