Date: Mon, 12 Dec 2022 18:09:09 GMT
Message-Id: <202212121809.2BCI99tp073665@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org,
    dev-commits-src-main@FreeBSD.org
From: Alan Cox
Subject: git: f0878da03b37 - main - pmap: standardize promotion conditions between amd64 and arm64
List-Id: Commit messages for the main branch of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main

The branch main has been updated by alc:

URL: https://cgit.FreeBSD.org/src/commit/?id=f0878da03b374e3fa3578b363f02bfd50ac0e5bd

commit f0878da03b374e3fa3578b363f02bfd50ac0e5bd
Author:     Alan Cox
AuthorDate: 2022-10-08 07:20:25 +0000
Commit:     Alan Cox
CommitDate: 2022-12-12 17:32:50 +0000

    pmap: standardize promotion conditions between amd64 and arm64

    On amd64, don't abort promotion due to a missing accessed bit in a
    mapping before possibly write protecting that mapping. Previously,
    in some cases, we might not repromote after madvise(MADV_FREE) because
    there was no write fault to trigger the repromotion. Conversely, on
    arm64, don't pointlessly, yet harmlessly, write protect physical pages
    that aren't part of the physical superpage.

    Don't count aborted promotions due to explicit promotion prohibition
    (arm64) or hardware errata (amd64) as ordinary promotion failures.

    Reviewed by:    kib, markj
    MFC after:      2 weeks
    Differential Revision:  https://reviews.freebsd.org/D36916
---
 sys/amd64/amd64/pmap.c | 37 ++++++++++++++++++++++++++++++-------
 sys/arm64/arm64/pmap.c | 50 ++++++++++++++++++++++++++++++++++++++++++++------
 2 files changed, 74 insertions(+), 13 deletions(-)

diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
index eb8980ae4fed..a44993efb409 100644
--- a/sys/amd64/amd64/pmap.c
+++ b/sys/amd64/amd64/pmap.c
@@ -6771,19 +6771,36 @@ pmap_promote_pde(pmap_t pmap, pd_entry_t *pde, vm_offset_t va, vm_page_t mpte,
 
 	/*
 	 * Examine the first PTE in the specified PTP. Abort if this PTE is
-	 * either invalid, unused, or does not map the first 4KB physical page
-	 * within a 2MB page.
+	 * ineligible for promotion due to hardware errata, invalid, or does
+	 * not map the first 4KB physical page within a 2MB page.
 	 */
 	firstpte = (pt_entry_t *)PHYS_TO_DMAP(*pde & PG_FRAME);
 	newpde = *firstpte;
-	if ((newpde & ((PG_FRAME & PDRMASK) | PG_A | PG_V)) != (PG_A | PG_V) ||
-	    !pmap_allow_2m_x_page(pmap, pmap_pde_ept_executable(pmap,
-	    newpde))) {
+	if (!pmap_allow_2m_x_page(pmap, pmap_pde_ept_executable(pmap, newpde)))
+		return;
+	if ((newpde & ((PG_FRAME & PDRMASK) | PG_V)) != PG_V) {
 		counter_u64_add(pmap_pde_p_failures, 1);
 		CTR2(KTR_PMAP, "pmap_promote_pde: failure for va %#lx"
 		    " in pmap %p", va, pmap);
 		return;
 	}
+
+	/*
+	 * Both here and in the below "for" loop, to allow for repromotion
+	 * after MADV_FREE, conditionally write protect a clean PTE before
+	 * possibly aborting the promotion due to other PTE attributes. Why?
+	 * Suppose that MADV_FREE is applied to a part of a superpage, the
+	 * address range [S, E). pmap_advise() will demote the superpage
+	 * mapping, destroy the 4KB page mapping at the end of [S, E), and
+	 * clear PG_M and PG_A in the PTEs for the rest of [S, E). Later,
+	 * imagine that the memory in [S, E) is recycled, but the last 4KB
+	 * page in [S, E) is not the last to be rewritten, or simply accessed.
+	 * In other words, there is still a 4KB page in [S, E), call it P,
+	 * that is writeable but PG_M and PG_A are clear in P's PTE. Unless
+	 * we write protect P before aborting the promotion, if and when P is
+	 * finally rewritten, there won't be a page fault to trigger
+	 * repromotion.
+	 */
 setpde:
 	if ((newpde & (PG_M | PG_RW)) == PG_RW) {
 		/*
@@ -6794,16 +6811,22 @@ setpde:
 			goto setpde;
 		newpde &= ~PG_RW;
 	}
+	if ((newpde & PG_A) == 0) {
+		counter_u64_add(pmap_pde_p_failures, 1);
+		CTR2(KTR_PMAP, "pmap_promote_pde: failure for va %#lx"
+		    " in pmap %p", va, pmap);
+		return;
+	}
 
 	/*
 	 * Examine each of the other PTEs in the specified PTP. Abort if this
 	 * PTE maps an unexpected 4KB physical page or does not have identical
 	 * characteristics to the first PTE.
 	 */
-	pa = (newpde & (PG_PS_FRAME | PG_A | PG_V)) + NBPDR - PAGE_SIZE;
+	pa = (newpde & (PG_PS_FRAME | PG_V)) + NBPDR - PAGE_SIZE;
 	for (pte = firstpte + NPTEPG - 1; pte > firstpte; pte--) {
 		oldpte = *pte;
-		if ((oldpte & (PG_FRAME | PG_A | PG_V)) != pa) {
+		if ((oldpte & (PG_FRAME | PG_V)) != pa) {
 			counter_u64_add(pmap_pde_p_failures, 1);
 			CTR2(KTR_PMAP, "pmap_promote_pde: failure for va %#lx"
 			    " in pmap %p", va, pmap);
diff --git a/sys/arm64/arm64/pmap.c b/sys/arm64/arm64/pmap.c
index 3f4665921631..7e2a423025ec 100644
--- a/sys/arm64/arm64/pmap.c
+++ b/sys/arm64/arm64/pmap.c
@@ -3955,17 +3955,38 @@ pmap_promote_l2(pmap_t pmap, pd_entry_t *l2, vm_offset_t va, vm_page_t mpte,
 	PMAP_LOCK_ASSERT(pmap, MA_OWNED);
 	PMAP_ASSERT_STAGE1(pmap);
 
+	/*
+	 * Examine the first L3E in the specified PTP. Abort if this L3E is
+	 * ineligible for promotion, invalid, or does not map the first 4KB
+	 * physical page within a 2MB page.
+	 */
 	firstl3 = (pt_entry_t *)PHYS_TO_DMAP(pmap_load(l2) & ~ATTR_MASK);
 	newl2 = pmap_load(firstl3);
-
-	if (((newl2 & (~ATTR_MASK | ATTR_AF)) & L2_OFFSET) != ATTR_AF ||
-	    (newl2 & ATTR_SW_NO_PROMOTE) != 0) {
+	if ((newl2 & ATTR_SW_NO_PROMOTE) != 0)
+		return;
+	if ((newl2 & ((~ATTR_MASK & L2_OFFSET) | ATTR_DESCR_MASK)) != L3_PAGE) {
 		atomic_add_long(&pmap_l2_p_failures, 1);
 		CTR2(KTR_PMAP, "pmap_promote_l2: failure for va %#lx"
 		    " in pmap %p", va, pmap);
 		return;
 	}
 
+	/*
+	 * Both here and in the below "for" loop, to allow for repromotion
+	 * after MADV_FREE, conditionally write protect a clean L3E before
+	 * possibly aborting the promotion due to other L3E attributes. Why?
+	 * Suppose that MADV_FREE is applied to a part of a superpage, the
+	 * address range [S, E). pmap_advise() will demote the superpage
+	 * mapping, destroy the 4KB page mapping at the end of [S, E), and
+	 * set AP_RO and clear AF in the L3Es for the rest of [S, E). Later,
+	 * imagine that the memory in [S, E) is recycled, but the last 4KB
+	 * page in [S, E) is not the last to be rewritten, or simply accessed.
+	 * In other words, there is still a 4KB page in [S, E), call it P,
+	 * that is writeable but AP_RO is set and AF is clear in P's L3E.
+	 * Unless we write protect P before aborting the promotion, if and
+	 * when P is finally rewritten, there won't be a page fault to trigger
+	 * repromotion.
+	 */
 setl2:
 	if ((newl2 & (ATTR_S1_AP_RW_BIT | ATTR_SW_DBM)) ==
 	    (ATTR_S1_AP(ATTR_S1_AP_RO) | ATTR_SW_DBM)) {
@@ -3977,10 +3998,27 @@ setl2:
 			goto setl2;
 		newl2 &= ~ATTR_SW_DBM;
 	}
+	if ((newl2 & ATTR_AF) == 0) {
+		atomic_add_long(&pmap_l2_p_failures, 1);
+		CTR2(KTR_PMAP, "pmap_promote_l2: failure for va %#lx"
+		    " in pmap %p", va, pmap);
+		return;
+	}
 
-	pa = newl2 + L2_SIZE - PAGE_SIZE;
+	/*
+	 * Examine each of the other L3Es in the specified PTP. Abort if this
+	 * L3E maps an unexpected 4KB physical page or does not have identical
+	 * characteristics to the first L3E.
+	 */
+	pa = (newl2 & (~ATTR_MASK | ATTR_DESCR_MASK)) + L2_SIZE - PAGE_SIZE;
 	for (l3 = firstl3 + NL3PG - 1; l3 > firstl3; l3--) {
 		oldl3 = pmap_load(l3);
+		if ((oldl3 & (~ATTR_MASK | ATTR_DESCR_MASK)) != pa) {
+			atomic_add_long(&pmap_l2_p_failures, 1);
+			CTR2(KTR_PMAP, "pmap_promote_l2: failure for va %#lx"
+			    " in pmap %p", va, pmap);
+			return;
+		}
 setl3:
 		if ((oldl3 & (ATTR_S1_AP_RW_BIT | ATTR_SW_DBM)) ==
 		    (ATTR_S1_AP(ATTR_S1_AP_RO) | ATTR_SW_DBM)) {
@@ -3994,7 +4032,7 @@ setl3:
 				goto setl3;
 			oldl3 &= ~ATTR_SW_DBM;
 		}
-		if (oldl3 != pa) {
+		if ((oldl3 & ATTR_MASK) != (newl2 & ATTR_MASK)) {
 			atomic_add_long(&pmap_l2_p_failures, 1);
 			CTR2(KTR_PMAP, "pmap_promote_l2: failure for va %#lx"
 			    " in pmap %p", va, pmap);
@@ -4033,7 +4071,7 @@ setl3:
 
 	atomic_add_long(&pmap_l2_promotions, 1);
 	CTR2(KTR_PMAP, "pmap_promote_l2: success for va %#lx in pmap %p", va,
-		    pmap);
+	    pmap);
 }
 #endif /* VM_NRESERVLEVEL > 0 */
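
Illustrative note (not part of the commit): the ordering argument in the
MADV_FREE comment can be modeled in a few lines of userland C. This is only
a sketch under loud assumptions. The MODEL_* flag bits and both
model_promote_*() functions are hypothetical names invented for this note;
they do not match the real amd64 or arm64 PTE layouts, and the model leaves
out the physical address and attribute-equality checks that the real
pmap_promote_pde() and pmap_promote_l2() perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for this model only; not a real PTE layout. */
#define	MODEL_V		0x1u	/* valid */
#define	MODEL_RW	0x2u	/* writeable */
#define	MODEL_A		0x4u	/* accessed */
#define	MODEL_M		0x8u	/* modified */

/*
 * Old ordering (simplified): abort on a missing accessed bit before the
 * clean-but-writeable entry has been write protected.
 */
static bool
model_promote_old(uint32_t *pte, int n)
{
	assert(n > 0);
	for (int i = 0; i < n; i++) {
		if ((pte[i] & (MODEL_A | MODEL_V)) != (MODEL_A | MODEL_V))
			return (false);
		if ((pte[i] & (MODEL_M | MODEL_RW)) == MODEL_RW)
			pte[i] &= ~MODEL_RW;
	}
	return (true);
}

/*
 * New ordering (simplified): check validity, write protect a clean
 * entry, and only then abort on a missing accessed bit.
 */
static bool
model_promote_new(uint32_t *pte, int n)
{
	assert(n > 0);
	for (int i = 0; i < n; i++) {
		if ((pte[i] & MODEL_V) == 0)
			return (false);
		if ((pte[i] & (MODEL_M | MODEL_RW)) == MODEL_RW)
			pte[i] &= ~MODEL_RW;
		if ((pte[i] & MODEL_A) == 0)
			return (false);
	}
	return (true);
}
```

Take an array in which one entry, standing in for page P, is valid and
writeable but has neither the accessed nor the modified bit set. Both
functions refuse to promote it. The old ordering leaves P writeable, so a
later store to P would not fault. The new ordering leaves P write protected,
so that store faults and gives the fault handler a chance to retry the
promotion.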