Date: Fri, 27 Jun 2025 05:43:37 GMT
Message-Id: <202506270543.55R5hboW082991@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Konstantin Belousov
Subject: git: 4e1d69b9fbff - main - amd64: switch to la57 mode before creating kernel page tables
List-Id: Commit messages for the main branch of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main
Sender: owner-dev-commits-src-main@FreeBSD.org
X-Git-Committer: kib
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 4e1d69b9fbff280962e5ae5258624b60d5ab4618
Auto-Submitted: auto-generated

The branch main has been updated by kib:

URL: https://cgit.FreeBSD.org/src/commit/?id=4e1d69b9fbff280962e5ae5258624b60d5ab4618

commit 4e1d69b9fbff280962e5ae5258624b60d5ab4618
Author:     Konstantin Belousov
AuthorDate: 2025-06-23 23:20:56 +0000
Commit:     Konstantin Belousov
CommitDate: 2025-06-27 04:23:20 +0000

    amd64: switch to la57 mode before creating kernel page tables

    Reviewed by:    markj
    Sponsored by:   The FreeBSD Foundation
    MFC after:      1 week
    Differential revision:  https://reviews.freebsd.org/D51053
---
 sys/amd64/amd64/locore.S |   2 +
 sys/amd64/amd64/pmap.c   | 209 ++++++++++++++++-------------------------------
 2 files changed, 73 insertions(+), 138 deletions(-)

diff --git a/sys/amd64/amd64/locore.S b/sys/amd64/amd64/locore.S
index 29fbf38cea33..2be555b25160 100644
--- a/sys/amd64/amd64/locore.S
+++ b/sys/amd64/amd64/locore.S
@@ -119,6 +119,8 @@ ENTRY(la57_trampoline)
 	leaq	la57_trampoline_end(%rip),%rsp	/* priv stack */
 
 	movq	%cr0,%rbp
+	leaq	la57_trampoline_gdt(%rip),%rax
+	movq	%rax,la57_trampoline_gdt_desc+2(%rip)
 	lgdtq	la57_trampoline_gdt_desc(%rip)
 
 	pushq	$(2<<3)
diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
index 6d1c2d70d8c0..18bf2b4c92a1 100644
--- a/sys/amd64/amd64/pmap.c
+++ b/sys/amd64/amd64/pmap.c
@@ -1684,12 +1684,44 @@ bootaddr_rwx(vm_paddr_t pa)
 	return (pg_nx);
 }
 
+extern const char la57_trampoline[];
+
+static void
+pmap_bootstrap_la57(vm_paddr_t *firstaddr)
+{
+	void (*la57_tramp)(uint64_t pml5);
+	pml5_entry_t *pt;
+
+	if ((cpu_stdext_feature2 & CPUID_STDEXT2_LA57) == 0)
+		return;
+	la57 = 1;
+	TUNABLE_INT_FETCH("vm.pmap.la57", &la57);
+	if (!la57)
+		return;
+
+	KPML5phys = allocpages(firstaddr, 1);
+	KPML4phys = rcr3() & 0xfffff000; /* pml4 from loader must be < 4G */
+
+	pt = (pml5_entry_t *)KPML5phys;
+	pt[0] = KPML4phys | X86_PG_V | X86_PG_RW | X86_PG_A | X86_PG_M;
+	pt[NPML4EPG - 1] = KPML4phys | X86_PG_V | X86_PG_RW | X86_PG_A |
+	    X86_PG_M;
+
+	la57_tramp = (void (*)(uint64_t))((uintptr_t)la57_trampoline -
+	    KERNSTART + amd64_loadaddr());
+	printf("Calling la57 trampoline at %p, KPML5phys %#lx ...",
+	    la57_tramp, KPML5phys);
+	la57_tramp(KPML5phys);
+	printf(" alive in la57 mode\n");
+}
+
 static void
 create_pagetables(vm_paddr_t *firstaddr)
 {
 	pd_entry_t *pd_p;
 	pdp_entry_t *pdp_p;
 	pml4_entry_t *p4_p;
+	pml5_entry_t *p5_p;
 	uint64_t DMPDkernphys;
 	vm_paddr_t pax;
 #ifdef KASAN
@@ -1917,6 +1949,27 @@ create_pagetables(vm_paddr_t *firstaddr)
 	}
 
 	kernel_pml4 = (pml4_entry_t *)PHYS_TO_DMAP(KPML4phys);
+
+	if (la57) {
+		/* XXXKIB bootstrap KPML5phys page is lost */
+		KPML5phys = allocpages(firstaddr, 1);
+		for (i = 0, p5_p = (pml5_entry_t *)KPML5phys; i < NPML5EPG;
+		    i++) {
+			if (i == PML5PML5I) {
+				/*
+				 * Recursively map PML5 to itself in
+				 * order to get PTmap and PDmap.
+				 */
+				p5_p[i] = KPML5phys | X86_PG_RW | X86_PG_A |
+				    X86_PG_M | X86_PG_V | pg_nx;
+			} else if (i == pmap_pml5e_index(UPT_MAX_ADDRESS)) {
+				p5_p[i] = KPML4phys | X86_PG_RW | X86_PG_A |
+				    X86_PG_M | X86_PG_V;
+			} else {
+				p5_p[i] = 0;
+			}
+		}
+	}
 	TSEXIT();
 }
 
@@ -1950,6 +2003,7 @@ pmap_bootstrap(vm_paddr_t *firstaddr)
 	/*
 	 * Create an initial set of page tables to run the kernel in.
 	 */
+	pmap_bootstrap_la57(firstaddr);
 	create_pagetables(firstaddr);
 
 	pcpu0_phys = allocpages(firstaddr, 1);
@@ -1979,7 +2033,7 @@ pmap_bootstrap(vm_paddr_t *firstaddr)
 	cr4 = rcr4();
 	cr4 |= CR4_PGE;
 	load_cr4(cr4);
-	load_cr3(KPML4phys);
+	load_cr3(la57 ? KPML5phys : KPML4phys);
 	if (cpu_stdext_feature & CPUID_STDEXT_SMEP)
 		cr4 |= CR4_SMEP;
 	if (cpu_stdext_feature & CPUID_STDEXT_SMAP)
@@ -1992,8 +2046,20 @@ pmap_bootstrap(vm_paddr_t *firstaddr)
 	 * later unmapped (using pmap_remove()) and freed.
 	 */
 	PMAP_LOCK_INIT(kernel_pmap);
-	kernel_pmap->pm_pmltop = kernel_pml4;
-	kernel_pmap->pm_cr3 = KPML4phys;
+	if (la57) {
+		vtoptem = ((1ul << (NPTEPGSHIFT + NPDEPGSHIFT + NPDPEPGSHIFT +
+		    NPML4EPGSHIFT + NPML5EPGSHIFT)) - 1) << 3;
+		PTmap = (vm_offset_t)P5Tmap;
+		vtopdem = ((1ul << (NPDEPGSHIFT + NPDPEPGSHIFT +
+		    NPML4EPGSHIFT + NPML5EPGSHIFT)) - 1) << 3;
+		PDmap = (vm_offset_t)P5Dmap;
+		kernel_pmap->pm_pmltop = (void *)PHYS_TO_DMAP(KPML5phys);
+		kernel_pmap->pm_cr3 = KPML5phys;
+		pmap_pt_page_count_adj(kernel_pmap, 1);	/* top-level page */
+	} else {
+		kernel_pmap->pm_pmltop = kernel_pml4;
+		kernel_pmap->pm_cr3 = KPML4phys;
+	}
 	kernel_pmap->pm_ucr3 = PMAP_NO_CR3;
 	TAILQ_INIT(&kernel_pmap->pm_pvchunk);
 	kernel_pmap->pm_stats.resident_count = res;
@@ -2048,6 +2114,8 @@ pmap_bootstrap(vm_paddr_t *firstaddr)
 	/*
 	 * Re-initialize PCPU area for BSP after switching.
 	 * Make hardware use gdt and common_tss from the new PCPU.
+	 * Also clears the usage of temporary gdt during switch to
+	 * LA57 paging.
 	 */
 	STAILQ_INIT(&cpuhead);
 	wrmsr(MSR_GSBASE, (uint64_t)&__pcpu[0]);
@@ -2177,141 +2245,6 @@ pmap_page_alloc_below_4g(bool zeroed)
 	    1, 0, (1ULL << 32), PAGE_SIZE, 0, VM_MEMATTR_DEFAULT));
 }
 
-extern const char la57_trampoline[], la57_trampoline_gdt_desc[],
-    la57_trampoline_gdt[], la57_trampoline_end[];
-
-static void
-pmap_bootstrap_la57(void *arg __unused)
-{
-	char *v_code;
-	pml5_entry_t *v_pml5;
-	pml4_entry_t *v_pml4;
-	pdp_entry_t *v_pdp;
-	pd_entry_t *v_pd;
-	pt_entry_t *v_pt;
-	vm_page_t m_code, m_pml4, m_pdp, m_pd, m_pt, m_pml5;
-	void (*la57_tramp)(uint64_t pml5);
-	struct region_descriptor r_gdt;
-
-	if ((cpu_stdext_feature2 & CPUID_STDEXT2_LA57) == 0)
-		return;
-	la57 = 1;
-	TUNABLE_INT_FETCH("vm.pmap.la57", &la57);
-	if (!la57)
-		return;
-
-	r_gdt.rd_limit = NGDT * sizeof(struct user_segment_descriptor) - 1;
-	r_gdt.rd_base = (long)__pcpu[0].pc_gdt;
-
-	m_code = pmap_page_alloc_below_4g(true);
-	v_code = (char *)PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m_code));
-	m_pml5 = pmap_page_alloc_below_4g(true);
-	KPML5phys = VM_PAGE_TO_PHYS(m_pml5);
-	v_pml5 = (pml5_entry_t *)PHYS_TO_DMAP(KPML5phys);
-	m_pml4 = pmap_page_alloc_below_4g(true);
-	v_pml4 = (pdp_entry_t *)PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m_pml4));
-	m_pdp = pmap_page_alloc_below_4g(true);
-	v_pdp = (pdp_entry_t *)PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m_pdp));
-	m_pd = pmap_page_alloc_below_4g(true);
-	v_pd = (pdp_entry_t *)PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m_pd));
-	m_pt = pmap_page_alloc_below_4g(true);
-	v_pt = (pt_entry_t *)PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m_pt));
-
-	/*
-	 * Map m_code 1:1, it appears below 4G in KVA due to physical
-	 * address being below 4G. Since kernel KVA is in upper half,
-	 * the pml4e should be zero and free for temporary use.
-	 */
-	kernel_pmap->pm_pmltop[pmap_pml4e_index(VM_PAGE_TO_PHYS(m_code))] =
-	    VM_PAGE_TO_PHYS(m_pdp) | X86_PG_V | X86_PG_RW | X86_PG_A |
-	    X86_PG_M;
-	v_pdp[pmap_pdpe_index(VM_PAGE_TO_PHYS(m_code))] =
-	    VM_PAGE_TO_PHYS(m_pd) | X86_PG_V | X86_PG_RW | X86_PG_A |
-	    X86_PG_M;
-	v_pd[pmap_pde_index(VM_PAGE_TO_PHYS(m_code))] =
-	    VM_PAGE_TO_PHYS(m_pt) | X86_PG_V | X86_PG_RW | X86_PG_A |
-	    X86_PG_M;
-	v_pt[pmap_pte_index(VM_PAGE_TO_PHYS(m_code))] =
-	    VM_PAGE_TO_PHYS(m_code) | X86_PG_V | X86_PG_RW | X86_PG_A |
-	    X86_PG_M;
-
-	/*
-	 * Add pml5 entry at top of KVA pointing to existing pml4 table,
-	 * entering all existing kernel mappings into level 5 table.
-	 */
-	v_pml5[pmap_pml5e_index(UPT_MAX_ADDRESS)] = KPML4phys | X86_PG_V |
-	    X86_PG_RW | X86_PG_A | X86_PG_M;
-
-	/*
-	 * Add pml5 entry for 1:1 trampoline mapping after LA57 is turned on.
-	 */
-	v_pml5[pmap_pml5e_index(VM_PAGE_TO_PHYS(m_code))] =
-	    VM_PAGE_TO_PHYS(m_pml4) | X86_PG_V | X86_PG_RW | X86_PG_A |
-	    X86_PG_M;
-	v_pml4[pmap_pml4e_index(VM_PAGE_TO_PHYS(m_code))] =
-	    VM_PAGE_TO_PHYS(m_pdp) | X86_PG_V | X86_PG_RW | X86_PG_A |
-	    X86_PG_M;
-
-	/*
-	 * Copy and call the 48->57 trampoline, hope we return there, alive.
-	 */
-	bcopy(la57_trampoline, v_code, la57_trampoline_end - la57_trampoline);
-	*(u_long *)(v_code + 2 + (la57_trampoline_gdt_desc - la57_trampoline)) =
-	    la57_trampoline_gdt - la57_trampoline + VM_PAGE_TO_PHYS(m_code);
-	la57_tramp = (void (*)(uint64_t))VM_PAGE_TO_PHYS(m_code);
-	pmap_invalidate_all(kernel_pmap);
-	if (bootverbose) {
-		printf("entering LA57 trampoline at %#lx\n",
-		    (vm_offset_t)la57_tramp);
-	}
-	la57_tramp(KPML5phys);
-
-	/*
-	 * gdt was necessary reset, switch back to our gdt.
-	 */
-	lgdt(&r_gdt);
-	wrmsr(MSR_GSBASE, (uint64_t)&__pcpu[0]);
-	load_ds(_udatasel);
-	load_es(_udatasel);
-	load_fs(_ufssel);
-	ssdtosyssd(&gdt_segs[GPROC0_SEL],
-	    (struct system_segment_descriptor *)&__pcpu[0].pc_gdt[GPROC0_SEL]);
-	ltr(GSEL(GPROC0_SEL, SEL_KPL));
-	lidt(&r_idt);
-
-	if (bootverbose)
-		printf("LA57 trampoline returned, CR4 %#lx\n", rcr4());
-
-	/*
-	 * Now unmap the trampoline, and free the pages.
-	 * Clear pml5 entry used for 1:1 trampoline mapping.
-	 */
-	pte_clear(&v_pml5[pmap_pml5e_index(VM_PAGE_TO_PHYS(m_code))]);
-	invlpg((vm_offset_t)v_code);
-	vm_page_free(m_code);
-	vm_page_free(m_pdp);
-	vm_page_free(m_pd);
-	vm_page_free(m_pt);
-
-	/*
-	 * Recursively map PML5 to itself in order to get PTmap and
-	 * PDmap.
-	 */
-	v_pml5[PML5PML5I] = KPML5phys | X86_PG_RW | X86_PG_V | pg_nx;
-
-	vtoptem = ((1ul << (NPTEPGSHIFT + NPDEPGSHIFT + NPDPEPGSHIFT +
-	    NPML4EPGSHIFT + NPML5EPGSHIFT)) - 1) << 3;
-	PTmap = (vm_offset_t)P5Tmap;
-	vtopdem = ((1ul << (NPDEPGSHIFT + NPDPEPGSHIFT +
-	    NPML4EPGSHIFT + NPML5EPGSHIFT)) - 1) << 3;
-	PDmap = (vm_offset_t)P5Dmap;
-
-	kernel_pmap->pm_cr3 = KPML5phys;
-	kernel_pmap->pm_pmltop = v_pml5;
-	pmap_pt_page_count_adj(kernel_pmap, 1);
-}
-SYSINIT(la57, SI_SUB_KMEM, SI_ORDER_ANY, pmap_bootstrap_la57, NULL);
-
 /*
  * Initialize a vm_page's machine-dependent fields.
  */
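
Note on the vtoptem/vtopdem arithmetic in the la57 branch above: the following is a minimal standalone sketch, not code from the commit. It assumes that each of the NPTEPGSHIFT, NPDEPGSHIFT, NPDPEPGSHIFT, NPML4EPGSHIFT and NPML5EPGSHIFT constants equals 9 (512 entries per table) and that page-table entries are 8 bytes, which is what the trailing `<< 3` suggests. The names recursive_map_mask() and pml5_index() are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: every amd64 paging level indexes 512 entries (9 bits). */
#define LEVEL_SHIFT	9
/* Assumption: base page size is 4K (12 offset bits). */
#define PAGE_SHIFT_4K	12

/*
 * Hypothetical helper: mask of byte offsets into a recursive page-table
 * window that spans "levels" levels of 9-bit indices.  This mirrors the
 * ((1ul << sum_of_shifts) - 1) << 3 expressions in the diff, where the
 * shift by 3 scales an entry index to an 8-byte entry offset.
 */
static uint64_t
recursive_map_mask(unsigned levels)
{
	return (((UINT64_C(1) << (LEVEL_SHIFT * levels)) - 1) << 3);
}

/*
 * Hypothetical helper: top-level (PML5) slot selected by a virtual
 * address under 5-level paging, i.e. address bits 48..56.
 */
static unsigned
pml5_index(uint64_t va)
{
	return ((unsigned)((va >> (PAGE_SHIFT_4K + 4 * LEVEL_SHIFT)) & 0x1ff));
}
```

Under these assumptions, five levels correspond to the vtoptem expression and four levels to vtopdem. The last slot (511) is the one that covers the top of the address space, which is consistent with the bootstrap code filling pt[NPML4EPG - 1] alongside pt[0].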