From owner-svn-src-stable-11@freebsd.org Thu Mar 30 12:41:22 2017
Return-Path:
Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
	by mailman.ysv.freebsd.org (Postfix) with ESMTP id 591E8D2289D;
	Thu, 30 Mar 2017 12:41:22 +0000 (UTC)
	(envelope-from dexuan@FreeBSD.org)
Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client did not present a certificate)
	by mx1.freebsd.org (Postfix) with ESMTPS id 33BFC11D;
	Thu, 30 Mar 2017 12:41:22 +0000 (UTC)
	(envelope-from dexuan@FreeBSD.org)
Received: from repo.freebsd.org ([127.0.1.37])
	by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v2UCfL0j062335;
	Thu, 30 Mar 2017 12:41:21 GMT
	(envelope-from dexuan@FreeBSD.org)
Received: (from dexuan@localhost)
	by repo.freebsd.org (8.15.2/8.15.2/Submit) id v2UCfLKx062334;
	Thu, 30 Mar 2017 12:41:21 GMT
	(envelope-from dexuan@FreeBSD.org)
Message-Id: <201703301241.v2UCfLKx062334@repo.freebsd.org>
X-Authentication-Warning: repo.freebsd.org: dexuan set sender to dexuan@FreeBSD.org using -f
From: Dexuan Cui
Date: Thu, 30 Mar 2017 12:41:21 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org,
	svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org
Subject: svn commit: r316272 - stable/11/sys/boot/efi/loader
X-SVN-Group: stable-11
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: svn-src-stable-11@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: SVN commit messages for only the 11-stable src tree
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 30 Mar 2017 12:41:22 -0000

Author: dexuan
Date: Thu Mar 30 12:41:21 2017
New Revision: 316272
URL: https://svnweb.freebsd.org/changeset/base/316272

Log:
  MFC: 314547, 314770, 314828, 314891, 314956, 314962, 315235

  r314547
      loader.efi: reduce the size of the staging area if necessary

      The loader assumes physical memory in [2MB, 2MB + EFI_STAGING_SIZE)
      is Conventional Memory, but actually it may not, e.g. in the case of
      Hyper-V Generation-2 VM (i.e. UEFI VM) running on Windows Server
      2012 R2 host, there is a BootServiceData memory block at the address
      47.449MB and the memory is not writable.

      Without the patch, the loader will crash in efi_copy_finish():
      see PR 211746.

      The patch verifies the end of the staging area, and reduces its size
      if necessary. This way, the loader will not try to write into the
      BootServiceData memory any longer.

      Thank Marcel Moolenaar for helping me on this issue!

      The patch also allocates the staging area in the first 1GB memory.
      See the comment in the patch for this.

      PR: 211746
      Reviewed by: marcel, kib, sephe
      Approved by: sephe (mentor)
      Sponsored by: Microsoft
      Differential Revision: https://reviews.freebsd.org/D9686

  r314770
      loader.efi: fix recent UEFI-boot regression on physical machines

      This patch fixes my recent patch
      "loader.efi: reduce the size of the staging area if necessary",
      which causes EFI-boot failure on physical machines since Mar 2:
      on the host there is a 1MB LoaderData memory range, which splits
      the big Conventional Memory range into a small one (15MB) and a
      big one: the small one is too small to hold the staging area.

      We can actually use the LoaderData range safely, because when
      amd64_tramp -> efi_copy_finish() starts to run, we're almost at the
      very end of the efi loader code and we're going to "return" to the
      kernel entry, so we're pretty sure we won't access any loader data
      any more.

      For people who are interested in the details: please see
      https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211746#c22

      PS, some people also reported the regression happened to FreeBSD VM
      running on Bhyve in EFI mode. This patch should resolve it too,
      though I don't have such a setup to test.

      Reviewed by: sephe
      Approved by: sephe (mentor)
      Sponsored by: Microsoft
      Differential Revision: https://reviews.freebsd.org/D9904

  r314828
      loader.efi: fix an off-by-one bug in efi_verify_staging_size()

      Also remove the warning message: it may not be unusual to see the
      memory range containing 2MB is not of EfiConventionalMemory.

      Sponsored by: Microsoft

  r314891
      loader.efi: finally fix the off-by-one bug in efi_verify_staging_size()

      r314828(loader.efi: fix an off-by-one bug in efi_verify_staging_size())
      doesn't really fix the bug and this patch adds the missing part.

      It's a shame that I didn't make everything correct at the very
      beginning...

      Sponsored by: Microsoft

  r314956
      loader.efi: only reduce the size of the staging area on Hyper-V

      Doing this on physical hosts turns out to be problematic, e.g. see
      comment 24 and 28 in
      https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211746.

      To fix the real underlying issue correctly & thoroughly, IMO we need
      a relocatable kernel, but that would require a lot of complicated
      long term work:
      https://reviews.freebsd.org/D9686?id=25414#inline-56969

      For now, let's only apply efi_verify_staging_size() to VMs running
      on Hyper-V, and restore the old behavior on physical machines since
      that has been working for people for a long period of time, though
      that's potentially unsafe...

      Sponsored by: Microsoft

  r314962
      loader.efi: only include the machine/ header files on x86

      The 2 files may not exist on other archs like aarch64 and hence we
      can have a build failure there.

      Reported by: lwhsu
      Sponsored by: Microsoft

  r315235
      loader.efi: use stricter check for Hyper-V

      Some other hypervisors like Xen can pretend to be Hyper-V but
      obviously they can't implement all Hyper-V features. Let's make
      sure we're genuine Hyper-V here.

      Also fix some minor coding style issues.

      PR: 211746
      Sponsored by: Microsoft

  PR: 211746

Modified:
  stable/11/sys/boot/efi/loader/copy.c
Directory Properties:
  stable/11/   (props changed)

Modified: stable/11/sys/boot/efi/loader/copy.c
==============================================================================
--- stable/11/sys/boot/efi/loader/copy.c	Thu Mar 30 12:35:56 2017	(r316271)
+++ stable/11/sys/boot/efi/loader/copy.c	Thu Mar 30 12:41:21 2017	(r316272)
@@ -39,12 +39,135 @@ __FBSDID("$FreeBSD$");
 
 #include "loader_efi.h"
 
+#if defined(__i386__) || defined(__amd64__)
+#include
+#include
+
+/*
+ * The code is excerpted from sys/x86/x86/identcpu.c: identify_cpu(),
+ * identify_hypervisor(), and dev/hyperv/vmbus/hyperv.c: hyperv_identify().
+ */
+#define CPUID_LEAF_HV_MAXLEAF		0x40000000
+#define CPUID_LEAF_HV_INTERFACE		0x40000001
+#define CPUID_LEAF_HV_FEATURES		0x40000003
+#define CPUID_LEAF_HV_LIMITS		0x40000005
+#define CPUID_HV_IFACE_HYPERV		0x31237648	/* HV#1 */
+#define CPUID_HV_MSR_TIME_REFCNT	0x0002	/* MSR_HV_TIME_REF_COUNT */
+#define CPUID_HV_MSR_HYPERCALL		0x0020
+
+static int
+running_on_hyperv(void)
+{
+	char hv_vendor[16];
+	uint32_t regs[4];
+
+	do_cpuid(1, regs);
+	if ((regs[2] & CPUID2_HV) == 0)
+		return (0);
+
+	do_cpuid(CPUID_LEAF_HV_MAXLEAF, regs);
+	if (regs[0] < CPUID_LEAF_HV_LIMITS)
+		return (0);
+
+	((uint32_t *)&hv_vendor)[0] = regs[1];
+	((uint32_t *)&hv_vendor)[1] = regs[2];
+	((uint32_t *)&hv_vendor)[2] = regs[3];
+	hv_vendor[12] = '\0';
+	if (strcmp(hv_vendor, "Microsoft Hv") != 0)
+		return (0);
+
+	do_cpuid(CPUID_LEAF_HV_INTERFACE, regs);
+	if (regs[0] != CPUID_HV_IFACE_HYPERV)
+		return (0);
+
+	do_cpuid(CPUID_LEAF_HV_FEATURES, regs);
+	if ((regs[0] & CPUID_HV_MSR_HYPERCALL) == 0)
+		return (0);
+	if ((regs[0] & CPUID_HV_MSR_TIME_REFCNT) == 0)
+		return (0);
+
+	return (1);
+}
+
+#define KERNEL_PHYSICAL_BASE (2*1024*1024)
+
+static void
+efi_verify_staging_size(unsigned long *nr_pages)
+{
+	UINTN sz;
+	EFI_MEMORY_DESCRIPTOR *map, *p;
+	EFI_PHYSICAL_ADDRESS start, end;
+	UINTN key, dsz;
+	UINT32 dver;
+	EFI_STATUS status;
+	int i, ndesc;
+	unsigned long available_pages = 0;
+
+	sz = 0;
+	status = BS->GetMemoryMap(&sz, 0, &key, &dsz, &dver);
+	if (status != EFI_BUFFER_TOO_SMALL) {
+		printf("Can't determine memory map size\n");
+		return;
+	}
+
+	map = malloc(sz);
+	status = BS->GetMemoryMap(&sz, map, &key, &dsz, &dver);
+	if (EFI_ERROR(status)) {
+		printf("Can't read memory map\n");
+		goto out;
+	}
+
+	ndesc = sz / dsz;
+	for (i = 0, p = map; i < ndesc;
+	     i++, p = NextMemoryDescriptor(p, dsz)) {
+		start = p->PhysicalStart;
+		end = start + p->NumberOfPages * EFI_PAGE_SIZE;
+
+		if (KERNEL_PHYSICAL_BASE < start ||
+		    KERNEL_PHYSICAL_BASE >= end)
+			continue;
+
+		available_pages = p->NumberOfPages -
+			((KERNEL_PHYSICAL_BASE - start) >> EFI_PAGE_SHIFT);
+		break;
+	}
+
+	if (available_pages == 0) {
+		printf("Can't find valid memory map for staging area!\n");
+		goto out;
+	}
+
+	i++;
+	p = NextMemoryDescriptor(p, dsz);
+
+	for ( ; i < ndesc;
+	     i++, p = NextMemoryDescriptor(p, dsz)) {
+		if (p->Type != EfiConventionalMemory &&
+		    p->Type != EfiLoaderData)
+			break;
+
+		if (p->PhysicalStart != end)
+			break;
+
+		end = p->PhysicalStart + p->NumberOfPages * EFI_PAGE_SIZE;
+
+		available_pages += p->NumberOfPages;
+	}
+
+	if (*nr_pages > available_pages) {
+		printf("Staging area's size is reduced: %ld -> %ld!\n",
+		    *nr_pages, available_pages);
+		*nr_pages = available_pages;
+	}
+out:
+	free(map);
+}
+#endif /* __i386__ || __amd64__ */
+
 #ifndef EFI_STAGING_SIZE
 #define	EFI_STAGING_SIZE	64
 #endif
 
-#define	STAGE_PAGES	EFI_SIZE_TO_PAGES((EFI_STAGING_SIZE) * 1024 * 1024)
-
 EFI_PHYSICAL_ADDRESS	staging, staging_end;
 int			stage_offset_set = 0;
 ssize_t			stage_offset;
@@ -54,14 +177,37 @@ efi_copy_init(void)
 {
 	EFI_STATUS	status;
 
+	unsigned long nr_pages;
+
+	nr_pages = EFI_SIZE_TO_PAGES((EFI_STAGING_SIZE) * 1024 * 1024);
+
+#if defined(__i386__) || defined(__amd64__)
+	/*
+	 * We'll decrease nr_pages, if it's too big. Currently we only
+	 * apply this to FreeBSD VM running on Hyper-V. Why? Please see
+	 * https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211746#c28
+	 */
+	if (running_on_hyperv())
+		efi_verify_staging_size(&nr_pages);
+
+	/*
+	 * The staging area must reside in the the first 1GB physical
+	 * memory: see elf64_exec() in
+	 * boot/efi/loader/arch/amd64/elf64_freebsd.c.
+	 */
+	staging = 1024*1024*1024;
+	status = BS->AllocatePages(AllocateMaxAddress, EfiLoaderData,
+	    nr_pages, &staging);
+#else
 	status = BS->AllocatePages(AllocateAnyPages, EfiLoaderData,
-	    STAGE_PAGES, &staging);
+	    nr_pages, &staging);
+#endif
 	if (EFI_ERROR(status)) {
 		printf("failed to allocate staging area: %lu\n",
 		    EFI_ERROR_CODE(status));
 		return (status);
 	}
-	staging_end = staging + STAGE_PAGES * EFI_PAGE_SIZE;
+	staging_end = staging + nr_pages * EFI_PAGE_SIZE;
 
 #if defined(__aarch64__) || defined(__arm__)
 	/*