Date: Wed, 31 Jul 2019 20:31:36 +0000 (UTC)
From: Leandro Lupori <luporl@FreeBSD.org>
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r350485 - head/stand/powerpc/ofw
Message-ID: <201907312031.x6VKVaaq097695@repo.freebsd.org>

Author: luporl
Date: Wed Jul 31 20:31:36 2019
New Revision: 350485
URL: https://svnweb.freebsd.org/changeset/base/350485

Log:
  [PPC64] Implement CAS

  Guest PPC OSs running under a hypervisor may communicate the features they
  support, in order for the hypervisor to expose a virtualized machine in the
  way the client (guest OS) expects (see LoPAPR 1.1 - B.6.2.3).

  This is done by calling the "/ibm,client-architecture-support" (CAS) method,
  informing supported features in option vectors.

  Until now, FreeBSD wasn't using CAS, but instead relied on hypervisor/QEMU's
  defaults.

  The problem is that, without CAS, it is very inconvenient to run POWER9 VMs
  on a POWER9 host running with radix enabled. This happens because, in this
  case, the QEMU default is to present the guest OS a dual MMU (HPT/RPT),
  instead of presenting a regular HPT MMU, as FreeBSD expects, resulting in an
  early panic. The known workarounds required either changing the host to
  disable radix or passing a flag to QEMU to run in a POWER8 compatible mode.

  With CAS, FreeBSD is now able to communicate that it wants an HPT MMU,
  independent of the host setup, which now makes FreeBSD work on
  POWER9/pseries, with KVM enabled and without hugepages (support added in a
  previous commit).

  As CAS is invoked through OpenFirmware's call-method interface, it needs to
  be performed early, when OpenFirmware is still operational. Besides, now
  that FDT is the default way to inspect the device tree on PPC, OFW
  call-method feature will be unavailable by default, when control is passed
  to the kernel. Because of this, the call to CAS is being performed at the
  loader, instead of at the kernel.

  To avoid regressions with old platforms, this change uses CAS only on
  POWER8/POWER9.
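
The POWER8/POWER9 restriction in the last paragraph of the log can be sketched as a masked comparison on the Processor Version Register (PVR). The constants below are the ones defined in cas.c in this commit; the helper name `cas_wanted` and any sample PVR values passed to it are hypothetical and only illustrate the check, they are not part of the change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PVR version values and mask, copied from cas.c in this commit. */
#define PVR_VER_P8E	0x004b0000u
#define PVR_VER_P8NVL	0x004c0000u
#define PVR_VER_P8	0x004d0000u
#define PVR_VER_P9	0x004e0000u
#define PVR_VER_MASK	0xffff0000u

/*
 * Hypothetical helper (not in the commit): returns true when the upper
 * 16 bits of the PVR name one of the cores for which CAS is attempted.
 * The lower 16 bits (revision) are ignored by the mask.
 */
static bool
cas_wanted(uint32_t pvr)
{
	switch (pvr & PVR_VER_MASK) {
	case PVR_VER_P8:
	case PVR_VER_P8E:
	case PVR_VER_P8NVL:
	case PVR_VER_P9:
		return (true);
	default:
		return (false);
	}
}
```

With this sketch, a call such as `assert(cas_wanted(0x004e1200u))` holds because the version field is 0x004e, while any value whose upper half is not one of the four listed versions yields false and the loader would skip CAS.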

  Reviewed by:	jhibbits
  Differential Revision:	https://reviews.freebsd.org/D20827

Added:
  head/stand/powerpc/ofw/cas.c   (contents, props changed)
Modified:
  head/stand/powerpc/ofw/Makefile
  head/stand/powerpc/ofw/main.c

Modified: head/stand/powerpc/ofw/Makefile
==============================================================================
--- head/stand/powerpc/ofw/Makefile	Wed Jul 31 20:23:10 2019	(r350484)
+++ head/stand/powerpc/ofw/Makefile	Wed Jul 31 20:31:36 2019	(r350485)
@@ -25,6 +25,11 @@ SRCS+=	ucmpdi2.c
 SRCS+=	ofwfdt.c
 .endif

+.if ${MACHINE_ARCH} == "powerpc64"
+SRCS+=	cas.c
+CFLAGS+=	-DCAS
+.endif
+
 HELP_FILES=	${FDTSRC}/help.fdt

 # Always add MI sources
@@ -34,7 +39,7 @@ HELP_FILES=	${FDTSRC}/help.fdt

 # load address. set in linker script
 RELOC?=		0x1C00000
-CFLAGS+=	-DRELOC=${RELOC}
+CFLAGS+=	-DRELOC=${RELOC} -g

 LDFLAGS=	-nostdlib -static -T ${.CURDIR}/ldscript.powerpc

Added: head/stand/powerpc/ofw/cas.c
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ head/stand/powerpc/ofw/cas.c	Wed Jul 31 20:31:36 2019	(r350485)
@@ -0,0 +1,225 @@
+/*-
+ * Copyright (c) 2019 Leandro Lupori
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+#include <openfirm.h>
+#include <stand.h>
+
+/* PVR */
+#define PVR_VER_P8E		0x004b0000
+#define PVR_VER_P8NVL		0x004c0000
+#define PVR_VER_P8		0x004d0000
+#define PVR_VER_P9		0x004e0000
+#define PVR_VER_MASK		0xffff0000
+
+/* loader version of kernel's CPU_MAXSIZE */
+#define MAX_CPUS		((uint32_t)256u)
+
+/* Option Vectors' settings */
+
+/* length of ignored OV */
+#define OV_IGN_LEN		0
+
+/* byte 1 (of any OV) */
+#define OV_IGN			0x80
+
+/* Option Vector 5 */
+
+/* byte 2 */
+#define OV5_LPAR		0x80
+#define OV5_SPLPAR		0x40
+#define OV5_DRMEM		0x20
+#define OV5_LP			0x10
+#define OV5_ALPHA_PART		0x08
+#define OV5_DMA_DELAY		0x04
+#define OV5_DONATE_CPU		0x02
+#define OV5_MSI			0x01
+
+/* 9-12: max cpus */
+#define OV5_MAX_CPUS(n)		((MAX_CPUS >> (3*8 - (n)*8)) & 0xff)
+
+/* 13-14: LoPAPR Level */
+#define LOPAPR_LEVEL		0x0101	/* 1.1 */
+#define OV5_LOPAPR_LEVEL(n)	((LOPAPR_LEVEL >> (8 - (n)*8)) & 0xff)
+
+/* byte 17: Platform Facilities */
+#define OV5_RNG			0x80
+#define OV5_COMP_ENG		0x40
+#define OV5_ENC_ENG		0x20
+
+/* byte 21: Sub-Processors */
+#define OV5_NO_SUBPROCS		0
+#define OV5_SUBPROCS		1
+
+/* byte 23: interrupt controller */
+#define OV5_INTC_XICS		0
+
+/* byte 24: MMU */
+#define OV5_MMU_HPT		0
+
+/* byte 25: HPT MMU Extensions */
+#define OV5_HPT_EXT_NONE	0
+
+/* byte 26: Radix MMU Extensions */
+#define OV5_RPT_EXT_NONE	0
+
+
+struct pvr {
+	uint32_t	mask;
+	uint32_t	val;
+};
+
+struct opt_vec_ignore {
+	char	data[2];
+} __packed;
+
+struct opt_vec4 {
+	char data[3];
+} __packed;
+
+struct opt_vec5 {
+	char data[27];
+} __packed;
+
+static struct ibm_arch_vec {
+	struct pvr		pvr_list[5];
+	uint8_t			num_opts;
+	struct opt_vec_ignore	vec1;
+	struct opt_vec_ignore	vec2;
+	struct opt_vec_ignore	vec3;
+	struct opt_vec4		vec4;
+	struct opt_vec5		vec5;
+} __packed ibm_arch_vec = {
+	/* pvr_list */ {
+		{ PVR_VER_MASK, PVR_VER_P8 },		/* POWER8 */
+		{ PVR_VER_MASK, PVR_VER_P8E },		/* POWER8E */
+		{ PVR_VER_MASK, PVR_VER_P8NVL },	/* POWER8NVL */
+		{ PVR_VER_MASK, PVR_VER_P9 },		/* POWER9 */
+		{ 0, 0xffffffffu }			/* terminator */
+	},
+	4,	/* num_opts (4 actually means 5 option vectors) */
+	{ OV_IGN_LEN, OV_IGN },		/* OV1 */
+	{ OV_IGN_LEN, OV_IGN },		/* OV2 */
+	{ OV_IGN_LEN, OV_IGN },		/* OV3 */
+	/* OV4 (can't be ignored) */ {
+		sizeof(struct opt_vec4) - 2,	/* length (n-2) */
+		0,
+		10 /* Minimum VP entitled capacity percentage * 100
+		    * (if absent assume 10%) */
+	},
+	/* OV5 */ {
+		sizeof(struct opt_vec5) - 2,	/* length (n-2) */
+		0,				/* don't ignore */
+		OV5_LPAR | OV5_SPLPAR | OV5_LP | OV5_MSI,
+		0,
+		0,	/* Cooperative Memory Over-commitment */
+		0,	/* Associativity Information Option */
+		0,	/* Binary Option Controls */
+		0,	/* Reserved */
+		0,	/* Reserved */
+		OV5_MAX_CPUS(0),
+		OV5_MAX_CPUS(1),		/* 10 */
+		OV5_MAX_CPUS(2),
+		OV5_MAX_CPUS(3),
+		OV5_LOPAPR_LEVEL(0),
+		OV5_LOPAPR_LEVEL(1),
+		0,	/* Reserved */
+		0,	/* Reserved */
+		0,	/* Platform Facilities */
+		0,	/* Reserved */
+		0,	/* Reserved */
+		0,	/* Reserved */		/* 20 */
+		OV5_NO_SUBPROCS,
+		0,	/* DRMEM_V2 */
+		OV5_INTC_XICS,
+		OV5_MMU_HPT,
+		OV5_HPT_EXT_NONE,
+		OV5_RPT_EXT_NONE
+	}
+};
+
+static __inline register_t
+mfpvr(void)
+{
+	register_t value;
+
+	__asm __volatile ("mfpvr %0" : "=r"(value));
+
+	return (value);
+}
+
+static __inline int
+ppc64_hv(void)
+{
+	int hv;
+
+	/* PSL_HV is bit 3 of 64-bit MSR */
+	__asm __volatile ("mfmsr %0\n\t"
+		"rldicl %0,%0,4,63" : "=r"(hv));
+
+	return (hv);
+}
+
+int
+ppc64_cas(void)
+{
+	int rc;
+	ihandle_t ihandle;
+	cell_t err;
+
+	/* Skip CAS when running on PowerNV */
+	if (!ppc64_hv())
+		return (0);
+
+	/* Perform CAS only for POWER8 and later cores */
+	switch (mfpvr() & PVR_VER_MASK) {
+		case PVR_VER_P8:
+		case PVR_VER_P8E:
+		case PVR_VER_P8NVL:
+		case PVR_VER_P9:
+			break;
+		default:
+			return (0);
+	}
+
+	ihandle = OF_open("/");
+	if (ihandle == -1) {
+		printf("cas: failed to open / node\n");
+		return (-1);
+	}
+
+	if (rc = OF_call_method("ibm,client-architecture-support",
+	    ihandle, 1, 1, &ibm_arch_vec, &err))
+		printf("cas: failed to call CAS method\n");
+	else if (err) {
+		printf("cas: error: 0x%08lX\n", err);
+		rc = -1;
+	}
+
+	OF_close(ihandle);
+	return (rc);
+}

Modified: head/stand/powerpc/ofw/main.c
==============================================================================
--- head/stand/powerpc/ofw/main.c	Wed Jul 31 20:23:10 2019	(r350484)
+++ head/stand/powerpc/ofw/main.c	Wed Jul 31 20:31:36 2019	(r350485)
@@ -89,6 +89,21 @@ memsize(void)
 	return (memsz);
 }

+#ifdef CAS
+extern int ppc64_cas(void);
+
+static int
+ppc64_autoload(void)
+{
+	const char *cas;
+
+	if ((cas = getenv("cas")) && cas[0] == '1')
+		if (ppc64_cas() != 0)
+			return (-1);
+	return (ofw_autoload());
+}
+#endif
+
 int
 main(int (*openfirm)(void *))
 {
@@ -169,7 +184,12 @@ main(int (*openfirm)(void *))
 	archsw.arch_copyin = ofw_copyin;
 	archsw.arch_copyout = ofw_copyout;
 	archsw.arch_readin = ofw_readin;
+#ifdef CAS
+	setenv("cas", "1", 0);
+	archsw.arch_autoload = ppc64_autoload;
+#else
 	archsw.arch_autoload = ofw_autoload;
+#endif

 	interact();			/* doesn't return */
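
Two pieces of arithmetic in cas.c are easy to misread in diff form: the OV5_MAX_CPUS(n) / OV5_LOPAPR_LEVEL(n) macros, which emit a multi-byte value one byte at a time with the most significant byte first, and the `rldicl %0,%0,4,63` instruction in ppc64_hv(), which rotates the MSR left by 4 and keeps only the lowest bit. The following is a minimal host-side sketch of both, assuming the macro definitions shown in the diff; the function names `ov5_max_cpus_byte`, `ov5_lopapr_level_byte` and `msr_hv_bit` are hypothetical and do not appear in the commit.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CPUS	((uint32_t)256u)	/* as in cas.c */
#define LOPAPR_LEVEL	0x0101			/* as in cas.c (1.1) */

/*
 * Same arithmetic as OV5_MAX_CPUS(n): byte n of the 32-bit MAX_CPUS,
 * where n == 0 selects the most significant byte.
 */
static uint8_t
ov5_max_cpus_byte(unsigned int n)
{
	assert(n < 4);
	return ((uint8_t)((MAX_CPUS >> (3 * 8 - n * 8)) & 0xff));
}

/* Same arithmetic as OV5_LOPAPR_LEVEL(n): byte n of the 16-bit level. */
static uint8_t
ov5_lopapr_level_byte(unsigned int n)
{
	assert(n < 2);
	return ((uint8_t)((LOPAPR_LEVEL >> (8 - n * 8)) & 0xff));
}

/*
 * Portable equivalent of "rldicl rD,rS,4,63" on a 64-bit value:
 * rotate left by 4, then clear everything except the lowest bit.
 * This selects bit 3 in IBM (MSB = 0) numbering, i.e. 1 << 60.
 */
static int
msr_hv_bit(uint64_t msr)
{
	uint64_t rot = (msr << 4) | (msr >> 60);

	return ((int)(rot & 1));
}
```

Under these definitions, MAX_CPUS = 256 is written into OV5 bytes 9-12 as 00 00 01 00 and the LoPAPR level as 01 01, and msr_hv_bit() returns 1 only when bit 60 (counting from the least significant bit) of its argument is set.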