From owner-freebsd-virtualization@FreeBSD.ORG Mon Dec 23 22:08:56 2013 Return-Path: Delivered-To: virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4D7B8FA2 for ; Mon, 23 Dec 2013 22:08:56 +0000 (UTC) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id E68D51380 for ; Mon, 23 Dec 2013 22:08:55 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id F3CD1B917; Mon, 23 Dec 2013 17:08:54 -0500 (EST) From: John Baldwin To: Neel Natu Subject: Re: [PATCH] Support for S5 (soft power off) in bhyve Date: Mon, 23 Dec 2013 17:08:45 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.4-CBSD-20130906; KDE/4.5.5; amd64; ; ) References: <201312231443.39282.jhb@freebsd.org> In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201312231708.45431.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 23 Dec 2013 17:08:55 -0500 (EST) Cc: "freebsd-virtualization@freebsd.org" X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: "Discussion of various virtualization techniques FreeBSD supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Dec 2013 22:08:56 -0000 On Monday, December 23, 2013 5:01:45 pm Neel Natu wrote: > Hi John, > > This looks good - thanks for adding this support! > > I noticed that the RESET_REGISTER is not being advertised in the FACP > flags field. So, perhaps the access to 0xCF9 that you saw during > testing was triggered by cpu_reset_real() as opposed to AcpiReset()? Yeah, that might explain why I had to comment out the 0x64 check in bhyverun.c directly as AcpiReset() should take precedence before cpu_reset_real() is invoked. I'll retest that tomorrow to make sure I have the FACP correct. > best > Neel > > On Mon, Dec 23, 2013 at 11:43 AM, John Baldwin wrote: > > One minor nit I've found annoying is that there is no easy way to break out of > > the loop in vmrun.sh. Breaking into the loader prompt and typing 'reboot' > > does work, but it requires you to be on the console at the time. What I > > really want is for 'shutdown -p' or 'poweroff' inside a guest to cleanly shut > > down and exit the loop in vmrun.sh. To that end, I've implemented support for > > a few more registers (such of which are non-optional in ACPI) including the > > Reset Control register (0xcf9), ACPI Power Management 1 Event registers > > (PM1_EVT) and the ACPI Power Management 1 Control register (PM1_CNT). I added > > a valid _S5 package and catch writes of the value _S5 specifies to ask bhyve > > to exit gracefully but with an exit code of 1 (so the loop in vmrun > > terminates). Most of the PM1 support is very simple (it doesn't support > > signalling anything for external events as most of them don't make sense). It > > does perform a reset for writes to the 0xcf9 register however, and also > > advertises 0xcf9 as the ACPI reset register. (Tested by commenting out the > > current 0x64 hack.) > > > > One thing I had to do was make it possible for an I/O port to request a reset > > or poweroff, so I added some #define's for additional return codes from > > in/out handlers. I haven't gone through and changed any existing handlers to > > use the new constants, but the new handlers make use of INOT_RESET and > > INOUT_POWEROFF. Note that this also means that the current 0x64 hack could be > > reimplemented as an in-out handler rather than a special case. I did not > > explicitly catch VMEXIT_POWEROFF and use a custom exit value in the main loop > > in bhyverun.c. Instead, I let it fall through to the default and do an > > exit(1). The patch is all in bhyve itself: > > > > --- //depot/vendor/freebsd/src/usr.sbin/bhyve/Makefile > > +++ //depot/user/jhb/bhyve/usr.sbin/bhyve/Makefile > > @@ -10,7 +10,7 @@ > > SRCS= acpi.c atpic.c bhyverun.c block_if.c consport.c dbgport.c elcr.c > > SRCS+= inout.c legacy_irq.c mem.c mevent.c mptbl.c pci_ahci.c > > SRCS+= pci_emul.c pci_hostbridge.c pci_lpc.c pci_passthru.c pci_virtio_block.c > > -SRCS+= pci_virtio_net.c pci_uart.c pit_8254.c pmtmr.c post.c rtc.c > > +SRCS+= pci_virtio_net.c pci_uart.c pit_8254.c pm.c pmtmr.c post.c rtc.c > > SRCS+= uart_emul.c virtio.c xmsr.c spinup_ap.c > > > > .PATH: ${.CURDIR}/../../sys/amd64/vmm > > --- //depot/vendor/freebsd/src/usr.sbin/bhyve/acpi.c > > +++ //depot/user/jhb/bhyve/usr.sbin/bhyve/acpi.c > > @@ -85,6 +85,8 @@ > > #define BHYVE_ASL_SUFFIX ".aml" > > #define BHYVE_ASL_COMPILER "/usr/sbin/iasl" > > > > +#define BHYVE_PM1A_EVT_ADDR 0x400 > > +#define BHYVE_PM1A_CNT_ADDR 0x404 > > #define BHYVE_PM_TIMER_ADDR 0x408 > > > > static int basl_keep_temps; > > @@ -332,9 +344,11 @@ > > EFPRINTF(fp, "[0001]\t\tACPI Disable Value : 00\n"); > > EFPRINTF(fp, "[0001]\t\tS4BIOS Command : 00\n"); > > EFPRINTF(fp, "[0001]\t\tP-State Control : 00\n"); > > - EFPRINTF(fp, "[0004]\t\tPM1A Event Block Address : 00000000\n"); > > + EFPRINTF(fp, "[0004]\t\tPM1A Event Block Address : %08X\n", > > + BHYVE_PM1A_EVT_ADDR); > > EFPRINTF(fp, "[0004]\t\tPM1B Event Block Address : 00000000\n"); > > - EFPRINTF(fp, "[0004]\t\tPM1A Control Block Address : 00000000\n"); > > + EFPRINTF(fp, "[0004]\t\tPM1A Control Block Address : %08X\n", > > + BHYVE_PM1A_CNT_ADDR); > > EFPRINTF(fp, "[0004]\t\tPM1B Control Block Address : 00000000\n"); > > EFPRINTF(fp, "[0004]\t\tPM2 Control Block Address : 00000000\n"); > > EFPRINTF(fp, "[0004]\t\tPM Timer Block Address : %08X\n", > > @@ -397,10 +411,10 @@ > > EFPRINTF(fp, "[0001]\t\tBit Width : 08\n"); > > EFPRINTF(fp, "[0001]\t\tBit Offset : 00\n"); > > EFPRINTF(fp, "[0001]\t\tEncoded Access Width : 01 [Byte Access:8]\n"); > > - EFPRINTF(fp, "[0008]\t\tAddress : 0000000000000001\n"); > > + EFPRINTF(fp, "[0008]\t\tAddress : 0000000000000CF9\n"); > > EFPRINTF(fp, "\n"); > > > > - EFPRINTF(fp, "[0001]\t\tValue to cause reset : 00\n"); > > + EFPRINTF(fp, "[0001]\t\tValue to cause reset : 06\n"); > > EFPRINTF(fp, "[0003]\t\tReserved : 000000\n"); > > EFPRINTF(fp, "[0008]\t\tFACS Address : 00000000%08X\n", > > basl_acpi_base + FACS_OFFSET); > > @@ -412,7 +426,8 @@ > > EFPRINTF(fp, "[0001]\t\tBit Width : 20\n"); > > EFPRINTF(fp, "[0001]\t\tBit Offset : 00\n"); > > EFPRINTF(fp, "[0001]\t\tEncoded Access Width : 02 [Word Access:16]\n"); > > - EFPRINTF(fp, "[0008]\t\tAddress : 0000000000000001\n"); > > + EFPRINTF(fp, "[0008]\t\tAddress : 00000000%08X\n", > > + BHYVE_PM1A_EVT_ADDR); > > EFPRINTF(fp, "\n"); > > > > EFPRINTF(fp, > > @@ -431,7 +446,8 @@ > > EFPRINTF(fp, "[0001]\t\tBit Width : 10\n"); > > EFPRINTF(fp, "[0001]\t\tBit Offset : 00\n"); > > EFPRINTF(fp, "[0001]\t\tEncoded Access Width : 02 [Word Access:16]\n"); > > - EFPRINTF(fp, "[0008]\t\tAddress : 0000000000000001\n"); > > + EFPRINTF(fp, "[0008]\t\tAddress : 00000000%08X\n", > > + BHYVE_PM1A_CNT_ADDR); > > EFPRINTF(fp, "\n"); > > > > EFPRINTF(fp, > > @@ -603,6 +619,11 @@ > > EFPRINTF(fp, "DefinitionBlock (\"bhyve_dsdt.aml\", \"DSDT\", 2," > > "\"BHYVE \", \"BVDSDT \", 0x00000001)\n"); > > EFPRINTF(fp, "{\n"); > > + EFPRINTF(fp, " Name (_S5, Package (0x02)\n"); > > + EFPRINTF(fp, " {\n"); > > + EFPRINTF(fp, " 0x05,\n"); > > + EFPRINTF(fp, " Zero,\n"); > > + EFPRINTF(fp, " })\n"); > > EFPRINTF(fp, " Scope (_SB)\n"); > > EFPRINTF(fp, " {\n"); > > EFPRINTF(fp, " Device (PCI0)\n"); > > --- //depot/vendor/freebsd/src/usr.sbin/bhyve/bhyverun.c > > +++ //depot/user/jhb/bhyve/usr.sbin/bhyve/bhyverun.c > > @@ -72,6 +72,7 @@ > > #define VMEXIT_RESTART 2 /* restart current instruction */ > > #define VMEXIT_ABORT 3 /* abort the vm run loop */ > > #define VMEXIT_RESET 4 /* guest machine has reset */ > > +#define VMEXIT_POWEROFF 5 /* guest machine has powered off */ > > > > #define MB (1024UL * 1024) > > #define GB (1024UL * MB) > > @@ -296,12 +297,17 @@ > > return (vmexit_handle_notify(ctx, vme, pvcpu, eax)); > > > > error = emulate_inout(ctx, vcpu, in, port, bytes, &eax, strictio); > > - if (error == 0 && in) > > + if (error == INOUT_OK && in) > > error = vm_set_register(ctx, vcpu, VM_REG_GUEST_RAX, eax); > > > > - if (error == 0) > > + switch (error) { > > + case INOUT_OK: > > return (VMEXIT_CONTINUE); > > - else { > > + case INOUT_RESET: > > + return (VMEXIT_RESET); > > + case INOUT_POWEROFF: > > + return (VMEXIT_POWEROFF); > > + default: > > fprintf(stderr, "Unhandled %s%c 0x%04x\n", > > in ? "in" : "out", > > bytes == 1 ? 'b' : (bytes == 2 ? 'w' : 'l'), port); > > --- //depot/vendor/freebsd/src/usr.sbin/bhyve/inout.h > > +++ //depot/user/jhb/bhyve/usr.sbin/bhyve/inout.h > > @@ -33,6 +33,12 @@ > > > > struct vmctx; > > > > +/* Handler return values. */ > > +#define INOUT_ERROR -1 > > +#define INOUT_OK 0 > > +#define INOUT_RESET 1 > > +#define INOUT_POWEROFF 2 > > + > > typedef int (*inout_func_t)(struct vmctx *ctx, int vcpu, int in, int port, > > int bytes, uint32_t *eax, void *arg); > > > > --- /dev/null > > +++ //depot/user/jhb/bhyve/usr.sbin/bhyve/pm.c > > @@ -0,0 +1,139 @@ > > +/*- > > + * Copyright (c) 2013 Advanced Computing Technologies LLC > > + * Written by: John H. Baldwin > > + * All rights reserved. > > + * > > + * Redistribution and use in source and binary forms, with or without > > + * modification, are permitted provided that the following conditions > > + * are met: > > + * 1. Redistributions of source code must retain the above copyright > > + * notice, this list of conditions and the following disclaimer. > > + * 2. Redistributions in binary form must reproduce the above copyright > > + * notice, this list of conditions and the following disclaimer in the > > + * documentation and/or other materials provided with the distribution. > > + * > > + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND > > + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE > > + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE > > + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE > > + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL > > + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS > > + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) > > + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT > > + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY > > + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF > > + * SUCH DAMAGE. > > + */ > > + > > +#include > > +__FBSDID("$FreeBSD"); > > + > > +#include > > + > > +#include "inout.h" > > + > > +#define PM1A_EVT_ADDR 0x400 > > +#define PM1A_CNT_ADDR 0x404 > > + > > +/* > > + * Reset Control register at I/O port 0xcf9. Bit 2 forces a system > > + * reset when it transitions from 0 to 1. Bit 1 selects the type of > > + * reset to attempt: 0 selects a "soft" reset, and 1 selects a "hard" > > + * reset. > > + */ > > +static int > > +reset_handler(struct vmctx *ctx, int vcpu, int in, int port, int bytes, > > + uint32_t *eax, void *arg) > > +{ > > + static uint8_t reset_control; > > + > > + if (bytes != 1) > > + return (-1); > > + if (in) > > + *eax = reset_control; > > + else { > > + reset_control = *eax; > > + > > + /* Treat hard and soft resets the same. */ > > + if (reset_control & 0x4) > > + return (INOUT_RESET); > > + } > > + return (0); > > +} > > +INOUT_PORT(reset_reg, 0xCF9, IOPORT_F_INOUT, reset_handler); > > + > > +/* > > + * Power Management 1 Event Registers > > + * > > + * bhyve doesn't support any power management events currently, so the > > + * status register always returns zero. The enable register preserves > > + * its value but has no effect. > > + */ > > +static int > > +pm1_status_handler(struct vmctx *ctx, int vcpu, int in, int port, int bytes, > > + uint32_t *eax, void *arg) > > +{ > > + > > + if (bytes != 2) > > + return (-1); > > + if (in) > > + *eax = 0; > > + return (0); > > +} > > + > > +static int > > +pm1_enable_handler(struct vmctx *ctx, int vcpu, int in, int port, int bytes, > > + uint32_t *eax, void *arg) > > +{ > > + static uint16_t pm1_enable; > > + > > + if (bytes != 2) > > + return (-1); > > + if (in) > > + *eax = pm1_enable; > > + else > > + pm1_enable = *eax; > > + return (0); > > +} > > +INOUT_PORT(pm1_status, PM1A_EVT_ADDR, IOPORT_F_INOUT, pm1_status_handler); > > +INOUT_PORT(pm1_enable, PM1A_EVT_ADDR + 2, IOPORT_F_INOUT, pm1_enable_handler); > > + > > +/* > > + * Power Management 1 Control Register > > + * > > + * This is mostly unimplemented except that we wish to handle writes that > > + * set SPL_EN to handle S5 (soft power off). > > + */ > > +#define PM1_SLP_TYP 0x1c00 > > +#define PM1_SLP_EN 0x2000 > > +#define PM1_ALWAYS_ZERO 0xc003 > > + > > +static int > > +pm1_control_handler(struct vmctx *ctx, int vcpu, int in, int port, int bytes, > > + uint32_t *eax, void *arg) > > +{ > > + static uint16_t pm1_control; > > + > > + if (bytes != 2) > > + return (-1); > > + if (in) > > + *eax = pm1_control; > > + else { > > + /* > > + * Various bits are write-only or reserved, so force them > > + * to zero in pm1_control. > > + */ > > + pm1_control = *eax & ~(PM1_SLP_EN | PM1_ALWAYS_ZERO); > > + > > + /* > > + * If SLP_EN is set, check for S5. Bhyve's _S5_ method > > + * says that '5' should be stored in SLP_TYP for S5. > > + */ > > + if (*eax & PM1_SLP_EN) { > > + if ((pm1_control & PM1_SLP_TYP) >> 10 == 5) > > + return (INOUT_POWEROFF); > > + } > > + } > > + return (0); > > +} > > +INOUT_PORT(pm1_control, PM1A_CNT_ADDR, IOPORT_F_INOUT, pm1_control_handler); > > + > > > > > > -- > > John Baldwin > > _______________________________________________ > > freebsd-virtualization@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-virtualization > > To unsubscribe, send any mail to "freebsd-virtualization-unsubscribe@freebsd.org" > -- John Baldwin