From owner-freebsd-virtualization@FreeBSD.ORG Mon Dec 23 22:01:46 2013 Return-Path: Delivered-To: virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 8ACF9E18; Mon, 23 Dec 2013 22:01:46 +0000 (UTC) Received: from mail-qe0-x22e.google.com (mail-qe0-x22e.google.com [IPv6:2607:f8b0:400d:c02::22e]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 39A99134B; Mon, 23 Dec 2013 22:01:46 +0000 (UTC) Received: by mail-qe0-f46.google.com with SMTP id a11so5640928qen.19 for ; Mon, 23 Dec 2013 14:01:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=oydzBh5XiXoLkOPYJ1OcCXckG+JXjQymc5DeSB0rQxo=; b=B7AYsunPqmoDIjTMyBKVouR/2f5EPhWYfD/jq97HoCFpEsIWGTmQlE2/fzfVzYSAGc sH4j6jFUVDHA+OwayddihVQPAUvoB7HD2zSq+uZgtv2s+O4bjq0cMYYAQB8oAmOwb/Bi +A2hOmXbI/UVCYQTtUpKPdc9gf5mfkqthCKADHa27pEQjBkG9hs5PCP/MFhplzZydEOx g0dNM5WHIM+D1CWoStHqeILP7g3ZA5Be9yRg1D77qSNOfkFMxlT0H8HFXQhINEH9ZmnR TiLqiIyKuESqCjU5QR5DtvxMYURYQ/+sfmfsIXh3fswPFJ4aOk6y9aieRNq4FQ9BEdDE X0HA== MIME-Version: 1.0 X-Received: by 10.224.124.134 with SMTP id u6mr46023045qar.79.1387836105378; Mon, 23 Dec 2013 14:01:45 -0800 (PST) Received: by 10.140.34.17 with HTTP; Mon, 23 Dec 2013 14:01:45 -0800 (PST) In-Reply-To: <201312231443.39282.jhb@freebsd.org> References: <201312231443.39282.jhb@freebsd.org> Date: Mon, 23 Dec 2013 14:01:45 -0800 Message-ID: Subject: Re: [PATCH] Support for S5 (soft power off) in bhyve From: Neel Natu To: John Baldwin Content-Type: text/plain; charset=ISO-8859-1 Cc: "freebsd-virtualization@freebsd.org" X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: "Discussion of various virtualization techniques FreeBSD supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Dec 2013 22:01:46 -0000 Hi John, This looks good - thanks for adding this support! I noticed that the RESET_REGISTER is not being advertised in the FACP flags field. So, perhaps the access to 0xCF9 that you saw during testing was triggered by cpu_reset_real() as opposed to AcpiReset()? best Neel On Mon, Dec 23, 2013 at 11:43 AM, John Baldwin wrote: > One minor nit I've found annoying is that there is no easy way to break out of > the loop in vmrun.sh. Breaking into the loader prompt and typing 'reboot' > does work, but it requires you to be on the console at the time. What I > really want is for 'shutdown -p' or 'poweroff' inside a guest to cleanly shut > down and exit the loop in vmrun.sh. To that end, I've implemented support for > a few more registers (such of which are non-optional in ACPI) including the > Reset Control register (0xcf9), ACPI Power Management 1 Event registers > (PM1_EVT) and the ACPI Power Management 1 Control register (PM1_CNT). I added > a valid _S5 package and catch writes of the value _S5 specifies to ask bhyve > to exit gracefully but with an exit code of 1 (so the loop in vmrun > terminates). Most of the PM1 support is very simple (it doesn't support > signalling anything for external events as most of them don't make sense). It > does perform a reset for writes to the 0xcf9 register however, and also > advertises 0xcf9 as the ACPI reset register. (Tested by commenting out the > current 0x64 hack.) > > One thing I had to do was make it possible for an I/O port to request a reset > or poweroff, so I added some #define's for additional return codes from > in/out handlers. I haven't gone through and changed any existing handlers to > use the new constants, but the new handlers make use of INOT_RESET and > INOUT_POWEROFF. Note that this also means that the current 0x64 hack could be > reimplemented as an in-out handler rather than a special case. I did not > explicitly catch VMEXIT_POWEROFF and use a custom exit value in the main loop > in bhyverun.c. Instead, I let it fall through to the default and do an > exit(1). The patch is all in bhyve itself: > > --- //depot/vendor/freebsd/src/usr.sbin/bhyve/Makefile > +++ //depot/user/jhb/bhyve/usr.sbin/bhyve/Makefile > @@ -10,7 +10,7 @@ > SRCS= acpi.c atpic.c bhyverun.c block_if.c consport.c dbgport.c elcr.c > SRCS+= inout.c legacy_irq.c mem.c mevent.c mptbl.c pci_ahci.c > SRCS+= pci_emul.c pci_hostbridge.c pci_lpc.c pci_passthru.c pci_virtio_block.c > -SRCS+= pci_virtio_net.c pci_uart.c pit_8254.c pmtmr.c post.c rtc.c > +SRCS+= pci_virtio_net.c pci_uart.c pit_8254.c pm.c pmtmr.c post.c rtc.c > SRCS+= uart_emul.c virtio.c xmsr.c spinup_ap.c > > .PATH: ${.CURDIR}/../../sys/amd64/vmm > --- //depot/vendor/freebsd/src/usr.sbin/bhyve/acpi.c > +++ //depot/user/jhb/bhyve/usr.sbin/bhyve/acpi.c > @@ -85,6 +85,8 @@ > #define BHYVE_ASL_SUFFIX ".aml" > #define BHYVE_ASL_COMPILER "/usr/sbin/iasl" > > +#define BHYVE_PM1A_EVT_ADDR 0x400 > +#define BHYVE_PM1A_CNT_ADDR 0x404 > #define BHYVE_PM_TIMER_ADDR 0x408 > > static int basl_keep_temps; > @@ -332,9 +344,11 @@ > EFPRINTF(fp, "[0001]\t\tACPI Disable Value : 00\n"); > EFPRINTF(fp, "[0001]\t\tS4BIOS Command : 00\n"); > EFPRINTF(fp, "[0001]\t\tP-State Control : 00\n"); > - EFPRINTF(fp, "[0004]\t\tPM1A Event Block Address : 00000000\n"); > + EFPRINTF(fp, "[0004]\t\tPM1A Event Block Address : %08X\n", > + BHYVE_PM1A_EVT_ADDR); > EFPRINTF(fp, "[0004]\t\tPM1B Event Block Address : 00000000\n"); > - EFPRINTF(fp, "[0004]\t\tPM1A Control Block Address : 00000000\n"); > + EFPRINTF(fp, "[0004]\t\tPM1A Control Block Address : %08X\n", > + BHYVE_PM1A_CNT_ADDR); > EFPRINTF(fp, "[0004]\t\tPM1B Control Block Address : 00000000\n"); > EFPRINTF(fp, "[0004]\t\tPM2 Control Block Address : 00000000\n"); > EFPRINTF(fp, "[0004]\t\tPM Timer Block Address : %08X\n", > @@ -397,10 +411,10 @@ > EFPRINTF(fp, "[0001]\t\tBit Width : 08\n"); > EFPRINTF(fp, "[0001]\t\tBit Offset : 00\n"); > EFPRINTF(fp, "[0001]\t\tEncoded Access Width : 01 [Byte Access:8]\n"); > - EFPRINTF(fp, "[0008]\t\tAddress : 0000000000000001\n"); > + EFPRINTF(fp, "[0008]\t\tAddress : 0000000000000CF9\n"); > EFPRINTF(fp, "\n"); > > - EFPRINTF(fp, "[0001]\t\tValue to cause reset : 00\n"); > + EFPRINTF(fp, "[0001]\t\tValue to cause reset : 06\n"); > EFPRINTF(fp, "[0003]\t\tReserved : 000000\n"); > EFPRINTF(fp, "[0008]\t\tFACS Address : 00000000%08X\n", > basl_acpi_base + FACS_OFFSET); > @@ -412,7 +426,8 @@ > EFPRINTF(fp, "[0001]\t\tBit Width : 20\n"); > EFPRINTF(fp, "[0001]\t\tBit Offset : 00\n"); > EFPRINTF(fp, "[0001]\t\tEncoded Access Width : 02 [Word Access:16]\n"); > - EFPRINTF(fp, "[0008]\t\tAddress : 0000000000000001\n"); > + EFPRINTF(fp, "[0008]\t\tAddress : 00000000%08X\n", > + BHYVE_PM1A_EVT_ADDR); > EFPRINTF(fp, "\n"); > > EFPRINTF(fp, > @@ -431,7 +446,8 @@ > EFPRINTF(fp, "[0001]\t\tBit Width : 10\n"); > EFPRINTF(fp, "[0001]\t\tBit Offset : 00\n"); > EFPRINTF(fp, "[0001]\t\tEncoded Access Width : 02 [Word Access:16]\n"); > - EFPRINTF(fp, "[0008]\t\tAddress : 0000000000000001\n"); > + EFPRINTF(fp, "[0008]\t\tAddress : 00000000%08X\n", > + BHYVE_PM1A_CNT_ADDR); > EFPRINTF(fp, "\n"); > > EFPRINTF(fp, > @@ -603,6 +619,11 @@ > EFPRINTF(fp, "DefinitionBlock (\"bhyve_dsdt.aml\", \"DSDT\", 2," > "\"BHYVE \", \"BVDSDT \", 0x00000001)\n"); > EFPRINTF(fp, "{\n"); > + EFPRINTF(fp, " Name (_S5, Package (0x02)\n"); > + EFPRINTF(fp, " {\n"); > + EFPRINTF(fp, " 0x05,\n"); > + EFPRINTF(fp, " Zero,\n"); > + EFPRINTF(fp, " })\n"); > EFPRINTF(fp, " Scope (_SB)\n"); > EFPRINTF(fp, " {\n"); > EFPRINTF(fp, " Device (PCI0)\n"); > --- //depot/vendor/freebsd/src/usr.sbin/bhyve/bhyverun.c > +++ //depot/user/jhb/bhyve/usr.sbin/bhyve/bhyverun.c > @@ -72,6 +72,7 @@ > #define VMEXIT_RESTART 2 /* restart current instruction */ > #define VMEXIT_ABORT 3 /* abort the vm run loop */ > #define VMEXIT_RESET 4 /* guest machine has reset */ > +#define VMEXIT_POWEROFF 5 /* guest machine has powered off */ > > #define MB (1024UL * 1024) > #define GB (1024UL * MB) > @@ -296,12 +297,17 @@ > return (vmexit_handle_notify(ctx, vme, pvcpu, eax)); > > error = emulate_inout(ctx, vcpu, in, port, bytes, &eax, strictio); > - if (error == 0 && in) > + if (error == INOUT_OK && in) > error = vm_set_register(ctx, vcpu, VM_REG_GUEST_RAX, eax); > > - if (error == 0) > + switch (error) { > + case INOUT_OK: > return (VMEXIT_CONTINUE); > - else { > + case INOUT_RESET: > + return (VMEXIT_RESET); > + case INOUT_POWEROFF: > + return (VMEXIT_POWEROFF); > + default: > fprintf(stderr, "Unhandled %s%c 0x%04x\n", > in ? "in" : "out", > bytes == 1 ? 'b' : (bytes == 2 ? 'w' : 'l'), port); > --- //depot/vendor/freebsd/src/usr.sbin/bhyve/inout.h > +++ //depot/user/jhb/bhyve/usr.sbin/bhyve/inout.h > @@ -33,6 +33,12 @@ > > struct vmctx; > > +/* Handler return values. */ > +#define INOUT_ERROR -1 > +#define INOUT_OK 0 > +#define INOUT_RESET 1 > +#define INOUT_POWEROFF 2 > + > typedef int (*inout_func_t)(struct vmctx *ctx, int vcpu, int in, int port, > int bytes, uint32_t *eax, void *arg); > > --- /dev/null > +++ //depot/user/jhb/bhyve/usr.sbin/bhyve/pm.c > @@ -0,0 +1,139 @@ > +/*- > + * Copyright (c) 2013 Advanced Computing Technologies LLC > + * Written by: John H. Baldwin > + * All rights reserved. > + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following conditions > + * are met: > + * 1. Redistributions of source code must retain the above copyright > + * notice, this list of conditions and the following disclaimer. > + * 2. Redistributions in binary form must reproduce the above copyright > + * notice, this list of conditions and the following disclaimer in the > + * documentation and/or other materials provided with the distribution. > + * > + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND > + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE > + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE > + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE > + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL > + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS > + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) > + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT > + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY > + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF > + * SUCH DAMAGE. > + */ > + > +#include > +__FBSDID("$FreeBSD"); > + > +#include > + > +#include "inout.h" > + > +#define PM1A_EVT_ADDR 0x400 > +#define PM1A_CNT_ADDR 0x404 > + > +/* > + * Reset Control register at I/O port 0xcf9. Bit 2 forces a system > + * reset when it transitions from 0 to 1. Bit 1 selects the type of > + * reset to attempt: 0 selects a "soft" reset, and 1 selects a "hard" > + * reset. > + */ > +static int > +reset_handler(struct vmctx *ctx, int vcpu, int in, int port, int bytes, > + uint32_t *eax, void *arg) > +{ > + static uint8_t reset_control; > + > + if (bytes != 1) > + return (-1); > + if (in) > + *eax = reset_control; > + else { > + reset_control = *eax; > + > + /* Treat hard and soft resets the same. */ > + if (reset_control & 0x4) > + return (INOUT_RESET); > + } > + return (0); > +} > +INOUT_PORT(reset_reg, 0xCF9, IOPORT_F_INOUT, reset_handler); > + > +/* > + * Power Management 1 Event Registers > + * > + * bhyve doesn't support any power management events currently, so the > + * status register always returns zero. The enable register preserves > + * its value but has no effect. > + */ > +static int > +pm1_status_handler(struct vmctx *ctx, int vcpu, int in, int port, int bytes, > + uint32_t *eax, void *arg) > +{ > + > + if (bytes != 2) > + return (-1); > + if (in) > + *eax = 0; > + return (0); > +} > + > +static int > +pm1_enable_handler(struct vmctx *ctx, int vcpu, int in, int port, int bytes, > + uint32_t *eax, void *arg) > +{ > + static uint16_t pm1_enable; > + > + if (bytes != 2) > + return (-1); > + if (in) > + *eax = pm1_enable; > + else > + pm1_enable = *eax; > + return (0); > +} > +INOUT_PORT(pm1_status, PM1A_EVT_ADDR, IOPORT_F_INOUT, pm1_status_handler); > +INOUT_PORT(pm1_enable, PM1A_EVT_ADDR + 2, IOPORT_F_INOUT, pm1_enable_handler); > + > +/* > + * Power Management 1 Control Register > + * > + * This is mostly unimplemented except that we wish to handle writes that > + * set SPL_EN to handle S5 (soft power off). > + */ > +#define PM1_SLP_TYP 0x1c00 > +#define PM1_SLP_EN 0x2000 > +#define PM1_ALWAYS_ZERO 0xc003 > + > +static int > +pm1_control_handler(struct vmctx *ctx, int vcpu, int in, int port, int bytes, > + uint32_t *eax, void *arg) > +{ > + static uint16_t pm1_control; > + > + if (bytes != 2) > + return (-1); > + if (in) > + *eax = pm1_control; > + else { > + /* > + * Various bits are write-only or reserved, so force them > + * to zero in pm1_control. > + */ > + pm1_control = *eax & ~(PM1_SLP_EN | PM1_ALWAYS_ZERO); > + > + /* > + * If SLP_EN is set, check for S5. Bhyve's _S5_ method > + * says that '5' should be stored in SLP_TYP for S5. > + */ > + if (*eax & PM1_SLP_EN) { > + if ((pm1_control & PM1_SLP_TYP) >> 10 == 5) > + return (INOUT_POWEROFF); > + } > + } > + return (0); > +} > +INOUT_PORT(pm1_control, PM1A_CNT_ADDR, IOPORT_F_INOUT, pm1_control_handler); > + > > > -- > John Baldwin > _______________________________________________ > freebsd-virtualization@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-virtualization > To unsubscribe, send any mail to "freebsd-virtualization-unsubscribe@freebsd.org"