From: Takuya ASADA
Date: Wed, 27 Mar 2013 18:58:48 +0900
Subject: suspend/resume on BHyVe
To: Neel Natu
Cc: "freebsd-virtualization@freebsd.org"

Hi,

I have been discussing this project with Iori since last year, and I am
now suggesting that he apply to Google Summer of Code '13 with it.
(GSoC '13 will start next month.)

> > For this, I think those below must be implemented.
> >
> > - virtual machine state command interface
> > - saving registers per CPU
> > - dumping physical memory
> > - saving virt-io and other device emulation state
> >
> > To save registers, the sysctl used in bhyvectl (previously the
> > vmmctl command) is helpful,

Maybe he meant ioctl.

> > however, its interface is not good, because getting a register value
> > costs one sysctl call, and so one context switch, per register, and
> > getting all registers that way is not efficient.
> > I think it is preferable to make a struct with a boolean field per
> > register, set or unset to tell the kernel which registers it should
> > return, and have the kernel return that state with struct vmxctx.
> >
> > struct vmxctx looks like this:
> >
> > struct vmxctx {
> >         register_t tmpstk[32];  /* vmx_return() stack */
> >         register_t tmpstktop;
> >
> >         register_t guest_rdi;   /* Guest state */
> >         register_t guest_rsi;
> >         :
> >         :

It looks like we don't really have to take care of the register values
in the VMCS here; the registers in vmxctx are enough (described below).
Then, how about adding a vmxctx dumping ioctl?
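To make the "one call returns many registers" idea concrete, here is a
minimal userland sketch in C. It is only an illustration under stated
assumptions: the names sketch_reg, sketch_ctx, sketch_getregs and
sketch_fetch_regs are hypothetical and are not part of vmm.ko or
bhyvectl, the register list is a tiny stand-in for the guest state in
struct vmxctx, and a bitmask is used in place of per-register boolean
fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register indices; NOT the real vmm.ko numbering. */
enum sketch_reg { SK_RDI, SK_RSI, SK_RDX, SK_RCX, SK_REG_COUNT };

/* Stand-in for the guest-register part of struct vmxctx (assumption). */
struct sketch_ctx {
	uint64_t regs[SK_REG_COUNT];
};

/* Hypothetical request: one bit per register the caller wants back. */
struct sketch_getregs {
	uint64_t mask;                /* in: bit i set => return register i */
	uint64_t vals[SK_REG_COUNT];  /* out: written only where the bit is set */
};

/*
 * Copy every requested register in a single call, so the caller pays for
 * one kernel crossing instead of one per register.
 * Returns the number of registers copied.
 */
static int
sketch_fetch_regs(const struct sketch_ctx *ctx, struct sketch_getregs *req)
{
	int copied = 0;

	assert(ctx != NULL && req != NULL);
	for (int i = 0; i < SK_REG_COUNT; i++) {
		if (req->mask & (UINT64_C(1) << i)) {
			req->vals[i] = ctx->regs[i];
			copied++;
		}
	}
	return (copied);
}
```

A caller would set the bits for the registers it wants, issue one
request, and read only the slots whose bits it set. Setting all bits
gives the "dump the whole vmxctx" behaviour suggested above.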

> > And, considering memory dump, /dev/vmm/vmname is a file that is a map
> > of guest memory, so a memory dump doesn't seem hard: just stop the
> > vm, write back all guest cache, and copy the memory file to a
> > regular file.
> >
> > Finally, I don't know much about device state, but I think there
> > must be some state to be saved, like the network stack.
> >
> > I'm not sure about what I wrote above, so I would appreciate your
> > ideas and suggestions.
>
> I think that you are on the right track.
>
> A brute force way of figuring out all the state that must be saved is
> to look at all the initialization functions that are called when a vm
> and a vcpu are created. So, this would be vm_create() and vcpu_init()
> in the kernel module.
>
> There is also the hardware assist state that is maintained by the
> processor (VT-x or SVM) and this includes things like guest
> interruptibility, guest run state etc. I am assuming that it would be
> sufficient to save the VMCS page after telling the processor to flush
> any state it may be caching on chip.

I think just dumping the whole VMCS page after executing the VMCLEAR
instruction is the easiest way to do this.
(I also considered dumping only the necessary VMCS values with the
VMREAD instruction, but that makes it easy to break guest state by
mistake, and we don't gain any advantage by doing it that way.)
So maybe we need a VMCS dumping ioctl here.

> There is also emulated pci bus, virtio devices and legacy isa device
> state that would need to be saved by the userspace 'bhyve' process.

What operations are necessary for virtio devices to suspend/resume?
Maybe dump all the rings of the devices?
They don't have registers, right?

> And finally there is the matter of how to communicate with 'bhyve'
> process that it needs to suspend the virtual machine and write its
> state to disk - perhaps a signal would be good enough place to start.

How about this idea: bhyvectl sends a VM_SUSPEND ioctl.
If the guest is in VMX non-root mode, the VM_SUSPEND ioctl handler sends
an IPI to interrupt the guest thread.
The guest thread then breaks out of the vmx_run() loop and exits to
userland with exitcode VM_EXITCODE_SUSPEND.

Or, if the guest is not inside the VM_RUN ioctl but doing userland work
(such as running a virtio host-side driver), maybe you just need to wait
until the bhyve process sends the next VM_RUN ioctl.
When bhyve sends that VM_RUN ioctl, vmm.ko should not perform a VMEnter.
It should just return VM_EXITCODE_SUSPEND.

In both cases, vmm.ko returns VM_EXITCODE_SUSPEND in the end.
The bhyve process can then perform the suspend action in its
VM_EXITCODE_SUSPEND handler.
I think this is simple.

> This certainly sounds like an interesting and challenging project and
> we would be happy to help in any way we can.
>
> best
> Neel
>
> > Thanks, Iori.
> > _______________________________________________
> > freebsd-virtualization@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-virtualization
> > To unsubscribe, send any mail to "freebsd-virtualization-unsubscribe@freebsd.org"
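
The two VM_SUSPEND cases proposed above can be modelled as a small
userland state machine in C. This is a rough sketch under stated
assumptions, not vmm.ko code: every name here (sketch_vm,
sketch_vm_suspend, sketch_vm_run_begin, sketch_vm_run_end,
SK_EXIT_SUSPEND) is hypothetical, the IPI is represented by a counter,
and the locking or atomics a real kernel implementation would need are
left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical exit codes; only the idea of a SUSPEND code is from the thread. */
enum sketch_exitcode { SK_EXIT_OTHER, SK_EXIT_SUSPEND };

struct sketch_vm {
	bool suspend_requested; /* set by the suspend request */
	bool in_guest;          /* vcpu is in VMX non-root mode */
	int  ipis_sent;         /* stands in for the IPI to the guest thread */
	int  guest_entries;     /* number of simulated VM entries */
};

/* Models the suspend request: mark it pending and kick a running vcpu. */
static void
sketch_vm_suspend(struct sketch_vm *vm)
{
	assert(vm != NULL);
	vm->suspend_requested = true;
	if (vm->in_guest)
		vm->ipis_sent++;    /* force the vcpu out of the guest */
}

/*
 * Models the start of a run request. Returns false without entering the
 * guest when a suspend is pending; the caller then reports
 * SK_EXIT_SUSPEND to userland.
 */
static bool
sketch_vm_run_begin(struct sketch_vm *vm)
{
	assert(vm != NULL);
	if (vm->suspend_requested)
		return (false);
	vm->in_guest = true;
	vm->guest_entries++;
	return (true);
}

/* Models the point where the run loop regains control after a VM exit. */
static enum sketch_exitcode
sketch_vm_run_end(struct sketch_vm *vm)
{
	assert(vm != NULL);
	vm->in_guest = false;
	return (vm->suspend_requested ? SK_EXIT_SUSPEND : SK_EXIT_OTHER);
}
```

In both paths userland ends up seeing the same suspend exit code: a vcpu
that is in the guest is kicked and reports it from sketch_vm_run_end(),
while a vcpu doing userland work is refused entry by
sketch_vm_run_begin() on its next run request.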