From: John Baldwin
To: Neel Natu
Cc: "freebsd-virtualization@freebsd.org"
Subject: Re: Panic starting a bhyve guest after resume
Date: Fri, 20 Dec 2013 17:23:46 -0500
Message-Id: <201312201723.46978.jhb@freebsd.org>
References: <201312121511.38608.jhb@freebsd.org> <201312131709.20264.jhb@freebsd.org>
List-Id: "Discussion of various virtualization techniques FreeBSD supports."

On Friday, December 13, 2013 9:28:29 pm Neel Natu wrote:
> Hi John,
>
> On Fri, Dec 13, 2013 at 2:09 PM, John Baldwin wrote:
> > On Thursday, December 12, 2013 4:00:08 pm Neel Natu wrote:
> >> Hi John,
> >>
> >> On Thu, Dec 12, 2013 at 12:11 PM, John Baldwin wrote:
> >> > If I suspend and resume my laptop and then try to start a guest after the
> >> > resume, I get an odd panic.  It generates a privileged instruction fault (in
> >> > kernel mode) for 'vmclear'.  I've checked CR4 and it claims that VMXE is set.
> >> > I don't have any other ideas off the top of my head on what I should be poking
> >> > at.  It looks like we read a bunch of MSRs in vmx_init(), but we don't write
> >> > to them, and all vmx_enable() does on each CPU is set VMXE in CR4 from what I
> >> > can tell.
> >> >
> >>
> >> It also does a "vmxon" on each logical cpu which may also need to be
> >> done after a resume.
> >
> > Ah, yes it does.  That was sufficient both for starting a new guest after
> > resume and even doing a suspend/resume while a guest was active (and the
> > guest continued to run fine).  I have a hacky patch for this.  One, it
> > includes both a suspend and resume hook for VMM, though for my testing I only
> > needed a resume hook to invoke vmxon.  Second, the name of vmx_resume2()
> > is a total hack (because vmx_resume() was already taken).  I think for now
> > if I were to commit this, I'd just add the resume hook and maybe call the
> > Intel method vmx_reset() or vmx_restore()?
> >
> > http://people.freebsd.org/~jhb/patches/bhyve_resume.patch
> >
>
> There seems to be a race after the APs are restarted and before
> 'vmm_resume_p()' where it would be problematic to execute a VMX
> instruction.
>
> Perhaps we should enable VMX on each cpu before they return to the
> interrupted code?

I've updated the patch at the URL above to do just that.  This also works
in my testing.

-- 
John Baldwin
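
For readers following the thread, the failure and the fix can be illustrated with a small user-space model. This is only a sketch under stated assumptions and is not the patch at the URL above: every name in it (struct cpu_model, model_vmx_enable(), model_resume_hook(), faults_after_resume(), NCPU) is hypothetical, and it assumes what the thread reports, namely that CR4.VMXE still reads as set after resume while the effect of "vmxon" has been lost.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPU 4  /* hypothetical CPU count for this toy model */

/* Toy per-CPU state; these are not the real vmm(4) structures. */
struct cpu_model {
    bool cr4_vmxe;   /* CR4.VMXE as software reads it back */
    bool in_vmx_op;  /* true once "vmxon" has run on this CPU */
};

/* Models what the thread says vmx_enable() does: set CR4.VMXE, then vmxon. */
static void model_vmx_enable(struct cpu_model *c)
{
    c->cr4_vmxe = true;
    c->in_vmx_op = true;
}

/*
 * Models a suspend/resume cycle as observed in the thread: CR4.VMXE still
 * reads as set afterwards, but the CPU is no longer in VMX operation.
 */
static void model_suspend_resume(struct cpu_model *c)
{
    c->in_vmx_op = false;
}

/* Models "vmclear": returns false where the real instruction would fault. */
static bool model_vmclear_ok(const struct cpu_model *c)
{
    return c->cr4_vmxe && c->in_vmx_op;
}

/* Hypothetical per-CPU resume hook: redo vmxon if VMX had been enabled. */
static void model_resume_hook(struct cpu_model *c)
{
    if (c->cr4_vmxe)
        model_vmx_enable(c);
}

/* Returns how many modeled CPUs would fault on vmclear after a resume. */
static int faults_after_resume(bool use_resume_hook)
{
    struct cpu_model cpus[NCPU] = {{false, false}};
    int faults = 0;

    for (int i = 0; i < NCPU; i++) {
        model_vmx_enable(&cpus[i]);
        assert(model_vmclear_ok(&cpus[i]));  /* works before suspend */
        model_suspend_resume(&cpus[i]);
        if (use_resume_hook)
            model_resume_hook(&cpus[i]);     /* runs before any VMX use */
        if (!model_vmclear_ok(&cpus[i]))
            faults++;
    }
    return faults;
}
```

In this model, faults_after_resume(false) counts every CPU as faulting and faults_after_resume(true) counts none. Running the hook on each CPU before it goes back to the interrupted code is what closes the window Neel describes, since no VMX instruction can run between the restart and the hook.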