From: John Baldwin
To: Roger Pau Monné
Cc: "freebsd-virtualization@freebsd.org"
Subject: Re: Panic starting a bhyve guest after resume
Date: Wed, 18 Dec 2013 15:02:27 -0500

On Saturday, December 14, 2013 3:15:45 am Roger Pau Monné wrote:
> On 14/12/13 03:28, Neel Natu wrote:
> > Hi John,
> >
> > On Fri, Dec 13, 2013 at 2:09 PM, John Baldwin wrote:
> >> On Thursday, December 12, 2013 4:00:08 pm Neel Natu wrote:
> >>> Hi John,
> >>>
> >>> On Thu, Dec 12, 2013 at 12:11 PM, John Baldwin wrote:
> >>>> If I suspend and resume my laptop and then try to start a guest
> >>>> after the resume, I get an odd panic.  It generates a privileged
> >>>> instruction fault (in kernel mode) for 'vmclear'.  I've checked CR4
> >>>> and it claims that VMXE is set.  I don't have any other ideas off
> >>>> the top of my head on what I should be poking at?  It looks like we
> >>>> read a bunch of MSRs in vmx_init(), but we don't write to them, and
> >>>> all vmx_enable() does on each CPU is set VMXE in CR4 from what I
> >>>> can tell.
> >>>>
> >>>
> >>> It also does a "vmxon" on each logical cpu which may also need to be
> >>> done after a resume.
> >>
> >> Ah, yes it does.  That was sufficient both for starting a new guest
> >> after resume and even doing a suspend/resume while a guest was active
> >> (and the guest continued to run fine).  I have a hacky patch for this.
> >> One, it includes both a suspend and resume hook for VMM, though for my
> >> testing I only needed a resume hook to invoke vmxon.  Second, the name
> >> of vmx_resume2() is a total hack (because vmx_resume() was already
> >> taken).  I think for now if I were to commit this, I'd just add the
> >> resume hook and maybe call the Intel method vmx_reset() or
> >> vmx_restore()?
> >>
> >> http://people.freebsd.org/~jhb/patches/bhyve_resume.patch
> >>
> >
> > There seems to be a race after the APs are restarted and before
> > 'vmm_resume_p()' where it would be problematic to execute a VMX
> > instruction.
> >
> > Perhaps we should enable VMX on each cpu before they return to the
> > interrupted code?
>
> Can you use the hook in cpususpend_handler? It's cpu_ops.cpu_resume, and
> gets called on each CPU before returning from the handler.

That is the right place, yes.  However, I'm worried about collisions.  Can
you run nested VMMs under Xen?  That is, can a xenhvm guest start a bhyve
vmm?  If so, then you would need to run both cpu_resume handlers.  Also,
cpu_resume isn't run on the CPU that initiates the suspend.  For now I will
stick with a dedicated vmm_resume_p hook, but we may want to revisit that at
some point.

-- 
John Baldwin