Date:      Wed, 18 Dec 2013 15:02:27 -0500
From:      John Baldwin <jhb@freebsd.org>
To:        Roger Pau Monné <roger.pau@citrix.com>
Cc:        "freebsd-virtualization@freebsd.org" <virtualization@freebsd.org>
Subject:   Re: Panic starting a bhyve guest after resume
Message-ID:  <201312181502.27806.jhb@freebsd.org>
In-Reply-To: <52AC13B1.8060402@citrix.com>
References:  <201312121511.38608.jhb@freebsd.org> <CAFgRE9HWMY_uBEawSSiXgGEiqNHV-gmWeeBoi3qe50YAt48_2w@mail.gmail.com> <52AC13B1.8060402@citrix.com>

On Saturday, December 14, 2013 3:15:45 am Roger Pau Monné wrote:
> On 14/12/13 03:28, Neel Natu wrote:
> > Hi John,
> >
> > On Fri, Dec 13, 2013 at 2:09 PM, John Baldwin <jhb@freebsd.org> wrote:
> >> On Thursday, December 12, 2013 4:00:08 pm Neel Natu wrote:
> >>> Hi John,
> >>>
> >>> On Thu, Dec 12, 2013 at 12:11 PM, John Baldwin <jhb@freebsd.org> wrote:
> >>>> If I suspend and resume my laptop and then try to start a guest after the
> >>>> resume, I get an odd panic.  It generates a privileged instruction fault (in
> >>>> kernel mode) for 'vmclear'.  I've checked CR4 and it claims that VMXE is set.
> >>>> I don't have any other ideas off the top of my head on what I should be poking
> >>>> at?  It looks like we read a bunch of MSRs in vmx_init(), but we don't write
> >>>> to them, and all vmx_enable() does on each CPU is set VMXE in CR4 from what I
> >>>> can tell.
> >>>>
> >>>
> >>> It also does a "vmxon" on each logical cpu which may also need to be
> >>> done after a resume.
> >>
> >> Ah, yes it does.  That was sufficient both for starting a new guest after
> >> resume and even doing a suspend/resume while a guest was active (and the
> >> guest continued to run fine).  I have a hacky patch for this.  One, it
> >> includes both a suspend and resume hook for VMM, though for my testing I only
> >> needed a resume hook to invoke vmxon.  Second, the name of vmx_resume2()
> >> is a total hack (because vmx_resume() was already taken).  I think for now
> >> if I were to commit this, I'd just add the resume hook and maybe call the
> >> Intel method vmx_reset() or vmx_restore()?
> >>
> >> http://people.freebsd.org/~jhb/patches/bhyve_resume.patch
> >>
> >
> > There seems to be a race after the APs are restarted and before
> > 'vmm_resume_p()' where it would be problematic to execute a VMX
> > instruction.
> >
> > Perhaps we should enable VMX on each cpu before they return to the
> > interrupted code?
>
> Can you use the hook in cpususpend_handler? It's cpu_ops.cpu_resume, and
> gets called on each CPU before returning from the handler.

That is the right place, yes.  However, I'm worried about collisions.  Can you
run nested VMMs under Xen?  That is, can a xenhvm guest start a bhyve vmm?
If so, then you would need to run both cpu_resume handlers.  Also, cpu_resume
isn't run on the CPU that initiates the suspend.  For now I will stick with a
dedicated vmm_resume_p hook, but we may want to revisit that at some point.
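
To illustrate the last point, here is a minimal userspace sketch.  It is
not the actual patch and touches no real hardware.  Everything named
model_*, struct cpu_model, and NCPU is hypothetical.  It assumes that a
suspend drops VMX operation on every CPU while CR4.VMXE still reads as
set afterwards, which matches the panic described in this thread.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPU 4	/* hypothetical CPU count for the model */

/* Hypothetical per-CPU model; not a real kernel structure. */
struct cpu_model {
	bool cr4_vmxe;	/* CR4.VMXE as seen by software */
	bool vmx_on;	/* true once "vmxon" has run since power-up */
};

struct cpu_model cpus[NCPU];

/* Models vmx_enable(): set CR4.VMXE, then execute vmxon. */
void
model_vmx_enable(int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	cpus[cpu].cr4_vmxe = true;
	cpus[cpu].vmx_on = true;
}

/* Models initial module load: enable VMX on every CPU. */
void
model_boot(void)
{
	for (int i = 0; i < NCPU; i++)
		model_vmx_enable(i);
}

/*
 * Assumption: a suspend/resume cycle loses VMX operation on every CPU,
 * but CR4.VMXE is left looking set.
 */
void
model_suspend_resume(void)
{
	for (int i = 0; i < NCPU; i++)
		cpus[i].vmx_on = false;
}

/* Models a cpu_resume-style hook: skipped on the initiating CPU. */
void
model_cpu_resume_hook(int initiator)
{
	for (int i = 0; i < NCPU; i++)
		if (i != initiator)
			model_vmx_enable(i);
}

/* Models a dedicated resume hook that covers every CPU. */
void
model_vmm_resume(void)
{
	for (int i = 0; i < NCPU; i++)
		model_vmx_enable(i);
}

/* Number of CPUs on which a VMX instruction would currently be legal. */
int
count_vmx_ready(void)
{
	int n = 0;

	for (int i = 0; i < NCPU; i++)
		if (cpus[i].cr4_vmxe && cpus[i].vmx_on)
			n++;
	return (n);
}
```

In this model, calling model_cpu_resume_hook(0) after
model_suspend_resume() leaves CPU 0 without vmxon, while
model_vmm_resume() restores all CPUs.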

-- 
John Baldwin


