Date:      Sat, 30 Dec 2017 00:22:21 +0200
From:      Andriy Gapon <avg@FreeBSD.org>
To:        freebsd-virtualization@freebsd.org
Subject:   bhyve/amd: interrupt delivered when it shouldn't be?
Message-ID:  <42c22179-ae42-e4bb-e77d-a1d49fe634ed@FreeBSD.org>


First, about the setup.  It's a FreeBSD/amd64 head guest on a FreeBSD/amd64 head
host.  The hardware is AMD.  The hypervisor is bhyve.

Under a specific load that involves a lot of page faults and IPIs, I
see the guest system getting stuck.  This is pretty consistent.  Typically I
find a thread spinning on smp_ipi_mtx, and the owner of the mutex appears to be
in mi_switch() -> sched_switch().  The debugging data that I have is somewhat
flaky, but it seems that the owner is typically in this code path:

smp_targeted_tlb_shootdown -> ipi_send_cpu -> native_lapic_ipi_raw

smp_targeted_tlb_shootdown holds smp_ipi_mtx.
native_lapic_ipi_raw, in this setup, performs the following manipulations:

saveintr = intr_disable();
...
intr_restore(saveintr);

The interrupts are already disabled when this function is entered, because
smp_ipi_mtx is a spinlock and our spinlock implementation disables interrupts.
So, intr_restore() in this case should be a NOP (BTW, it's implemented via popf).
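
To make the expectation above concrete, here is a minimal userland sketch of
the save/restore pattern.  It is a model, not kernel code: model_rflags and the
model_* functions are hypothetical stand-ins for the CPU flags register and for
intr_disable() / intr_restore(), and the only assumption about real hardware is
that the interrupt-enable flag is bit 9 (0x200) of rFLAGS.

```c
#include <stdbool.h>
#include <stdint.h>

#define PSL_I 0x200u  /* interrupt-enable flag (IF) bit in rFLAGS */

/* Userland stand-in for the rFLAGS register; not real CPU state. */
static uint64_t model_rflags = PSL_I;

/* Models intr_disable(): return the old flags, then clear IF. */
static uint64_t model_intr_disable(void)
{
    uint64_t saved = model_rflags;

    model_rflags &= ~(uint64_t)PSL_I;
    return saved;
}

/* Models intr_restore(): reload the saved flags, as popf would. */
static void model_intr_restore(uint64_t saved)
{
    model_rflags = saved;
}

/* Reports whether IF is currently set in the model. */
static bool model_intr_enabled(void)
{
    return (model_rflags & PSL_I) != 0;
}

/*
 * Nest two disable/restore pairs, the way native_lapic_ipi_raw runs
 * inside a spinlock section.  Returns true if interrupts are still
 * disabled right after the inner restore.
 */
static bool nested_pair_keeps_interrupts_off(void)
{
    uint64_t outer = model_intr_disable();  /* spinlock entry */
    uint64_t inner = model_intr_disable();  /* inner save: IF already 0 */
    bool still_off;

    model_intr_restore(inner);              /* restores IF == 0 */
    still_off = !model_intr_enabled();
    model_intr_restore(outer);              /* spinlock exit */
    return still_off;
}
```

In this model the inner restore reloads a value that already has IF clear, so
it cannot re-enable interrupts; only the outer restore does.  That is the
behaviour I expect from the real code as well.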

But what I see suggests that at this point a Local APIC timer interrupt gets
delivered to the thread.  And that causes all the mess as the thread holding the
spinlock gets preempted.

Does this ring a bell for anyone?
Is there any suspect code?

It seems that we set the v_intr_masking bit, so rFLAGS / eFLAGS should be
completely virtualized.  So, maybe a hardware issue?

Thank you!
-- 
Andriy Gapon


