Date: Fri, 06 Jan 2023 23:56:44 +0000
From: bugzilla-noreply@freebsd.org
To: virtualization@FreeBSD.org
Subject: [Bug 268794] Simultaneous vcpu_lock_all() and vm_handle_rendezvous() can deadlock vmm
Message-ID: <bug-268794-27103@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=268794

            Bug ID: 268794
           Summary: Simultaneous vcpu_lock_all() and vm_handle_rendezvous() can deadlock vmm
           Product: Base System
           Version: 13.1-STABLE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: bhyve
          Assignee: virtualization@FreeBSD.org
          Reporter: crowston@protonmail.com

Guest is Windows 11 22H2. This only happens with a PCI device passed through,
only very early in the boot, and only if there is more than one vCPU. It does
not happen on every boot, maybe 90% of them. It happens even on the installer
image. I am running on an AMD Ryzen 1700.

This does not happen with Windows 10 nor Windows Server 2022, which suggests
to me that a recent change to the NT kernel might have exposed it.

Action:

1. Windows writes to the APIC on vCPU x.
1a. That vCPU exits, and its state toggles to VCPU_FROZEN.
1b. That vCPU goes into vm_handle_inst_emul() -> emulate_mov() ->
    vioapic_mmio_write() -> vioapic_write() -> vm_smp_handle_rendezvous().
1c. vm_handle_rendezvous() waits for all vCPU threads to handle the
    rendezvous.

2. Simultaneously, from userland's pci_passthru.c, either
   vm_map_pptdev_mmio() or vm_unmap_pptdev_mmio() is called.
2a. vmmdev_ioctl() invokes vcpu_lock_all().
2b. vcpu_lock_all() iterates through the vCPUs, calling vcpu_lock_one() on
    each vCPU, eventually reaching vCPU x (the APIC one).
2c. vCPU x is already in the VCPU_FROZEN state, from (1a).
    vcpu_set_state_locked() hangs waiting for it to transition to the
    VCPU_IDLE state.

3. All the other vCPUs eventually end up either in vm_handle_rendezvous() or
   in vcpu_set_state_locked(), and hang there.

It's not clear to me what the fix should be. Should we check for and run the
rendezvous function while waiting for the VCPU_IDLE transition in
vcpu_set_state_locked()? That will presumably require a strong contract on
the kind of rendezvous functions that can be invoked.
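
The hang in steps 1 to 3 can be pictured as a cycle in a wait-for graph. The sketch below is only an illustrative user-space model, not vmm code: the thread names ("vcpu0", "vcpu1", "ioctl"), the helper find_cycle(), and the two example graphs are all hypothetical. It also assumes, as an inference from step 3 and not something shown in a trace, that vCPU 0 was frozen by vcpu_lock_all() before the loop reached vCPU x (here vCPU 1), so its thread sits in vcpu_set_state_locked() and cannot service the rendezvous.

```python
from typing import Dict, List, Optional


def find_cycle(waits_on: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle of the wait-for graph as a node list, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {node: WHITE for node in waits_on}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        colour[node] = GREY
        path.append(node)
        for nxt in waits_on.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: a wait cycle exists.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(waits_on):
        if colour[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical two-vCPU model of the report (vCPU x is "vcpu1").
deadlocked = {
    # 1c: vCPU 1 waits in vm_handle_rendezvous() for vCPU 0 to take part.
    "vcpu1": ["vcpu0"],
    # 2c: the ioctl thread waits in vcpu_set_state_locked() for vCPU 1
    # to leave VCPU_FROZEN and become VCPU_IDLE.
    "ioctl": ["vcpu1"],
    # Assumption: vCPU 0 was frozen by vcpu_lock_all() and its thread is
    # blocked until the ioctl thread releases it.
    "vcpu0": ["ioctl"],
}

# Same rendezvous without the concurrent ioctl: vCPU 0 is free to respond.
no_ioctl = {
    "vcpu1": ["vcpu0"],
    "vcpu0": [],
}

if __name__ == "__main__":
    print("with ioctl:   ", find_cycle(deadlocked))
    print("without ioctl:", find_cycle(no_ioctl))
```

In this model, removing any one edge breaks the cycle. The idea floated above, running the rendezvous function while waiting in vcpu_set_state_locked(), would correspond to removing the "vcpu1 waits on vcpu0" edge.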
--
You are receiving this mail because:
You are the assignee for the bug.