Date: Mon, 8 Dec 2025 16:11:22 -0500
From: Joseph Ward <jbwlists@hilltopgroup.com>
To: freebsd-virtualization@freebsd.org
Subject: Host system crash - bhyve pci passthrough
Message-ID: <783f03ff-623e-4ecc-9e37-167fc2f19826@hilltopgroup.com>
I'm running FreeBSD 14.3-RELEASE-p6 on a Supermicro H11SSL motherboard
with an EPYC 7551 32-core CPU.
This system runs 2 guests under bhyve, both with PCI passthrough:
1. A FreeBSD system running off of a physical disk
(disk0_name="/dev/ada1", disk0_dev="custom"), with 2 LSI HBAs passed
through to provide large amounts of storage.
2. A Linux system running off a zvol on the host with a Google Coral
Edge TPU passed through.
VM #2 streams approximately 10 MiB/s across the virtual network to VM #1
for storage over NFS.
With the default FreeBSD settings, the host locks up within a couple of
minutes after VM #2 boots (and sometimes during the Linux boot phase).
An error that appears on the console is:
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 23943, size: 28672
(the blkno and size vary, of course)
When this happens, the host is completely unresponsive, and the Linux
VM is locked up as well. Sometimes I can bring the host back by shutting
down VM #1, which usually remains responsive for a while, but eventually
it freezes too.
Without PCI passthrough, there is no crash.
I've tried many things, but the one setting that does seem to delay the
freeze (for up to several days) is
vfs.zfs.vdev.max_active=600 in /boot/loader.conf.
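For reference, a sketch of how that entry would typically appear in /boot/loader.conf. Only the tunable name and value come from the description above; the comment and the quoting style are illustrative assumptions:

```
# /boot/loader.conf
# Raises the ZFS per-vdev cap on concurrently active I/Os.
# Delays the freeze here, but does not prevent it.
vfs.zfs.vdev.max_active="600"
```

After a reboot, running `sysctl vfs.zfs.vdev.max_active` on the host should report 600 if the tunable was picked up.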
Memory usage remains low before a lockup, a tiny fraction of swap is
used, iostat doesn't show unusual volume, and there's plenty of idle CPU.
I'd love to be able to identify what's actually happening so that I
could either address it via config changes or file a defect, but I'm
unable to find any metrics that are increasing, or any other way to
trace the issue.
Does anyone have an idea about what's going on, or know of
relevant metrics/traces that would help in identifying the issue?
Thanks in advance,
Joseph
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?783f03ff-623e-4ecc-9e37-167fc2f19826>
