Date: Tue, 26 Aug 2025 23:51:22 -0400
From: Petru Garstea <peter.garshtja@ambient-md.com>
To: virtualization@freebsd.org
Subject: GPU Passthrough on FreeBSD 14.3(AMD Radeon RX 6700 XT and Debian Linux 12.11)
Message-ID: <43c96438-6068-487d-b1ea-583dddf0f6e8@ambient-md.com>
Greetings,
I’m running a *Debian Linux 12.11 VM on FreeBSD 14.3* using *bhyve*.
Inside the VM, I’ve deployed the *Docker engine* with *Ollama configured
for ROCm support*.
However, when I run an LLM, the *GPU fails to initialize correctly* and
the process fails.
Note that the same setup works fine on bare metal.
The full log of this behavior is included below.
---
> kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000000000).
> kernel: [drm] PSP is resuming...
> kernel: [drm] reserve 0xa00000 from 0x82fd000000 for PSP TMR
> kernel: amdgpu 0000:00:01.0: amdgpu: RAS: optional ras ta ucode is not
> available
> kernel: amdgpu 0000:00:01.0: amdgpu: SECUREDISPLAY: securedisplay ta
> ucode is not available
> kernel: amdgpu 0000:00:01.0: amdgpu: SMU is resuming...
> kernel: amdgpu 0000:00:01.0: amdgpu: smu driver if version =
> 0x0000000e, smu fw if version = 0x00000012, smu fw program = 0,
> version = 0x00413900 (65.57.0)
> kernel: amdgpu 0000:00:01.0: amdgpu: SMU driver if version not matched
> kernel: amdgpu 0000:00:01.0: amdgpu: use vbios provided pptable
> kernel: amdgpu 0000:00:01.0: amdgpu: SMU is resumed successfully!
> kernel: [drm] DMUB hardware initialized: version=0x02020017
> kernel: [drm] kiq ring mec 2 pipe 1 q 0
> kernel: [drm] VCN decode and encode initialized successfully(under DPG
> Mode).
> kernel: [drm] JPEG decode initialized successfully.
> kernel: amdgpu 0000:00:01.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0
> on hub 0
> kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1
> on hub 0
> kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4
> on hub 0
> kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5
> on hub 0
> kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6
> on hub 0
> kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7
> on hub 0
> kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8
> on hub 0
> kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9
> on hub 0
> kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.3.1 uses VM inv eng
> 10 on hub 0
> kernel: amdgpu 0000:00:01.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11
> on hub 0
> kernel: amdgpu 0000:00:01.0: amdgpu: ring sdma0 uses VM inv eng 12 on
> hub 0
> kernel: amdgpu 0000:00:01.0: amdgpu: ring sdma1 uses VM inv eng 13 on
> hub 0
> kernel: amdgpu 0000:00:01.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0
> on hub 1
> kernel: amdgpu 0000:00:01.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng
> 1 on hub 1
> kernel: amdgpu 0000:00:01.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng
> 4 on hub 1
> kernel: amdgpu 0000:00:01.0: amdgpu: ring jpeg_dec uses VM inv eng 5
> on hub 1
> kernel: amdgpu 0000:00:01.0: [drm] Cannot find any crtc or sizes
> kernel: amdgpu: qcm fence wait loop timeout expired
> kernel: amdgpu: The cp might be in an unrecoverable state due to an
> unsuccessful queues preemption
> kernel: amdgpu: Pasid 0x8002 DQM create queue type 0 failed. ret -62
> kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset begin!
> kernel: amdgpu: Failed to suspend process 0x8002
> kernel: amdgpu: Failed to suspend process 0x8001
> kernel: amdgpu 0000:00:01.0: amdgpu: free PSP TMR buffer
> kernel: amdgpu 0000:00:01.0: amdgpu: MODE1 reset
> kernel: amdgpu 0000:00:01.0: amdgpu: GPU mode1 reset
> kernel: amdgpu 0000:00:01.0: amdgpu: GPU smu mode1 reset
> kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset succeeded, trying to resume
> kernel: clocksource: Long readout interval, skipping watchdog check:
> cs_nsec: 12622536057 wd_nsec: 12613480925
> kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000000000).
> kernel: [drm] VRAM is lost due to GPU reset!
> kernel: [drm] PSP is resuming...
> kernel: [drm:psp_hw_start [amdgpu]] *ERROR* PSP create ring failed!
> kernel: [drm:psp_resume [amdgpu]] *ERROR* PSP resume failed
> kernel: [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP
> block <psp> failed -62
> kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset(1) failed
> kernel: amdgpu: qcm fence wait loop timeout expired
> kernel: amdgpu: The cp might be in an unrecoverable state due to an
> unsuccessful queues preemption
> kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset end with ret = -62
> kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset begin!
> kernel: amdgpu 0000:00:01.0: amdgpu: Failed to disallow df cstate
Regards,
Petru
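
For readers triaging the log above, the recurring `-62` is Linux errno 62, ETIME ("Timer expired"), so the queue creation, the PSP resume, and the GPU reset are all reporting timeouts. The following is only a minimal sketch for pulling the failure lines and return codes out of a quoted dmesg excerpt; the helper name `summarize`, the marker list, and the one-entry errno table are assumptions made for illustration and are not part of the original report or of any amdgpu tooling.

```python
import re

# Linux errno value (asm-generic/errno.h). Only the code seen in this log
# is listed; this table is an illustrative assumption, not a full mapping.
LINUX_ERRNO = {62: "ETIME (Timer expired)"}

# Substrings copied from the log that indicate a failure.
FAILURE_MARKERS = ("*ERROR*", "failed", "Failed", "timeout expired", "VRAM is lost")

# Matches "ret -62", "ret = -62" and "failed -62" as they appear in the log.
CODE_RE = re.compile(r"(?:ret(?: =)?|failed) -(\d+)")


def summarize(log_text):
    """Return (failure_lines, decoded_codes) for a '> '-quoted dmesg excerpt."""
    failures = []
    codes = set()
    for raw in log_text.splitlines():
        line = raw.lstrip("> ").strip()  # drop the e-mail quote prefix
        found = [int(m.group(1)) for m in CODE_RE.finditer(line)]
        if found or any(marker in line for marker in FAILURE_MARKERS):
            failures.append(line)
        codes.update(found)
    decoded = {code: LINUX_ERRNO.get(code, "unknown") for code in sorted(codes)}
    return failures, decoded


if __name__ == "__main__":
    sample = """\
> kernel: amdgpu: Pasid 0x8002 DQM create queue type 0 failed. ret -62
> kernel: [drm] VRAM is lost due to GPU reset!
> kernel: [drm:psp_resume [amdgpu]] *ERROR* PSP resume failed
> kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset end with ret = -62
> kernel: amdgpu 0000:00:01.0: amdgpu: SMU is resumed successfully!
"""
    failures, decoded = summarize(sample)
    for line in failures:
        print(line)
    print(decoded)
```

The sample lines are copied from the log; saving the full excerpt to a file and passing its contents to `summarize` would work the same way.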
