From nobody Wed Aug 27 05:18:44 2025
From: Stephan Althaus
To: virtualization@freebsd.org
Subject: Re: GPU Passthrough on FreeBSD 14.3(AMD Radeon RX 6700 XT and Debian Linux 12.11)
Date: Wed, 27 Aug 2025 07:18:44 +0200
Message-ID: <1117706a-6680-4f00-8728-16ae195f02ca@Duedinghausen.eu>
In-Reply-To: <43c96438-6068-487d-b1ea-583dddf0f6e8@ambient-md.com>
References: <43c96438-6068-487d-b1ea-583dddf0f6e8@ambient-md.com>
List-Archive: https://lists.freebsd.org/archives/freebsd-virtualization
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="------------2GAnctWoAxrslocHuDbOtxZO"

This is a multi-part message in MIME format.
--------------2GAnctWoAxrslocHuDbOtxZO
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: 8bit

On 8/27/25 05:51, Petru Garstea wrote:

Greetings,

I’m running a Debian Linux 12.11 VM on FreeBSD 14.3 using bhyve.
Inside the VM, I’ve deployed the Docker engine with Ollama configured for ROCm support.

However, when executing an LLM, the GPU fails to initialize correctly, causing the process to fail.
Please note that this setup works fine on bare metal.

The full log of this behavior is included below.

---

kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000000000).
kernel: [drm] PSP is resuming...
kernel: [drm] reserve 0xa00000 from 0x82fd000000 for PSP TMR
kernel: amdgpu 0000:00:01.0: amdgpu: RAS: optional ras ta ucode is not available
kernel: amdgpu 0000:00:01.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
kernel: amdgpu 0000:00:01.0: amdgpu: SMU is resuming...
kernel: amdgpu 0000:00:01.0: amdgpu: smu driver if version = 0x0000000e, smu fw if version = 0x00000012, smu fw program = 0, version = 0x00413900 (65.57.0)
kernel: amdgpu 0000:00:01.0: amdgpu: SMU driver if version not matched
kernel: amdgpu 0000:00:01.0: amdgpu: use vbios provided pptable
kernel: amdgpu 0000:00:01.0: amdgpu: SMU is resumed successfully!
kernel: [drm] DMUB hardware initialized: version=0x02020017
kernel: [drm] kiq ring mec 2 pipe 1 q 0
kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode).
kernel: [drm] JPEG decode initialized successfully.
kernel: amdgpu 0000:00:01.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
kernel: amdgpu 0000:00:01.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
kernel: amdgpu 0000:00:01.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
kernel: amdgpu 0000:00:01.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
kernel: amdgpu 0000:00:01.0: [drm] Cannot find any crtc or sizes
kernel: amdgpu: qcm fence wait loop timeout expired
kernel: amdgpu: The cp might be in an unrecoverable state due to an unsuccessful queues preemption
kernel: amdgpu: Pasid 0x8002 DQM create queue type 0 failed. ret -62
kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset begin!
kernel: amdgpu: Failed to suspend process 0x8002
kernel: amdgpu: Failed to suspend process 0x8001
kernel: amdgpu 0000:00:01.0: amdgpu: free PSP TMR buffer
kernel: amdgpu 0000:00:01.0: amdgpu: MODE1 reset
kernel: amdgpu 0000:00:01.0: amdgpu: GPU mode1 reset
kernel: amdgpu 0000:00:01.0: amdgpu: GPU smu mode1 reset
kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset succeeded, trying to resume
kernel: clocksource: Long readout interval, skipping watchdog check: cs_nsec: 12622536057 wd_nsec: 12613480925
kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000000000).
kernel: [drm] VRAM is lost due to GPU reset!
kernel: [drm] PSP is resuming...
kernel: [drm:psp_hw_start [amdgpu]] *ERROR* PSP create ring failed!
kernel: [drm:psp_resume [amdgpu]] *ERROR* PSP resume failed
kernel: [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP block <psp> failed -62
kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset(1) failed
kernel: amdgpu: qcm fence wait loop timeout expired
kernel: amdgpu: The cp might be in an unrecoverable state due to an unsuccessful queues preemption
kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset end with ret = -62
kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset begin!
kernel: amdgpu 0000:00:01.0: amdgpu: Failed to disallow df cstate

Regards,
Petru

Hello!

Before you start Docker, are you able to verify that the GPU is actually working in the VM?

How did you verify it? (I don't know the tooling for AMD.)

Regards,
Stephan


--------------2GAnctWoAxrslocHuDbOtxZO--