From: Petru Garstea
Date: Tue, 26 Aug 2025 23:51:22 -0400
Subject: GPU Passthrough on FreeBSD 14.3(AMD Radeon RX 6700 XT and Debian Linux 12.11)
To: virtualization@freebsd.org
List-Archive: https://lists.freebsd.org/archives/freebsd-virtualization
Message-ID: <43c96438-6068-487d-b1ea-583dddf0f6e8@ambient-md.com>

Greetings,

I’m running a Debian Linux 12.11 VM on FreeBSD 14.3 using bhyve.
Inside the VM, I’ve deployed the Docker engine with Ollama configured for ROCm support.

However, when I run an LLM, the GPU fails to initialize correctly and the process fails.
Note that the same setup works fine on bare metal.

The full log of this behavior is included below.

---

kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000000000).
kernel: [drm] PSP is resuming...
kernel: [drm] reserve 0xa00000 from 0x82fd000000 for PSP TMR
kernel: amdgpu 0000:00:01.0: amdgpu: RAS: optional ras ta ucode is not available
kernel: amdgpu 0000:00:01.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
kernel: amdgpu 0000:00:01.0: amdgpu: SMU is resuming...
kernel: amdgpu 0000:00:01.0: amdgpu: smu driver if version = 0x0000000e, smu fw if version = 0x00000012, smu fw program = 0, version = 0x00413900 (65.57.0)
kernel: amdgpu 0000:00:01.0: amdgpu: SMU driver if version not matched
kernel: amdgpu 0000:00:01.0: amdgpu: use vbios provided pptable
kernel: amdgpu 0000:00:01.0: amdgpu: SMU is resumed successfully!
kernel: [drm] DMUB hardware initialized: version=0x02020017
kernel: [drm] kiq ring mec 2 pipe 1 q 0
kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode).
kernel: [drm] JPEG decode initialized successfully.
kernel: amdgpu 0000:00:01.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
kernel: amdgpu 0000:00:01.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
kernel: amdgpu 0000:00:01.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
kernel: amdgpu 0000:00:01.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
kernel: amdgpu 0000:00:01.0: [drm] Cannot find any crtc or sizes
kernel: amdgpu: qcm fence wait loop timeout expired
kernel: amdgpu: The cp might be in an unrecoverable state due to an unsuccessful queues preemption
kernel: amdgpu: Pasid 0x8002 DQM create queue type 0 failed. ret -62
kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset begin!
kernel: amdgpu: Failed to suspend process 0x8002
kernel: amdgpu: Failed to suspend process 0x8001
kernel: amdgpu 0000:00:01.0: amdgpu: free PSP TMR buffer
kernel: amdgpu 0000:00:01.0: amdgpu: MODE1 reset
kernel: amdgpu 0000:00:01.0: amdgpu: GPU mode1 reset
kernel: amdgpu 0000:00:01.0: amdgpu: GPU smu mode1 reset
kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset succeeded, trying to resume
kernel: clocksource: Long readout interval, skipping watchdog check: cs_nsec: 12622536057 wd_nsec: 12613480925
kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000000000).
kernel: [drm] VRAM is lost due to GPU reset!
kernel: [drm] PSP is resuming...
kernel: [drm:psp_hw_start [amdgpu]] *ERROR* PSP create ring failed!
kernel: [drm:psp_resume [amdgpu]] *ERROR* PSP resume failed
kernel: [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP block <psp> failed -62
kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset(1) failed
kernel: amdgpu: qcm fence wait loop timeout expired
kernel: amdgpu: The cp might be in an unrecoverable state due to an unsuccessful queues preemption
kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset end with ret = -62
kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset begin!
kernel: amdgpu 0000:00:01.0: amdgpu: Failed to disallow df cstate
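
As a reading aid for the log above, here is a minimal Python sketch (not part of the original setup) that pulls out the lines that look like failures and decodes the numeric return code. It assumes Linux errno numbering, since the guest is Debian; the names `failure_lines`, `LINUX_ERRNO` and `SAMPLE_LOG` are hypothetical, and the sample lines are copied from the log rather than read from a live system.

```python
import re

# Assumption: Linux errno numbering (the guest is Debian); 62 is ETIME there.
LINUX_ERRNO = {62: "ETIME"}

# A few lines copied from the log above, plus one success line as a control.
SAMPLE_LOG = """\
kernel: amdgpu: qcm fence wait loop timeout expired
kernel: amdgpu: Pasid 0x8002 DQM create queue type 0 failed. ret -62
kernel: [drm:psp_hw_start [amdgpu]] *ERROR* PSP create ring failed!
kernel: [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP block <psp> failed -62
kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset end with ret = -62
kernel: amdgpu 0000:00:01.0: amdgpu: SMU is resumed successfully!
"""

# Words that mark a line as a failure, and the shapes a return code takes
# in this log: "ret -62", "ret = -62", "failed -62".
FAILURE = re.compile(r"\*ERROR\*|failed|timeout", re.IGNORECASE)
RET_CODE = re.compile(r"\b(?:ret(?: =)?|failed) -(\d+)\b")


def failure_lines(log_text):
    """Return (line, errno_name_or_None) for each line that looks like a failure."""
    results = []
    for line in log_text.splitlines():
        match = RET_CODE.search(line)
        if not (FAILURE.search(line) or match):
            continue  # neither a failure keyword nor a negative return code
        name = None
        if match:
            code = int(match.group(1))
            # Fall back to the raw number for codes not in the table.
            name = LINUX_ERRNO.get(code, f"errno {code}")
        results.append((line, name))
    return results


if __name__ == "__main__":
    for line, name in failure_lines(SAMPLE_LOG):
        print(line + (f"   <- {name}" if name else ""))
```

On Linux, errno 62 is ETIME ("Timer expired"), so every coded failure in this log is a timeout, which matches the "qcm fence wait loop timeout expired" messages.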

Regards,
Petru
