Message-ID: <5a46e354-f38a-4c2d-9d20-ef5d76e3f7be@vkf-renzel.de>
Date: Thu, 9 Oct 2025 13:54:12 +0200
List-Id: User questions <freebsd-questions.freebsd.org>
List-Archive: https://lists.freebsd.org/archives/freebsd-questions
List-Help: <mailto:questions+help@freebsd.org>
List-Post: <mailto:questions@freebsd.org>
List-Subscribe: <mailto:questions+subscribe@freebsd.org>
List-Unsubscribe: <mailto:questions+unsubscribe@freebsd.org>
X-BeenThere: freebsd-questions@freebsd.org
Sender: owner-freebsd-questions@FreeBSD.org
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Content-Language: en-US
To: freebsd-questions@freebsd.org
From: Nils Beyer <nbe@vkf-renzel.de>
Subject: AMD GPU locks up using "koboldcpp" or "llama.cpp"...
Organization: VKF Renzel GmbH
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

Hi,

I have opened a bug report here:

	https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=289813

To get a few more pointers, I'd like to ask whether any of you can successfully
run inference with "koboldcpp" or "llama.cpp" on an AMD GPU without
lock-ups.


To try it quickly, you can check out, build, and benchmark as follows:


as root:
--------
pkg install gmake vulkan-loader opencl mesa-devel python

(Attention: this installs 'mesa-devel' and remaps your current libGL and related
libraries. After testing, I suggest removing 'mesa-devel' again, as it gave me
problems under Plasma6.)


as user:
--------
vulkaninfo

(looks good?)

clinfo

(looks good, too?)
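
Before going further, it may help to confirm that the commands used in the steps below are actually on your PATH. Here is a minimal sketch for a POSIX sh; the helper name `check_tools` is hypothetical and not part of any of the tools mentioned here:

```shell
#!/bin/sh
# check_tools CMD...: report every command that is not found on PATH.
# Returns 0 if all commands are present, 1 if at least one is missing.
# (Hypothetical helper for illustration only.)
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For the steps in this mail you could call it as `check_tools git gmake cmake python vulkaninfo clinfo`. It only checks that the commands exist, not that the GPU driver works.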


mkdir -p ~/work/src
cd ~/work/src
fetch -o MN-12B-Mag-Mell-R1.IQ4_XS.gguf 'https://huggingface.co/mradermacher/MN-12B-Mag-Mell-R1-GGUF/resolve/main/MN-12B-Mag-Mell-R1.IQ4_XS.gguf?download=true'


# koboldCpp
cd ~/work/src
git clone --depth 1 https://github.com/LostRuins/koboldcpp
cd koboldcpp
gmake -j16 LLAMA_CLBLAST=1 LLAMA_OPENBLAS=1 LLAMA_VULKAN=1 LDFLAGS="-L/usr/local/lib"

python koboldcpp.py --usevulkan --gpulayers 999 --benchmark --model ../MN-12B-Mag-Mell-R1.IQ4_XS.gguf

(Run it a few times; your GPU may eventually lock up.)


# llama.cpp
cd ~/work/src
git clone --depth 1 https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B .build -DGGML_VULKAN=1 -DGGML_OPENCL=1
cmake --build .build --parallel 16

.build/bin/llama-bench -m ../MN-12B-Mag-Mell-R1.IQ4_XS.gguf -ngl 100 -fa 0,1

(Run it a few times; your GPU may eventually lock up.)
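
Re-running either benchmark by hand can be scripted. Here is a minimal sketch for a POSIX sh; the helper name `repeat_run` is hypothetical. It assumes a failing run exits with a non-zero status, whereas a real GPU lock-up may hang the process or the whole machine, in which case the loop never returns:

```shell
#!/bin/sh
# repeat_run N CMD [ARGS...]: run CMD up to N times in a row.
# Stops at the first run that exits non-zero and returns that status.
# (Hypothetical helper for illustration only.)
repeat_run() {
    count=$1
    shift
    i=1
    while [ "$i" -le "$count" ]; do
        echo "=== run $i of $count ===" >&2
        "$@" || {
            status=$?
            echo "run $i exited with status $status" >&2
            return "$status"
        }
        i=$((i + 1))
    done
    return 0
}
```

For example, from the llama.cpp directory: `repeat_run 10 .build/bin/llama-bench -m ../MN-12B-Mag-Mell-R1.IQ4_XS.gguf -ngl 100 -fa 0,1`. The same helper works for the koboldcpp benchmark command above.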


Thanks for trying, and for your feedback.


Regards,
Nils