From nobody Fri Nov 28 17:33:50 2025
To: doc-committers@FreeBSD.org, dev-commits-doc-all@FreeBSD.org
From: Olivier Certner
Subject: git: cbd223c45c - main - releases/15.0R/relnotes: Document AMD GPU slowness and VM domainset fixes
List-Id: Commit messages for all branches of the doc repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-doc-all
Sender: owner-dev-commits-doc-all@FreeBSD.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: olce
X-Git-Repository: doc
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: cbd223c45cffffa3e596cfd30d3f3ba8218369a7
Auto-Submitted: auto-generated
Date: Fri, 28 Nov 2025 17:33:50 +0000
Message-Id: <6929dcfe.b53f.620b9cc4@gitrepo.freebsd.org>

The branch main has been updated by olce:

URL:
https://cgit.FreeBSD.org/doc/commit/?id=cbd223c45cffffa3e596cfd30d3f3ba8218369a7

commit cbd223c45cffffa3e596cfd30d3f3ba8218369a7
Author:     Olivier Certner
AuthorDate: 2025-11-28 14:42:30 +0000
Commit:     Olivier Certner
CommitDate: 2025-11-28 15:11:01 +0000

    releases/15.0R/relnotes: Document AMD GPU slowness and VM domainset fixes

    Replace the LinuxKPI paragraph talking only about handling the
    __GFP_NORETRY flag in linux_alloc_pages() with the user-facing reason
    of this work and an overview of other fixes towards this goal. Add a
    separate paragraph about VM domainset iterator fixes, which were
    prompted by the previous but are much larger in scope. One commit is
    common to both paragraphs.
---
 website/content/en/releases/15.0R/relnotes.adoc | 31 +++++++++++++++++++++----
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/website/content/en/releases/15.0R/relnotes.adoc b/website/content/en/releases/15.0R/relnotes.adoc
index 81957aa91e..48a4aefe51 100644
--- a/website/content/en/releases/15.0R/relnotes.adoc
+++ b/website/content/en/releases/15.0R/relnotes.adoc
@@ -680,10 +680,33 @@ Since these sysctls do not trigger any (de-)allocations anymore, their effect is
 gitref:960ee8094913[repository=src].
 (Sponsored by The FreeBSD Foundation).
 
-LinuxKPI: `linux_alloc_pages()` now honors `__GFP_NORETRY`.
-This is to fix slowdowns with drm-kmod that get worse over time as physical memory become more fragmented (and probably also depending on other factors).
-gitref:831e6fb0baf6[repository=src]
-(Sponsored by The FreeBSD Foundation).
+Gradual slowdowns and freezes experienced by owners of some AMD GPUs using the amdgpu DRM driver from the `drm-kmod` ports, starting with v5.15 (`graphics/drm-515-kmod` port), have been fixed.
+In particular, owners of graphics cards with Green Sardine, Polaris 10 and 20 chips were known to be affected.
+Recent Intel-based GPUs (gen 13+) may also have been affected.
+The main cause was that the TTM component of Linux's DRM subsystem frequently requests memory that is physically contiguous although this property is not strictly necessary, and the kernel was trying too hard to fulfill these requests, leading to longer and more frequent freezes as physical memory got more fragmented over time.
+In the LinuxKPI, `linux_alloc_pages()` now honors `__GFP_NORETRY` by not trying to break superpage reservations or defragment memory if the request for contiguous physical memory cannot be fulfilled immediately.
+Another cause was that, during recent LinuxKPI evolution, `kmalloc()` was changed to always return physically contiguous memory as it does in Linux, but unfortunately `kvzalloc()` relied on `kmalloc()` and this was not changed, effectively turning all large memory allocations of zeroed pages into costly physically contiguous ones.
+On allocation success, the TTM component sets page attributes unconditionally, regardless of whether they are already in place, which triggered expensive TLB shootdowns even when not necessary.
+Yet another cause was a flaw in the code iterating over memory domains (NUMA), which led to re-examining the same domain multiple times even if it could not fulfill the contiguous allocation request.
+More details about this are given below.
+Finally, a needless temporary physically contiguous allocation routinely performed in the case of Carrizo, Polaris and Vega M based AMD GPUs was converted to a regular one in the DRM drivers from the latest `drm-*-kmod` ports.
+gitref:718d1928f874[repository=src],
+gitref:4ca9190251bb[repository=src],
+gitref:986edb19a49c[repository=src],
+gitref:9d1f3ce79d85[repository=src],
+gitref:da257e519bc0[repository=src]
+(Sponsored by The FreeBSD Foundation.)
+
+Multiple flaws were fixed in the code iterating over memory domains (NUMA).
+A failing contiguous allocation request would lead to re-examining the same domain multiple times even if it could not fulfill the request, wasting time and increasing allocation latency.
+This would happen up to 4 times for the common case of a single memory domain and the "first touch" policy.
+The first domain selected by all allocation policies, except "first touch" in some cases, would be considered even if it was not in the allowed domains mask or had been marked to be ignored in a previous attempt with the same iterator.
+After a failed first attempt and sleeping, waiting allocations would restart with the policy's first domain even if that one was still in a low memory condition.
+Finally, the "interleave" policy would reset the iterator index when restarting, effectively resetting the initial domain in the round-robin phase that happens after allocation from the first domain failed.
+gitref:da257e519bc0[repository=src],
+gitref:83ad6d8d8eee[repository=src],
+gitref:b15ff7214020[repository=src]
+(Sponsored by The FreeBSD Foundation.)
 
 The local stream (AF_UNIX/SOCK_STREAM) and sequenced packet stream (AF_UNIX/SOCK_SEQPACKET) sockets have been improved for better bulk transfer and round trip times.
 The SOCK_SEQPACKET socket has been brought to the specification and now behaves as a true stream socket, while in previous FreeBSD releases it could exhibit features of