Date: Fri, 28 Nov 2025 17:33:50 +0000
From: Olivier Certner <olce@FreeBSD.org>
To: doc-committers@FreeBSD.org, dev-commits-doc-all@FreeBSD.org
Subject: git: cbd223c45c - main - releases/15.0R/relnotes: Document AMD GPU slowness and VM domainset fixes
Message-ID: <6929dcfe.b53f.620b9cc4@gitrepo.freebsd.org>
The branch main has been updated by olce:

URL: https://cgit.FreeBSD.org/doc/commit/?id=cbd223c45cffffa3e596cfd30d3f3ba8218369a7

commit cbd223c45cffffa3e596cfd30d3f3ba8218369a7
Author:     Olivier Certner <olce@FreeBSD.org>
AuthorDate: 2025-11-28 14:42:30 +0000
Commit:     Olivier Certner <olce@FreeBSD.org>
CommitDate: 2025-11-28 15:11:01 +0000

    releases/15.0R/relnotes: Document AMD GPU slowness and VM domainset fixes

    Replace the LinuxKPI paragraph talking only about handling the
    __GFP_NORETRY flag in linux_alloc_pages() with the user-facing reason
    of this work and an overview of other fixes towards this goal.

    Add a separate paragraph about VM domainset iterator fixes, which were
    prompted by the previous but are much larger in scope.

    One commit is common to both paragraphs.
---
 website/content/en/releases/15.0R/relnotes.adoc | 31 +++++++++++++++++++++----
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/website/content/en/releases/15.0R/relnotes.adoc b/website/content/en/releases/15.0R/relnotes.adoc
index 81957aa91e..48a4aefe51 100644
--- a/website/content/en/releases/15.0R/relnotes.adoc
+++ b/website/content/en/releases/15.0R/relnotes.adoc
@@ -680,10 +680,33 @@ Since these sysctls do not trigger any (de-)allocations anymore, their effect is
 gitref:960ee8094913[repository=src].
 (Sponsored by The FreeBSD Foundation).

-LinuxKPI: `linux_alloc_pages()` now honors `__GFP_NORETRY`.
-This is to fix slowdowns with drm-kmod that get worse over time as physical memory become more fragmented (and probably also depending on other factors).
-gitref:831e6fb0baf6[repository=src]
-(Sponsored by The FreeBSD Foundation).
+Gradual slowdowns and freezes experienced by owners of some AMD GPUs using the amdgpu DRM driver from the `drm-kmod` ports, starting with v5.15 (`graphics/drm-515-kmod` port), have been fixed.
+In particular, owners of graphics cards with Green Sardine, Polaris 10 and 20 chips were known to be affected.
+Recent Intel-based GPUs (gen 13+) may also have been affected.
+The main cause is that the Linux's DRM subsystem's TTM component frequently requests memory that is physically contiguous although this property is not strictly necessary, and the kernel was trying too hard to fulfill them, leading to longer and more frequent freezes as physical memory got more fragmented over time.
+In the LinuxKPI, `linux_alloc_pages()` now honors `__GFP_NORETRY` by not trying to break superpage reservations or defragment memory if the request for contiguous physical memory cannot be fulfilled immediately.
+Another cause was that, during recent LinuxKPI evolution, `kmalloc()` was changed to always return physically contiguous memory as it does in Linux, but unfortunately `kvzalloc()` relied on `kmalloc()` and this was not changed, effectively turning all large memory allocations of zeroed pages into costly physically contiguous ones.
+On allocation success, the TTM component sets page attributes unconditionally, regardless of whether they are already in place, which triggerred expensive TLB shootdowns even when not necessary.
+Yet another cause was a flaw in the code iterating over memory domains (NUMA) leading to re-examining the same domain multiple times even if it could not fulfill the contiguous allocation request.
+More details about this are given below.
+Finally, some useless temporary physically contiguous allocation routinely performed in the case of Carrizo, Polaris and Vega M based AMD GPUs was converted to a regular one in the DRM drivers from the latest `drm-*-kmod` ports.
+gitref:718d1928f874[repository=src],
+gitref:4ca9190251bb[repository=src],
+gitref:986edb19a49c[repository=src],
+gitref:9d1f3ce79d85[repository=src],
+gitref:da257e519bc0[repository=src]
+(Sponsored by The FreeBSD Foundation.)
+
+Multiple flaws were fixed in the code iterating over memory domains (NUMA).
+A failing contiguous allocation request would lead to re-examine the same domain multiple times even if it could not fulfill the request, wasting time and increasing allocation latency.
+This would happen up to 4 times for the common case of a single memory domain and the "first touch" policy.
+The first domain selected by all allocation policies, except "first touch" in some cases, would be considered even if it was not in the allowed domains mask or had been marked as to ignore in a previous attempt with the same iterator.
+After a failed first attempt and sleeping, waiting allocations would restart with the policy's first domain even if that one was still in a low memory condition.
+Finally, the "interleave" policy would reset the iterator index when restarting, effectively resetting the initial domain in the round-robin phase that happens after allocation from the first domain failed.
+gitref:da257e519bc0[repository=src],
+gitref:83ad6d8d8eee[repository=src],
+gitref:b15ff7214020[repository=src]
+(Sponsored by The FreeBSD Foundation.)

 The local stream (AF_UNIX/SOCK_STREAM) and sequenced packet stream (AF_UNIX/SOCK_SEQPACKET) sockets have been improved for better bulk transfer and round trip times.
 The SOCK_SEQPACKET socket has been brought to the specification and now behaves as a true stream socket, while in previous FreeBSD releases it could exhibit features of
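
The domain-iteration behavior described in the second added paragraph can be illustrated with a small toy model. The Python sketch below is not FreeBSD's VM domainset code: the names pick_domains, alloc_contig and can_satisfy are hypothetical, and plain sets stand in for the kernel's domain masks. It only shows the intended property, namely that each allowed domain is tried at most once per request and that a domain which failed is marked to ignore instead of being re-examined.

```python
def pick_domains(first, ndomains, allowed, ignore):
    """Yield candidate domains round-robin starting at `first`.

    Hypothetical model: a domain is yielded only if it is in the
    `allowed` set and not in the `ignore` set, and each domain index
    is considered at most once per call.
    """
    for step in range(ndomains):
        d = (first + step) % ndomains
        if d in allowed and d not in ignore:
            yield d


def alloc_contig(first, ndomains, allowed, can_satisfy):
    """Try each eligible domain once; return (domain or None, attempts).

    `can_satisfy(d)` is a stand-in for "domain d can fulfill the
    contiguous request right now".
    """
    ignore = set()      # domains that already failed for this request
    attempts = []       # order in which domains were actually examined
    for d in pick_domains(first, ndomains, allowed, ignore):
        attempts.append(d)
        if can_satisfy(d):
            return d, attempts
        ignore.add(d)   # never re-examine a domain that just failed
    return None, attempts


if __name__ == "__main__":
    # Single domain that cannot satisfy the request: examined once only.
    print(alloc_contig(0, 1, {0}, lambda d: False))
    # Policy's first domain (0) is outside the allowed set: it is skipped.
    print(alloc_contig(0, 3, {1, 2}, lambda d: d == 2))
```

In this model, the single-domain failing case examines domain 0 exactly once, in contrast with the "up to 4 times" behavior the release note describes for the unfixed code, and a first domain outside the allowed set is never examined.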
