Date: Sun, 19 Oct 2025 17:35:37 GMT
Message-Id: <202510191735.59JHZb3n077509@gitrepo.freebsd.org>
To: doc-committers@FreeBSD.org, dev-commits-doc-all@FreeBSD.org
From: Olivier Certner
Subject: git: bbaab3f271 - main - Status/2025Q3/drm-drivers-slowdowns_fixes.adoc: Add report
List-Id: Commit messages for all branches of the doc repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-doc-all
Sender: owner-dev-commits-doc-all@FreeBSD.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: olce
X-Git-Repository: doc
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: bbaab3f271793f3f6bc8fd66b2f0dc2a65053300
Auto-Submitted: auto-generated

The branch main has been updated by olce:

URL: https://cgit.FreeBSD.org/doc/commit/?id=bbaab3f271793f3f6bc8fd66b2f0dc2a65053300

commit bbaab3f271793f3f6bc8fd66b2f0dc2a65053300
Author:     Olivier Certner
AuthorDate: 2025-10-18 15:00:24 +0000
Commit:     Olivier Certner
CommitDate: 2025-10-19 17:35:25 +0000

    Status/2025Q3/drm-drivers-slowdowns_fixes.adoc: Add report

    Sponsored by: The FreeBSD Foundation
---
 .../drm-drivers-slowdowns_fixes.adoc | 40 ++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/website/content/en/status/report-2025-07-2025-09/drm-drivers-slowdowns_fixes.adoc b/website/content/en/status/report-2025-07-2025-09/drm-drivers-slowdowns_fixes.adoc
new file mode 100644
index 0000000000..42bc045d9e
--- /dev/null
+++ b/website/content/en/status/report-2025-07-2025-09/drm-drivers-slowdowns_fixes.adoc
@@ -0,0 +1,40 @@
+=== DRM Drivers Slowdowns and Freezes Fixes
+
+Links: +
+link:https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277476[Main PR] URL: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277476 +
+link:https://github.com/freebsd/drm-kmod/issues/302[drm-kmod GitHub issue] URL: https://github.com/freebsd/drm-kmod/issues/302
+
+Contact: Olivier Certner
+
+Owners of AMD GPUs using the amdgpu DRM driver from the `drm-kmod` ports, especially starting with v5.15 (`drm-515-kmod`), have been experiencing gradual slowdowns and freezes since at least May 2024.
+Code analysis suggests that recent Intel-based GPUs (gen 13+) may also be affected.
+We are pleased to announce that, to the best of our knowledge, all these problems have been fixed.
+
+We encourage people to test the latest FreeBSD code on branches `main`, `stable/15` or `stable/14`.
+The fixes will be included in the upcoming 15.0 and 14.4 releases.
+Errata notices and patches may be issued for 14.3 so that people do not have to wait until 14.4 (tentatively scheduled for release next March).
+An additional fix will find its way into the `drm-kmod` ports (see below).
+
+Investigations revealed that the crux of all these problems was poor handling of overly frequent, and generally unnecessary, physically contiguous memory allocation requests in fast paths.
+Basically, DRM's TTM component tries to allocate pools of graphics memory pages that are as physically contiguous as possible in order to reduce the number of corresponding TLB entries.
+It does so in a loop that first tries to allocate pages of higher order with the `__GFP_NORETRY` flag, gradually falling back to smaller ones (see `ttm_pool_alloc()`).
+
+The first problem was that our LinuxKPI did not handle Linux's `__GFP_NORETRY` flag and would try hard to fulfill the first requests, i.e., those with the highest-order pages, using expensive mechanisms to obtain or produce contiguous memory if not readily available.
+A first fix by Mathieu (`sigsys` at `gmail` with regular company suffix) removed memory compaction from this process (forgoing calls to `vm_page_reclaim_contig()`).
+This fix was then completed by stopping the VM system from trying to break memory reservations, which are part of a speculative mechanism that tries to automatically bring about the use of superpages.
+
+Another problem came from changes to our LinuxKPI.
+In order to better comply with what Linux does, `kmalloc()` was changed to always return physically contiguous memory.
+Unfortunately, `kvzalloc()`, which relied on `kmalloc()` in our implementation (which was conceptually wrong, but initially harmless in practice), was not switched to rely on `kvmalloc()` in the process, effectively turning large memory allocations of zeroed pages into costly physically contiguous ones.
+
+Some rough profiling of slowdowns was done using `dtrace`.
+It revealed that a fair amount of the execution time of failing allocations came from attempting multiple allocations on the same NUMA domain, and that of succeeding ones came from useless changes to page attributes, triggering expensive TLB shootdowns.
+An analysis of the VM domainset iterators code revealed multiple flaws, in particular leading to re-examining the same domain multiple times (up to 4 times for the common case of machines with a single domain) without any additional guarantee of success for new attempts.
+Other VM domainset problems were fixed in the process; for instance, allocation requests now prefer domains that are not in a low-memory condition in all situations.
+
+Finally, concerning specifically the amdgpu driver and affecting only Carrizo, Polaris and Vega M based AMD GPUs, a temporary allocation that was unnecessarily physically contiguous was replaced with a regular one, making the remaining, relatively short but noticeable freezes disappear.
+Unlike the changes mentioned above, this one is to the `drm-kmod` ports' code, and is to be included at the ports' next version bump in the ports tree (expected ports versions: `5.10.163_9`, `5.15.160_6`, `6.1.128_6` and `6.6.25_7` respectively for `drm-510-kmod`, `drm-515-kmod`, `drm-61-kmod` and `drm-66-kmod`).
+
+This work was sponsored by the FreeBSD Foundation as part of the Laptop Project.
+
+Sponsor: The FreeBSD Foundation
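
The report describes an allocation loop that tries high-order (large, physically contiguous) chunks first and falls back to smaller ones, and says the fail-fast hint `__GFP_NORETRY` was being ignored. The following is a minimal userland sketch of that idea only. It is not the TTM or LinuxKPI code: `fake_allocator`, `fake_alloc_pages()`, `pool_alloc()` and `SKETCH_MAX_ORDER` are hypothetical names invented for illustration, and the "slow path" is modeled by nothing more than a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical highest order (2^order pages) tried first in this sketch. */
#define SKETCH_MAX_ORDER 10

/*
 * Hypothetical stand-in for a page allocator.  It pretends that contiguous
 * runs larger than 2^largest_free_order pages are not readily available.
 * slow_path_calls counts how often an expensive "try harder" path (such as
 * compaction or reclaim) would have been entered.
 */
struct fake_allocator {
	unsigned largest_free_order;
	unsigned slow_path_calls;
};

/* Returns true if a contiguous chunk of 2^order pages could be obtained. */
static bool
fake_alloc_pages(struct fake_allocator *a, unsigned order, bool noretry)
{
	if (order <= a->largest_free_order)
		return (true);
	if (!noretry)
		a->slow_path_calls++;	/* the costly path would run here */
	return (false);
}

/*
 * Satisfy a request for npages pages, highest order first, falling back to
 * smaller orders on failure.  Returns the number of chunks allocated.
 */
static unsigned
pool_alloc(struct fake_allocator *a, size_t npages, bool noretry)
{
	unsigned chunks = 0;
	unsigned order = SKETCH_MAX_ORDER;

	assert(a != NULL);
	while (npages > 0) {
		/* Never request a chunk larger than what remains. */
		while (order > 0 && ((size_t)1 << order) > npages)
			order--;
		/* Only multi-page requests carry the fail-fast hint. */
		if (fake_alloc_pages(a, order, order > 0 && noretry)) {
			npages -= (size_t)1 << order;
			chunks++;
		} else if (order > 0) {
			order--;	/* fall back to a smaller chunk */
		} else {
			break;		/* not even single pages are left */
		}
	}
	return (chunks);
}
```

With `largest_free_order` set to 2 and a 10-page request, the sketch returns 3 chunks (4 + 4 + 2 pages). The failed order-3 attempt leaves `slow_path_calls` at 0 when `noretry` is set and raises it to 1 when it is not, which is the kind of avoidable cost the report attributes to the unhandled flag.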