Date: Tue, 5 Dec 2023 18:27:37 GMT
Message-Id: <202312051827.3B5IRb4g009395@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org
From: Mark Johnston
Subject: git: 4be96902ba82 - releng/14.0 - vm_phys: fix freelist_contig
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
Sender: owner-dev-commits-src-all@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain;
charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: markj
X-Git-Repository: src
X-Git-Refname: refs/heads/releng/14.0
X-Git-Reftype: branch
X-Git-Commit: 4be96902ba82364810b86a6a0b3c58065e45e4cd
Auto-Submitted: auto-generated

The branch releng/14.0 has been updated by markj:

URL: https://cgit.FreeBSD.org/src/commit/?id=4be96902ba82364810b86a6a0b3c58065e45e4cd

commit 4be96902ba82364810b86a6a0b3c58065e45e4cd
Author:     Doug Moore
AuthorDate: 2023-11-15 09:25:45 +0000
Commit:     Mark Johnston
CommitDate: 2023-12-04 14:05:55 +0000

    vm_phys: fix freelist_contig

    vm_phys_find_freelist_contig is called to search a list of max-sized
    free page blocks and find one that, when joined with adjacent blocks
    in memory, can satisfy a request for a memory allocation bigger than
    any single max-sized free page block.

    In commit fa8a6585c7522b7de6d29802967bd5eba2f2dcf1, I defined this
    function in order to offer two improvements: 1) reduce the worst-case
    search time, and 2) allow solutions that include less-than max-sized
    free page blocks at the front or back of the giant allocation.

    However, it turns out that this change introduced an error, reported
    in Bug 274592: the function failed to check segment boundaries.

    This change fixes that error in vm_phys_find_freelist_contig, which
    resolves the bug. It also abandons improvement 2), because the value
    of that improvement is small and because preserving it would require
    more testing than I am able to do.
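The search described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: one segment is modeled as a plain array of max-sized blocks, `is_free[i]` stands in for the kernel's test that a block is a free max-order block, and the names `find_free_run`, `is_free`, `nblocks`, and `need` are hypothetical. Alignment and boundary constraints are left out. The array bound plays the role of the segment boundary whose missing check was the reported error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of one physical segment: is_free[i] is true when
 * max-sized block i is free.  Return the index of the first block of a
 * run of `need` consecutive free blocks, or -1 if there is none.
 */
static long
find_free_run(const bool *is_free, size_t nblocks, size_t need)
{
	size_t len, start;

	assert(need > 0);
	for (start = 0; start < nblocks; start++) {
		if (!is_free[start])
			continue;
		/* Only walk a run from its first block; skip interior blocks. */
		if (start > 0 && is_free[start - 1])
			continue;
		/*
		 * Stand-in for the segment-boundary check: the run must fit
		 * before the end of the array.  Later starts have even less
		 * room, so the search can stop here.
		 */
		if (need > nblocks - start)
			break;
		/* Count free blocks from start, never reading past the end. */
		for (len = 0; len < need && is_free[start + len]; len++) {
		}
		if (len == need)
			return ((long)start);
	}
	return (-1);
}
```

For example, with blocks `{free, used, free, free, free, used}`, a request for three blocks yields index 2, while a request for four blocks fails because the only candidate run ends at a used block.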

    PR: 274592
    Reported by: shafaisal.us@gmail.com
    Reviewed by: alc, markj
    Tested by: shafaisal.us@gmail.com
    Fixes: fa8a6585c752 vm_phys: avoid waste in multipage allocation
    MFC after: 10 days
    Differential Revision: https://reviews.freebsd.org/D42509
    Approved by: so
    Security: FreeBSD-EN-23:20.vm

    (cherry picked from commit 2a4897bd4e1bd8430d955abd3cf6675956bb9d61)
    (cherry picked from commit 210fce73ae0e4106a3aeb1970ffbeb30d0baa6ba)
---
 sys/vm/vm_phys.c | 146 +++++++++++++++++++++++--------------------------------
 1 file changed, 62 insertions(+), 84 deletions(-)

diff --git a/sys/vm/vm_phys.c b/sys/vm/vm_phys.c
index bc992bdfc58b..cd75ed092691 100644
--- a/sys/vm/vm_phys.c
+++ b/sys/vm/vm_phys.c
@@ -1360,108 +1360,75 @@ vm_phys_unfree_page(vm_page_t m)
 }
 
 /*
- * Find a run of contiguous physical pages from the specified page list.
+ * Find a run of contiguous physical pages, meeting alignment requirements, from
+ * a list of max-sized page blocks, where we need at least two consecutive
+ * blocks to satisfy the (large) page request.
  */
 static vm_page_t
-vm_phys_find_freelist_contig(struct vm_freelist *fl, int oind, u_long npages,
+vm_phys_find_freelist_contig(struct vm_freelist *fl, u_long npages,
     vm_paddr_t low, vm_paddr_t high, u_long alignment, vm_paddr_t boundary)
 {
 	struct vm_phys_seg *seg;
-	vm_paddr_t frag, lbound, pa, page_size, pa_end, pa_pre, size;
-	vm_page_t m, m_listed, m_ret;
-	int order;
+	vm_page_t m, m_iter, m_ret;
+	vm_paddr_t max_size, size;
+	int max_order;
 
-	KASSERT(npages > 0, ("npages is 0"));
-	KASSERT(powerof2(alignment), ("alignment is not a power of 2"));
-	KASSERT(powerof2(boundary), ("boundary is not a power of 2"));
-	/* Search for a run satisfying the specified conditions. */
-	page_size = PAGE_SIZE;
+	max_order = VM_NFREEORDER - 1;
 	size = npages << PAGE_SHIFT;
-	frag = (npages & ~(~0UL << oind)) << PAGE_SHIFT;
-	TAILQ_FOREACH(m_listed, &fl[oind].pl, listq) {
-		/*
-		 * Determine if the address range starting at pa is
-		 * too low.
-		 */
-		pa = VM_PAGE_TO_PHYS(m_listed);
-		if (pa < low)
-			continue;
+	max_size = (vm_paddr_t)1 << (PAGE_SHIFT + max_order);
+	KASSERT(size > max_size, ("size is too small"));
 
+	/*
+	 * In order to avoid examining any free max-sized page block more than
+	 * twice, identify the ones that are first in a physically-contiguous
+	 * sequence of such blocks, and only for those walk the sequence to
+	 * check if there are enough free blocks starting at a properly aligned
+	 * block. Thus, no block is checked for free-ness more than twice.
+	 */
+	TAILQ_FOREACH(m, &fl[max_order].pl, listq) {
 		/*
-		 * If this is not the first free oind-block in this range, bail
-		 * out. We have seen the first free block already, or will see
-		 * it before failing to find an appropriate range.
+		 * Skip m unless it is first in a sequence of free max page
+		 * blocks >= low in its segment.
 		 */
-		seg = &vm_phys_segs[m_listed->segind];
-		lbound = low > seg->start ? low : seg->start;
-		pa_pre = pa - (page_size << oind);
-		m = &seg->first_page[atop(pa_pre - seg->start)];
-		if (pa != 0 && pa_pre >= lbound && m->order == oind)
+		seg = &vm_phys_segs[m->segind];
+		if (VM_PAGE_TO_PHYS(m) < MAX(low, seg->start))
+			continue;
+		if (VM_PAGE_TO_PHYS(m) >= max_size &&
+		    VM_PAGE_TO_PHYS(m) - max_size >= MAX(low, seg->start) &&
+		    max_order == m[-1 << max_order].order)
 			continue;
-
-		if (!vm_addr_align_ok(pa, alignment))
-			/* Advance to satisfy alignment condition. */
-			pa = roundup2(pa, alignment);
-		else if (frag != 0 && lbound + frag <= pa) {
-			/*
-			 * Back up to the first aligned free block in this
-			 * range, without moving below lbound.
-			 */
-			pa_end = pa;
-			for (order = oind - 1; order >= 0; order--) {
-				pa_pre = pa_end - (page_size << order);
-				if (!vm_addr_align_ok(pa_pre, alignment))
-					break;
-				m = &seg->first_page[atop(pa_pre - seg->start)];
-				if (pa_pre >= lbound && m->order == order)
-					pa_end = pa_pre;
-			}
-			/*
-			 * If the extra small blocks are enough to complete the
-			 * fragment, use them. Otherwise, look to allocate the
-			 * fragment at the other end.
-			 */
-			if (pa_end + frag <= pa)
-				pa = pa_end;
-		}
-
-		/* Advance as necessary to satisfy boundary conditions. */
-		if (!vm_addr_bound_ok(pa, size, boundary))
-			pa = roundup2(pa + 1, boundary);
-		pa_end = pa + size;
 
 		/*
-		 * Determine if the address range is valid (without overflow in
-		 * pa_end calculation), and fits within the segment.
+		 * Advance m_ret from m to the first of the sequence, if any,
+		 * that satisfies alignment conditions and might leave enough
+		 * space.
 		 */
-		if (pa_end < pa || seg->end < pa_end)
-			continue;
-
-		m_ret = &seg->first_page[atop(pa - seg->start)];
+		m_ret = m;
+		while (!vm_addr_ok(VM_PAGE_TO_PHYS(m_ret),
+		    size, alignment, boundary) &&
+		    VM_PAGE_TO_PHYS(m_ret) + size <= MIN(high, seg->end) &&
+		    max_order == m_ret[1 << max_order].order)
+			m_ret += 1 << max_order;
 
 		/*
-		 * Determine whether there are enough free oind-blocks here to
-		 * satisfy the allocation request.
+		 * Skip m unless some block m_ret in the sequence is properly
+		 * aligned, and begins a sequence of enough pages less than
+		 * high, and in the same segment.
 		 */
-		pa = VM_PAGE_TO_PHYS(m_listed);
-		do {
-			pa += page_size << oind;
-			if (pa >= pa_end)
-				return (m_ret);
-			m = &seg->first_page[atop(pa - seg->start)];
-		} while (oind == m->order);
+		if (VM_PAGE_TO_PHYS(m_ret) + size > MIN(high, seg->end))
+			continue;
 
 		/*
-		 * Determine if an additional series of free blocks of
-		 * diminishing size can help to satisfy the allocation request.
+		 * Skip m unless the blocks to allocate starting at m_ret are
+		 * all free.
 		 */
-		while (m->order < oind &&
-		    pa + 2 * (page_size << m->order) > pa_end) {
-			pa += page_size << m->order;
-			if (pa >= pa_end)
-				return (m_ret);
-			m = &seg->first_page[atop(pa - seg->start)];
+		for (m_iter = m_ret;
+		    m_iter < m_ret + npages && max_order == m_iter->order;
+		    m_iter += 1 << max_order) {
 		}
+		if (m_iter < m_ret + npages)
+			continue;
+		return (m_ret);
 	}
 	return (NULL);
 }
@@ -1508,11 +1475,10 @@ vm_phys_find_queues_contig(
 	}
 	if (order < VM_NFREEORDER)
 		return (NULL);
-	/* Search for a long-enough sequence of small blocks. */
-	oind = VM_NFREEORDER - 1;
+	/* Search for a long-enough sequence of max-order blocks. */
 	for (pind = 0; pind < VM_NFREEPOOL; pind++) {
 		fl = (*queues)[pind];
-		m_ret = vm_phys_find_freelist_contig(fl, oind, npages,
+		m_ret = vm_phys_find_freelist_contig(fl, npages,
 		    low, high, alignment, boundary);
 		if (m_ret != NULL)
 			return (m_ret);
@@ -1593,6 +1559,18 @@ vm_phys_alloc_contig(int domain, u_long npages, vm_paddr_t low, vm_paddr_t high,
 	/* Return excess pages to the free lists. */
 	fl = (*queues)[VM_FREEPOOL_DEFAULT];
 	vm_phys_enq_range(&m_run[npages], m - &m_run[npages], fl, 0);
+
+	/* Return page verified to satisfy conditions of request. */
+	pa_start = VM_PAGE_TO_PHYS(m_run);
+	KASSERT(low <= pa_start,
+	    ("memory allocated below minimum requested range"));
+	KASSERT(pa_start + ptoa(npages) <= high,
+	    ("memory allocated above maximum requested range"));
+	seg = &vm_phys_segs[m_run->segind];
+	KASSERT(seg->domain == domain,
+	    ("memory not allocated from specified domain"));
+	KASSERT(vm_addr_ok(pa_start, ptoa(npages), alignment, boundary),
+	    ("memory alignment/boundary constraints not satisfied"));
 	return (m_run);
 }
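The assertions added to vm_phys_alloc_contig check that the returned run lies between low and high and satisfies the alignment and boundary constraints. A minimal user-space sketch of checks of that kind follows. The helper names (`is_pow2`, `addr_align_ok`, `addr_bound_ok`, `range_ok`, `request_ok`) are hypothetical stand-ins written for illustration and should not be read as the kernel's vm_addr_ok implementation. The sketch assumes that alignment is a power of two and that boundary is either zero (no constraint) or a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when x is a nonzero power of two. */
static bool
is_pow2(uint64_t x)
{
	return (x != 0 && (x & (x - 1)) == 0);
}

/* True when pa is a multiple of alignment (assumed a power of two). */
static bool
addr_align_ok(uint64_t pa, uint64_t alignment)
{
	assert(is_pow2(alignment));
	return ((pa & (alignment - 1)) == 0);
}

/*
 * True when [pa, pa + size) does not cross a multiple of boundary.
 * The first and last byte must agree in every bit above the boundary
 * mask.  A boundary of zero is treated as "no constraint".
 */
static bool
addr_bound_ok(uint64_t pa, uint64_t size, uint64_t boundary)
{
	assert(size > 0);
	if (boundary == 0)
		return (true);
	assert(is_pow2(boundary));
	return (((pa ^ (pa + size - 1)) & ~(boundary - 1)) == 0);
}

/* True when low <= pa and pa + size <= high, written to avoid overflow. */
static bool
range_ok(uint64_t pa, uint64_t size, uint64_t low, uint64_t high)
{
	return (low <= pa && pa <= high && size <= high - pa);
}

/* All of the conditions that the new assertions check on the address. */
static bool
request_ok(uint64_t pa, uint64_t size, uint64_t low, uint64_t high,
    uint64_t alignment, uint64_t boundary)
{
	return (range_ok(pa, size, low, high) &&
	    addr_align_ok(pa, alignment) &&
	    addr_bound_ok(pa, size, boundary));
}
```

For instance, an 8 KiB run starting at 0x3ff000 crosses the 4 MiB line at 0x400000, so it fails a 4 MiB boundary constraint, while the same run starting at 0x400000 passes.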