From owner-svn-src-user@freebsd.org Mon Nov 13 03:24:58 2017
From: Jeff Roberson <jeff@FreeBSD.org>
Date: Mon, 13 Nov 2017 03:24:57 +0000 (UTC)
Subject: svn commit: r325751 - user/jeff

Author: jeff
Date: Mon Nov 13 03:24:57 2017
New Revision: 325751
URL: https://svnweb.freebsd.org/changeset/base/325751

Log:
  Make a directory for my projects

Added:
  user/jeff/

From owner-svn-src-user@freebsd.org Mon Nov 13 03:25:44 2017
From: Jeff Roberson <jeff@FreeBSD.org>
Date: Mon, 13 Nov 2017 03:25:43 +0000 (UTC)
Subject: svn commit: r325752 - user/jeff/numa

Author: jeff
Date: Mon Nov 13 03:25:43 2017
New Revision: 325752
URL: https://svnweb.freebsd.org/changeset/base/325752

Log:
  Make a staging branch for numa patches

Added:
   - copied from r325751, head/
Directory Properties:
  user/jeff/numa/   (props changed)

From owner-svn-src-user@freebsd.org Mon Nov 13 03:34:57 2017
From: Jeff Roberson <jeff@FreeBSD.org>
Date: Mon, 13 Nov 2017 03:34:55 +0000 (UTC)
Subject: svn commit: r325753 - user/jeff/numa/sys/vm
Author: jeff
Date: Mon Nov 13 03:34:55 2017
New Revision: 325753
URL: https://svnweb.freebsd.org/changeset/base/325753

Log:
  Move NUMA policy iterators into the page allocator layer

  https://reviews.freebsd.org/D13014

Modified:
  user/jeff/numa/sys/vm/vm_domain.c
  user/jeff/numa/sys/vm/vm_domain.h
  user/jeff/numa/sys/vm/vm_page.c
  user/jeff/numa/sys/vm/vm_page.h
  user/jeff/numa/sys/vm/vm_phys.c
  user/jeff/numa/sys/vm/vm_phys.h
  user/jeff/numa/sys/vm/vm_reserv.c
  user/jeff/numa/sys/vm/vm_reserv.h

Modified: user/jeff/numa/sys/vm/vm_domain.c
==============================================================================
--- user/jeff/numa/sys/vm/vm_domain.c   Mon Nov 13 03:25:43 2017 (r325752)
+++ user/jeff/numa/sys/vm/vm_domain.c   Mon Nov 13 03:34:55 2017 (r325753)
@@ -61,6 +61,118 @@ __FBSDID("$FreeBSD$");
 
 #include <vm/vm_domain.h>
 
+/*
+ * Default to first-touch + round-robin.
+ */
+static struct mtx vm_default_policy_mtx;
+MTX_SYSINIT(vm_default_policy, &vm_default_policy_mtx, "default policy mutex",
+    MTX_DEF);
+#ifdef VM_NUMA_ALLOC
+static struct vm_domain_policy vm_default_policy =
+    VM_DOMAIN_POLICY_STATIC_INITIALISER(VM_POLICY_FIRST_TOUCH_ROUND_ROBIN, 0);
+#else
+/* Use round-robin so the domain policy code will only try once per allocation */
+static struct vm_domain_policy vm_default_policy =
+    VM_DOMAIN_POLICY_STATIC_INITIALISER(VM_POLICY_ROUND_ROBIN, 0);
+#endif
+
+static int
+sysctl_vm_default_policy(SYSCTL_HANDLER_ARGS)
+{
+    char policy_name[32];
+    int error;
+
+    mtx_lock(&vm_default_policy_mtx);
+
+    /* Map policy to output string */
+    switch (vm_default_policy.p.policy) {
+    case VM_POLICY_FIRST_TOUCH:
+        strcpy(policy_name, "first-touch");
+        break;
+    case VM_POLICY_FIRST_TOUCH_ROUND_ROBIN:
+        strcpy(policy_name, "first-touch-rr");
+        break;
+    case VM_POLICY_ROUND_ROBIN:
+    default:
+        strcpy(policy_name, "rr");
+        break;
+    }
+    mtx_unlock(&vm_default_policy_mtx);
+
+    error = sysctl_handle_string(oidp, &policy_name[0],
+        sizeof(policy_name), req);
+    if (error != 0 || req->newptr == NULL)
+        return (error);
+
+    mtx_lock(&vm_default_policy_mtx);
+    /* Set: match on the subset of policies that make sense as a default */
+    if (strcmp("first-touch-rr", policy_name) == 0) {
+        vm_domain_policy_set(&vm_default_policy,
+            VM_POLICY_FIRST_TOUCH_ROUND_ROBIN, 0);
+    } else if (strcmp("first-touch", policy_name) == 0) {
+        vm_domain_policy_set(&vm_default_policy,
+            VM_POLICY_FIRST_TOUCH, 0);
+    } else if (strcmp("rr", policy_name) == 0) {
+        vm_domain_policy_set(&vm_default_policy,
+            VM_POLICY_ROUND_ROBIN, 0);
+    } else {
+        error = EINVAL;
+        goto finish;
+    }
+
+    error = 0;
+finish:
+    mtx_unlock(&vm_default_policy_mtx);
+    return (error);
+}
+
+SYSCTL_PROC(_vm, OID_AUTO, default_policy, CTLTYPE_STRING | CTLFLAG_RW,
+    0, 0, sysctl_vm_default_policy, "A",
+    "Default policy (rr, first-touch, first-touch-rr)");
+
+/*
+ * Initialise a VM domain iterator.
+ *
+ * Check the thread policy, then the proc policy,
+ * then default to the system policy.
+ */
+void
+vm_policy_iterator_init(struct vm_domain_iterator *vi)
+{
+#ifdef VM_NUMA_ALLOC
+    struct vm_domain_policy lcl;
+#endif
+
+    vm_domain_iterator_init(vi);
+
+#ifdef VM_NUMA_ALLOC
+    /* Copy out the thread policy */
+    vm_domain_policy_localcopy(&lcl, &curthread->td_vm_dom_policy);
+    if (lcl.p.policy != VM_POLICY_NONE) {
+        /* Thread policy is present; use it */
+        vm_domain_iterator_set_policy(vi, &lcl);
+        return;
+    }
+
+    vm_domain_policy_localcopy(&lcl,
+        &curthread->td_proc->p_vm_dom_policy);
+    if (lcl.p.policy != VM_POLICY_NONE) {
+        /* Process policy is present; use it */
+        vm_domain_iterator_set_policy(vi, &lcl);
+        return;
+    }
+#endif
+    /* Use system default policy */
+    vm_domain_iterator_set_policy(vi, &vm_default_policy);
+}
+
+void
+vm_policy_iterator_finish(struct vm_domain_iterator *vi)
+{
+
+    vm_domain_iterator_cleanup(vi);
+}
+
 #ifdef VM_NUMA_ALLOC
 static __inline int
 vm_domain_rr_selectdomain(int skip_domain)

Modified: user/jeff/numa/sys/vm/vm_domain.h
==============================================================================
--- user/jeff/numa/sys/vm/vm_domain.h   Mon Nov 13 03:25:43 2017 (r325752)
+++ user/jeff/numa/sys/vm/vm_domain.h   Mon Nov 13 03:34:55 2017 (r325753)
@@ -63,4 +63,7 @@ extern int vm_domain_iterator_run(struct vm_domain_ite
 extern int vm_domain_iterator_isdone(struct vm_domain_iterator *vi);
 extern int vm_domain_iterator_cleanup(struct vm_domain_iterator *vi);
 
+extern void vm_policy_iterator_init(struct vm_domain_iterator *vi);
+extern void vm_policy_iterator_finish(struct vm_domain_iterator *vi);
+
 #endif /* __VM_DOMAIN_H__ */

Modified: user/jeff/numa/sys/vm/vm_page.c
==============================================================================
--- user/jeff/numa/sys/vm/vm_page.c Mon Nov 13 03:25:43 2017 (r325752)
+++ user/jeff/numa/sys/vm/vm_page.c Mon Nov 13 03:34:55 2017 (r325753)
@@ -107,6 +107,7 @@ __FBSDID("$FreeBSD$");
 #include <vm/vm.h>
 #include <vm/pmap.h>
 #include <vm/vm_param.h>
+#include <vm/vm_domain.h>
 #include <vm/vm_kern.h>
 #include <vm/vm_object.h>
 #include <vm/vm_page.h>
@@ -1577,6 +1578,16 @@ vm_page_alloc(vm_object_t object, vm_pindex_t pindex,
     vm_radix_lookup_le(&object->rtree, pindex) : NULL));
 }
 
+vm_page_t
+vm_page_alloc_domain(vm_object_t object, vm_pindex_t pindex, int domain,
+    int req)
+{
+
+    return (vm_page_alloc_domain_after(object, pindex, domain, req,
+        object != NULL ? vm_radix_lookup_le(&object->rtree, pindex) :
+        NULL));
+}
+
 /*
  * Allocate a page in the specified object with the given page index.  To
  * optimize insertion of the page into the object, the caller must also specify
@@ -1584,10 +1595,35 @@ vm_page_alloc(vm_object_t object, vm_pindex_t pindex,
 * page index, or NULL if no such page exists.
 */
 vm_page_t
-vm_page_alloc_after(vm_object_t object, vm_pindex_t pindex, int req,
-    vm_page_t mpred)
+vm_page_alloc_after(vm_object_t object, vm_pindex_t pindex,
+    int req, vm_page_t mpred)
 {
+    struct vm_domain_iterator vi;
    vm_page_t m;
+    int domain, wait;
+
+    m = NULL;
+    vm_policy_iterator_init(&vi);
+    wait = req & (VM_ALLOC_WAITFAIL | VM_ALLOC_WAITOK);
+    req &= ~wait;
+    while ((vm_domain_iterator_run(&vi, &domain)) == 0) {
+        if (vm_domain_iterator_isdone(&vi))
+            req |= wait;
+        m = vm_page_alloc_domain_after(object, pindex, domain, req,
+            mpred);
+        if (m != NULL)
+            break;
+    }
+    vm_policy_iterator_finish(&vi);
+
+    return (m);
+}
+
+vm_page_t
+vm_page_alloc_domain_after(vm_object_t object, vm_pindex_t pindex, int domain,
+    int req, vm_page_t mpred)
+{
+    vm_page_t m;
    int flags, req_class;
    u_int free_count;
@@ -1617,6 +1653,7 @@ vm_page_alloc_after(vm_object_t object, vm_pindex_t pi
     * for the request class.
     */
again:
+    m = NULL;
    mtx_lock(&vm_page_queue_free_mtx);
    if (vm_cnt.v_free_count > vm_cnt.v_free_reserved ||
        (req_class == VM_ALLOC_SYSTEM &&
@@ -1629,23 +1666,26 @@ again:
 #if VM_NRESERVLEVEL > 0
    if (object == NULL || (object->flags & (OBJ_COLORED |
        OBJ_FICTITIOUS)) != OBJ_COLORED || (m =
-        vm_reserv_alloc_page(object, pindex, mpred)) == NULL)
+        vm_reserv_alloc_page(object, pindex, domain,
+        mpred)) == NULL)
 #endif
    {
        /*
         * If not, allocate it from the free page queues.
         */
-        m = vm_phys_alloc_pages(object != NULL ?
+        m = vm_phys_alloc_pages(domain, object != NULL ?
            VM_FREEPOOL_DEFAULT : VM_FREEPOOL_DIRECT, 0);
 #if VM_NRESERVLEVEL > 0
-        if (m == NULL && vm_reserv_reclaim_inactive()) {
-            m = vm_phys_alloc_pages(object != NULL ?
+        if (m == NULL && vm_reserv_reclaim_inactive(domain)) {
+            m = vm_phys_alloc_pages(domain,
+                object != NULL ?
                VM_FREEPOOL_DEFAULT : VM_FREEPOOL_DIRECT, 0);
        }
 #endif
    }
-    } else {
+    }
+    if (m == NULL) {
        /*
         * Not allocatable, give up.
         */
@@ -1773,6 +1813,32 @@ vm_page_alloc_contig(vm_object_t object, vm_pindex_t p
    u_long npages, vm_paddr_t low, vm_paddr_t high, u_long alignment,
    vm_paddr_t boundary, vm_memattr_t memattr)
 {
+    struct vm_domain_iterator vi;
+    vm_page_t m;
+    int domain, wait;
+
+    m = NULL;
+    vm_policy_iterator_init(&vi);
+    wait = req & (VM_ALLOC_WAITFAIL | VM_ALLOC_WAITOK);
+    req &= ~wait;
+    while ((vm_domain_iterator_run(&vi, &domain)) == 0) {
+        if (vm_domain_iterator_isdone(&vi))
+            req |= wait;
+        m = vm_page_alloc_contig_domain(object, pindex, domain, req,
+            npages, low, high, alignment, boundary, memattr);
+        if (m != NULL)
+            break;
+    }
+    vm_policy_iterator_finish(&vi);
+
+    return (m);
+}
+
+vm_page_t
+vm_page_alloc_contig_domain(vm_object_t object, vm_pindex_t pindex, int domain,
+    int req, u_long npages, vm_paddr_t low, vm_paddr_t high, u_long alignment,
+    vm_paddr_t boundary, vm_memattr_t memattr)
+{
    vm_page_t m, m_ret, mpred;
    u_int busy_lock, flags, oflags;
    int req_class;
@@ -1812,6 +1878,7 @@ vm_page_alloc_contig(vm_object_t object, vm_pindex_t p
     * below the lower bound for the allocation class?
     */
again:
+    m_ret = NULL;
    mtx_lock(&vm_page_queue_free_mtx);
    if (vm_cnt.v_free_count >= npages + vm_cnt.v_free_reserved ||
        (req_class == VM_ALLOC_SYSTEM &&
@@ -1824,31 +1891,27 @@ again:
 #if VM_NRESERVLEVEL > 0
retry:
    if (object == NULL || (object->flags & OBJ_COLORED) == 0 ||
-        (m_ret = vm_reserv_alloc_contig(object, pindex, npages,
-        low, high, alignment, boundary, mpred)) == NULL)
+        (m_ret = vm_reserv_alloc_contig(object, pindex, domain,
+        npages, low, high, alignment, boundary, mpred)) == NULL)
 #endif
        /*
         * If not, allocate them from the free page queues.
         */
-        m_ret = vm_phys_alloc_contig(npages, low, high,
+        m_ret = vm_phys_alloc_contig(domain, npages, low, high,
            alignment, boundary);
-    } else {
-        if (vm_page_alloc_fail(object, req))
-            goto again;
-        return (NULL);
-    }
-    if (m_ret != NULL)
-        vm_phys_freecnt_adj(m_ret, -npages);
-    else {
 #if VM_NRESERVLEVEL > 0
-        if (vm_reserv_reclaim_contig(npages, low, high, alignment,
-            boundary))
+        if (m_ret == NULL && vm_reserv_reclaim_contig(
+            domain, npages, low, high, alignment, boundary))
            goto retry;
 #endif
    }
-    mtx_unlock(&vm_page_queue_free_mtx);
-    if (m_ret == NULL)
+    if (m_ret == NULL) {
+        if (vm_page_alloc_fail(object, req))
+            goto again;
        return (NULL);
+    }
+    vm_phys_freecnt_adj(m_ret, -npages);
+    mtx_unlock(&vm_page_queue_free_mtx);
    for (m = m_ret; m < &m_ret[npages]; m++)
        vm_page_alloc_check(m);
@@ -1962,7 +2025,30 @@ vm_page_alloc_check(vm_page_t m)
 vm_page_t
 vm_page_alloc_freelist(int flind, int req)
 {
+    struct vm_domain_iterator vi;
    vm_page_t m;
+    int domain, wait;
+
+    m = NULL;
+    vm_policy_iterator_init(&vi);
+    wait = req & (VM_ALLOC_WAITFAIL | VM_ALLOC_WAITOK);
+    req &= ~wait;
+    while ((vm_domain_iterator_run(&vi, &domain)) == 0) {
+        if (vm_domain_iterator_isdone(&vi))
+            req |= wait;
+        m = vm_page_alloc_freelist_domain(domain, flind, req);
+        if (m != NULL)
+            break;
+    }
+    vm_policy_iterator_finish(&vi);
+
+    return (m);
+}
+
+vm_page_t
+vm_page_alloc_freelist_domain(int domain, int flind, int req)
+{
+    vm_page_t m;
    u_int flags, free_count;
    int req_class;
@@ -1983,15 +2069,12 @@ again:
        (req_class == VM_ALLOC_SYSTEM &&
        vm_cnt.v_free_count > vm_cnt.v_interrupt_free_min) ||
        (req_class == VM_ALLOC_INTERRUPT &&
-        vm_cnt.v_free_count > 0)) {
-        m = vm_phys_alloc_freelist_pages(flind, VM_FREEPOOL_DIRECT, 0);
-    } else {
+        vm_cnt.v_free_count > 0))
+        m = vm_phys_alloc_freelist_pages(domain, flind,
+            VM_FREEPOOL_DIRECT, 0);
+    if (m == NULL) {
        if (vm_page_alloc_fail(NULL, req))
            goto again;
-        return (NULL);
-    }
-    if (m == NULL) {
-        mtx_unlock(&vm_page_queue_free_mtx);
        return (NULL);
    }
    free_count = vm_phys_freecnt_adj(m, -1);

Modified: user/jeff/numa/sys/vm/vm_page.h
==============================================================================
--- user/jeff/numa/sys/vm/vm_page.h Mon Nov 13 03:25:43 2017 (r325752)
+++ user/jeff/numa/sys/vm/vm_page.h Mon Nov 13 03:34:55 2017 (r325753)
@@ -474,16 +474,24 @@ void vm_page_free_zero(vm_page_t m);
 void vm_page_activate (vm_page_t);
 void vm_page_advise(vm_page_t m, int advice);
 vm_page_t vm_page_alloc(vm_object_t, vm_pindex_t, int);
+vm_page_t vm_page_alloc_domain(vm_object_t, vm_pindex_t, int, int);
 vm_page_t vm_page_alloc_after(vm_object_t, vm_pindex_t, int, vm_page_t);
+vm_page_t vm_page_alloc_domain_after(vm_object_t, vm_pindex_t, int, int,
+    vm_page_t);
 vm_page_t vm_page_alloc_contig(vm_object_t object, vm_pindex_t pindex, int req,
     u_long npages, vm_paddr_t low, vm_paddr_t high, u_long alignment,
     vm_paddr_t boundary, vm_memattr_t memattr);
+vm_page_t vm_page_alloc_contig_domain(vm_object_t object,
+    vm_pindex_t pindex, int domain, int req, u_long npages, vm_paddr_t low,
+    vm_paddr_t high, u_long alignment, vm_paddr_t boundary,
+    vm_memattr_t memattr);
 vm_page_t vm_page_alloc_freelist(int, int);
+vm_page_t vm_page_alloc_freelist_domain(int, int, int);
 void vm_page_change_lock(vm_page_t m, struct mtx **mtx);
 vm_page_t vm_page_grab (vm_object_t, vm_pindex_t, int);
 int vm_page_grab_pages(vm_object_t object, vm_pindex_t pindex, int allocflags,
     vm_page_t *ma, int count);
-void vm_page_deactivate (vm_page_t);
+void vm_page_deactivate(vm_page_t);
 void vm_page_deactivate_noreuse(vm_page_t);
 void vm_page_dequeue(vm_page_t m);
 void vm_page_dequeue_locked(vm_page_t m);
@@ -504,6 +512,8 @@ void vm_page_putfake(vm_page_t m);
 void vm_page_readahead_finish(vm_page_t m);
 bool vm_page_reclaim_contig(int req, u_long npages, vm_paddr_t low,
     vm_paddr_t high, u_long alignment, vm_paddr_t boundary);
+bool vm_page_reclaim_contig_domain(int req, u_long npages, int domain,
+    vm_paddr_t low, vm_paddr_t high, u_long alignment, vm_paddr_t boundary);
 void vm_page_reference(vm_page_t m);
 void vm_page_remove (vm_page_t);
 int vm_page_rename (vm_page_t, vm_object_t, vm_pindex_t);

Modified: user/jeff/numa/sys/vm/vm_phys.c
==============================================================================
--- user/jeff/numa/sys/vm/vm_phys.c Mon Nov 13 03:25:43 2017 (r325752)
+++ user/jeff/numa/sys/vm/vm_phys.c Mon Nov 13 03:34:55 2017 (r325753)
@@ -149,23 +149,6 @@ SYSCTL_OID(_vm, OID_AUTO, phys_locality, CTLTYPE_STRIN
 SYSCTL_INT(_vm, OID_AUTO, ndomains, CTLFLAG_RD,
     &vm_ndomains, 0, "Number of physical memory domains available.");
 
-/*
- * Default to first-touch + round-robin.
- */
-static struct mtx vm_default_policy_mtx;
-MTX_SYSINIT(vm_default_policy, &vm_default_policy_mtx, "default policy mutex",
-    MTX_DEF);
-#ifdef VM_NUMA_ALLOC
-static struct vm_domain_policy vm_default_policy =
-    VM_DOMAIN_POLICY_STATIC_INITIALISER(VM_POLICY_FIRST_TOUCH_ROUND_ROBIN, 0);
-#else
-/* Use round-robin so the domain policy code will only try once per allocation */
-static struct vm_domain_policy vm_default_policy =
-    VM_DOMAIN_POLICY_STATIC_INITIALISER(VM_POLICY_ROUND_ROBIN, 0);
-#endif
-
-static vm_page_t vm_phys_alloc_domain_pages(int domain, int flind, int pool,
-    int order);
 static vm_page_t vm_phys_alloc_seg_contig(struct vm_phys_seg *seg,
     u_long npages, vm_paddr_t low, vm_paddr_t high, u_long alignment,
     vm_paddr_t boundary);
@@ -175,60 +158,6 @@ static int vm_phys_paddr_to_segind(vm_paddr_t pa);
 static void vm_phys_split_pages(vm_page_t m, int oind,
     struct vm_freelist *fl, int order);
 
-static int
-sysctl_vm_default_policy(SYSCTL_HANDLER_ARGS)
-{
-    char policy_name[32];
-    int error;
-
-    mtx_lock(&vm_default_policy_mtx);
-
-    /* Map policy to output string */
-    switch (vm_default_policy.p.policy) {
-    case VM_POLICY_FIRST_TOUCH:
-        strcpy(policy_name, "first-touch");
-        break;
-    case VM_POLICY_FIRST_TOUCH_ROUND_ROBIN:
-        strcpy(policy_name, "first-touch-rr");
-        break;
-    case VM_POLICY_ROUND_ROBIN:
-    default:
-        strcpy(policy_name, "rr");
-        break;
-    }
-    mtx_unlock(&vm_default_policy_mtx);
-
-    error = sysctl_handle_string(oidp, &policy_name[0],
-        sizeof(policy_name), req);
-    if (error != 0 || req->newptr == NULL)
-        return (error);
-
-    mtx_lock(&vm_default_policy_mtx);
-    /* Set: match on the subset of policies that make sense as a default */
-    if (strcmp("first-touch-rr", policy_name) == 0) {
-        vm_domain_policy_set(&vm_default_policy,
-            VM_POLICY_FIRST_TOUCH_ROUND_ROBIN, 0);
-    } else if (strcmp("first-touch", policy_name) == 0) {
-        vm_domain_policy_set(&vm_default_policy,
-            VM_POLICY_FIRST_TOUCH, 0);
-    } else if (strcmp("rr", policy_name) == 0) {
-        vm_domain_policy_set(&vm_default_policy,
-            VM_POLICY_ROUND_ROBIN, 0);
-    } else {
-        error = EINVAL;
-        goto finish;
-    }
-
-    error = 0;
-finish:
-    mtx_unlock(&vm_default_policy_mtx);
-    return (error);
-}
-
-SYSCTL_PROC(_vm, OID_AUTO, default_policy, CTLTYPE_STRING | CTLFLAG_RW,
-    0, 0, sysctl_vm_default_policy, "A",
-    "Default policy (rr, first-touch, first-touch-rr");
-
 /*
  * Red-black tree helpers for vm fictitious range management.
 */
@@ -270,71 +199,6 @@ vm_phys_fictitious_cmp(struct vm_phys_fictitious_seg *
        (uintmax_t)p1->end, (uintmax_t)p2->start, (uintmax_t)p2->end);
 }
 
-#ifdef notyet
-static __inline int
-vm_rr_selectdomain(void)
-{
-#ifdef VM_NUMA_ALLOC
-    struct thread *td;
-
-    td = curthread;
-
-    td->td_dom_rr_idx++;
-    td->td_dom_rr_idx %= vm_ndomains;
-    return (td->td_dom_rr_idx);
-#else
-    return (0);
-#endif
-}
-#endif /* notyet */
-
-/*
- * Initialise a VM domain iterator.
- *
- * Check the thread policy, then the proc policy,
- * then default to the system policy.
- *
- * Later on the various layers will have this logic
- * plumbed into them and the phys code will be explicitly
- * handed a VM domain policy to use.
- */
-static void
-vm_policy_iterator_init(struct vm_domain_iterator *vi)
-{
-#ifdef VM_NUMA_ALLOC
-    struct vm_domain_policy lcl;
-#endif
-
-    vm_domain_iterator_init(vi);
-
-#ifdef VM_NUMA_ALLOC
-    /* Copy out the thread policy */
-    vm_domain_policy_localcopy(&lcl, &curthread->td_vm_dom_policy);
-    if (lcl.p.policy != VM_POLICY_NONE) {
-        /* Thread policy is present; use it */
-        vm_domain_iterator_set_policy(vi, &lcl);
-        return;
-    }
-
-    vm_domain_policy_localcopy(&lcl,
-        &curthread->td_proc->p_vm_dom_policy);
-    if (lcl.p.policy != VM_POLICY_NONE) {
-        /* Process policy is present; use it */
-        vm_domain_iterator_set_policy(vi, &lcl);
-        return;
-    }
-#endif
-    /* Use system default policy */
-    vm_domain_iterator_set_policy(vi, &vm_default_policy);
-}
-
-static void
-vm_policy_iterator_finish(struct vm_domain_iterator *vi)
-{
-
-    vm_domain_iterator_cleanup(vi);
-}
-
 boolean_t
 vm_phys_domain_intersects(long mask, vm_paddr_t low, vm_paddr_t high)
 {
@@ -503,7 +367,7 @@ _vm_phys_create_seg(vm_paddr_t start, vm_paddr_t end,
    KASSERT(vm_phys_nsegs < VM_PHYSSEG_MAX,
        ("vm_phys_create_seg: increase VM_PHYSSEG_MAX"));
-    KASSERT(domain < vm_ndomains,
+    KASSERT(domain >= 0 && domain < vm_ndomains,
        ("vm_phys_create_seg: invalid domain provided"));
    seg = &vm_phys_segs[vm_phys_nsegs++];
    while (seg > vm_phys_segs && (seg - 1)->start >= end) {
@@ -760,29 +624,16 @@ vm_phys_init_page(vm_paddr_t pa)
 * The free page queues must be locked.
 */
 vm_page_t
-vm_phys_alloc_pages(int pool, int order)
+vm_phys_alloc_pages(int domain, int pool, int order)
 {
    vm_page_t m;
-    int domain, flind;
-    struct vm_domain_iterator vi;
+    int flind;
 
-    KASSERT(pool < VM_NFREEPOOL,
-        ("vm_phys_alloc_pages: pool %d is out of range", pool));
-    KASSERT(order < VM_NFREEORDER,
-        ("vm_phys_alloc_pages: order %d is out of range", order));
-
-    vm_policy_iterator_init(&vi);
-
-    while ((vm_domain_iterator_run(&vi, &domain)) == 0) {
-        for (flind = 0; flind < vm_nfreelists; flind++) {
-            m = vm_phys_alloc_domain_pages(domain, flind, pool,
-                order);
-            if (m != NULL)
-                return (m);
-        }
+    for (flind = 0; flind < vm_nfreelists; flind++) {
+        m = vm_phys_alloc_freelist_pages(domain, flind, pool, order);
+        if (m != NULL)
+            return (m);
    }
-
-    vm_policy_iterator_finish(&vi);
    return (NULL);
 }
 
@@ -794,41 +645,23 @@ vm_phys_alloc_pages(int pool, int order)
 * The free page queues must be locked.
 */
 vm_page_t
-vm_phys_alloc_freelist_pages(int freelist, int pool, int order)
+vm_phys_alloc_freelist_pages(int domain, int flind, int pool, int order)
 {
+    struct vm_freelist *alt, *fl;
    vm_page_t m;
-    struct vm_domain_iterator vi;
-    int domain;
+    int oind, pind;
 
-    KASSERT(freelist < VM_NFREELIST,
+    KASSERT(domain >= 0 && domain < vm_ndomains,
+        ("vm_phys_alloc_freelist_pages: domain %d is out of range",
+        domain));
+    KASSERT(flind < VM_NFREELIST,
        ("vm_phys_alloc_freelist_pages: freelist %d is out of range",
-        freelist));
+        flind));
    KASSERT(pool < VM_NFREEPOOL,
        ("vm_phys_alloc_freelist_pages: pool %d is out of range", pool));
    KASSERT(order < VM_NFREEORDER,
        ("vm_phys_alloc_freelist_pages: order %d is out of range", order));
 
-    vm_policy_iterator_init(&vi);
-
-    while ((vm_domain_iterator_run(&vi, &domain)) == 0) {
-        m = vm_phys_alloc_domain_pages(domain,
-            vm_freelist_to_flind[freelist], pool, order);
-        if (m != NULL)
-            return (m);
-    }
-
-    vm_policy_iterator_finish(&vi);
-    return (NULL);
-}
-
-static vm_page_t
-vm_phys_alloc_domain_pages(int domain, int flind, int pool, int order)
-{
-    struct vm_freelist *fl;
-    struct vm_freelist *alt;
-    int oind, pind;
-    vm_page_t m;
-
    mtx_assert(&vm_page_queue_free_mtx, MA_OWNED);
    fl = &vm_phys_free_queues[domain][flind][pool][0];
    for (oind = order; oind < VM_NFREEORDER; oind++) {
@@ -1303,14 +1136,13 @@ vm_phys_unfree_page(vm_page_t m)
 * "alignment" and "boundary" must be a power of two.
 */
 vm_page_t
-vm_phys_alloc_contig(u_long npages, vm_paddr_t low, vm_paddr_t high,
+vm_phys_alloc_contig(int domain, u_long npages, vm_paddr_t low, vm_paddr_t high,
    u_long alignment, vm_paddr_t boundary)
 {
    vm_paddr_t pa_end, pa_start;
    vm_page_t m_run;
-    struct vm_domain_iterator vi;
    struct vm_phys_seg *seg;
-    int domain, segind;
+    int segind;
 
    KASSERT(npages > 0, ("npages is 0"));
    KASSERT(powerof2(alignment), ("alignment is not a power of 2"));
@@ -1318,12 +1150,6 @@ vm_phys_alloc_contig(u_long npages, vm_paddr_t low, vm
    mtx_assert(&vm_page_queue_free_mtx, MA_OWNED);
    if (low >= high)
        return (NULL);
-    vm_policy_iterator_init(&vi);
-restartdom:
-    if (vm_domain_iterator_run(&vi, &domain) != 0) {
-        vm_policy_iterator_finish(&vi);
-        return (NULL);
-    }
    m_run = NULL;
    for (segind = vm_phys_nsegs - 1; segind >= 0; segind--) {
        seg = &vm_phys_segs[segind];
@@ -1346,9 +1172,6 @@ restartdom:
        if (m_run != NULL)
            break;
    }
-    if (m_run == NULL && !vm_domain_iterator_isdone(&vi))
-        goto restartdom;
-    vm_policy_iterator_finish(&vi);
    return (m_run);
 }

Modified: user/jeff/numa/sys/vm/vm_phys.h
==============================================================================
--- user/jeff/numa/sys/vm/vm_phys.h Mon Nov 13 03:25:43 2017 (r325752)
+++ user/jeff/numa/sys/vm/vm_phys.h Mon Nov 13 03:34:55 2017 (r325753)
@@ -70,10 +70,11 @@ extern int vm_phys_nsegs;
 * The following functions are only to be used by the virtual memory system.
 */
 void vm_phys_add_seg(vm_paddr_t start, vm_paddr_t end);
-vm_page_t vm_phys_alloc_contig(u_long npages, vm_paddr_t low, vm_paddr_t high,
-    u_long alignment, vm_paddr_t boundary);
-vm_page_t vm_phys_alloc_freelist_pages(int freelist, int pool, int order);
-vm_page_t vm_phys_alloc_pages(int pool, int order);
+vm_page_t vm_phys_alloc_contig(int domain, u_long npages, vm_paddr_t low,
+    vm_paddr_t high, u_long alignment, vm_paddr_t boundary);
+vm_page_t vm_phys_alloc_freelist_pages(int domain, int freelist, int pool,
+    int order);
+vm_page_t vm_phys_alloc_pages(int domain, int pool, int order);
 boolean_t vm_phys_domain_intersects(long mask, vm_paddr_t low, vm_paddr_t high);
 int vm_phys_fictitious_reg_range(vm_paddr_t start, vm_paddr_t end,
     vm_memattr_t memattr);
@@ -91,12 +92,13 @@ boolean_t vm_phys_unfree_page(vm_page_t m);
 int vm_phys_mem_affinity(int f, int t);
 
 /*
- * vm_phys_domain:
 *
- * Return the memory domain the page belongs to.
+ * vm_phys_domidx:
+ *
+ * Return the index of the domain the page belongs to.
 */
-static inline struct vm_domain *
-vm_phys_domain(vm_page_t m)
+static inline int
+vm_phys_domidx(vm_page_t m)
 {
 #ifdef VM_NUMA_ALLOC
    int domn, segind;
@@ -106,10 +108,22 @@ vm_phys_domain(vm_page_t m)
    KASSERT(segind < vm_phys_nsegs, ("segind %d m %p", segind, m));
    domn = vm_phys_segs[segind].domain;
    KASSERT(domn < vm_ndomains, ("domain %d m %p", domn, m));
-    return (&vm_dom[domn]);
+    return (domn);
 #else
-    return (&vm_dom[0]);
+    return (0);
 #endif
+}
+
+/*
+ * vm_phys_domain:
+ *
+ * Return the memory domain the page belongs to.
+ */
+static inline struct vm_domain *
+vm_phys_domain(vm_page_t m)
+{
+
+    return (&vm_dom[vm_phys_domidx(m)]);
 }
 
 static inline u_int

Modified: user/jeff/numa/sys/vm/vm_reserv.c
==============================================================================
--- user/jeff/numa/sys/vm/vm_reserv.c   Mon Nov 13 03:25:43 2017 (r325752)
+++ user/jeff/numa/sys/vm/vm_reserv.c   Mon Nov 13 03:34:55 2017 (r325753)
@@ -168,6 +168,7 @@ struct vm_reserv {
    vm_object_t object;         /* containing object */
    vm_pindex_t pindex;         /* offset within object */
    vm_page_t   pages;          /* first page of a superpage */
+    int     domain;         /* NUMA domain */
    int     popcnt;         /* # of pages in use */
    char        inpartpopq;
    popmap_t    popmap[NPOPMAP];    /* bit vector of used pages */
@@ -205,8 +206,7 @@ static vm_reserv_t vm_reserv_array;
 *
 * Access to this queue is synchronized by the free page queue lock.
 */
-static TAILQ_HEAD(, vm_reserv) vm_rvq_partpop =
-    TAILQ_HEAD_INITIALIZER(vm_rvq_partpop);
+static TAILQ_HEAD(, vm_reserv) vm_rvq_partpop[MAXMEMDOM];
 
 static SYSCTL_NODE(_vm, OID_AUTO, reserv, CTLFLAG_RD, 0, "Reservation Info");
@@ -275,24 +275,27 @@ sysctl_vm_reserv_partpopq(SYSCTL_HANDLER_ARGS)
 {
    struct sbuf sbuf;
    vm_reserv_t rv;
-    int counter, error, level, unused_pages;
+    int counter, error, domain, level, unused_pages;
 
    error = sysctl_wire_old_buffer(req, 0);
    if (error != 0)
        return (error);
    sbuf_new_for_sysctl(&sbuf, NULL, 128, req);
-    sbuf_printf(&sbuf, "\nLEVEL     SIZE  NUMBER\n\n");
-    for (level = -1; level <= VM_NRESERVLEVEL - 2; level++) {
-        counter = 0;
-        unused_pages = 0;
-        mtx_lock(&vm_page_queue_free_mtx);
-        TAILQ_FOREACH(rv, &vm_rvq_partpop/*[level]*/, partpopq) {
-            counter++;
-            unused_pages += VM_LEVEL_0_NPAGES - rv->popcnt;
+    sbuf_printf(&sbuf, "\nDOMAIN    LEVEL     SIZE  NUMBER\n\n");
+    for (domain = 0; domain < vm_ndomains; domain++) {
+        for (level = -1; level <= VM_NRESERVLEVEL - 2; level++) {
+            counter = 0;
+            unused_pages = 0;
+            mtx_lock(&vm_page_queue_free_mtx);
+            TAILQ_FOREACH(rv, &vm_rvq_partpop[domain], partpopq) {
+                counter++;
+                unused_pages += VM_LEVEL_0_NPAGES - rv->popcnt;
+            }
+            mtx_unlock(&vm_page_queue_free_mtx);
+            sbuf_printf(&sbuf, "%6d, %7d, %6dK, %6d\n",
+                domain, level,
+                unused_pages * ((int)PAGE_SIZE / 1024), counter);
        }
-        mtx_unlock(&vm_page_queue_free_mtx);
-        sbuf_printf(&sbuf, "%5d: %6dK, %6d\n", level,
-            unused_pages * ((int)PAGE_SIZE / 1024), counter);
    }
    error = sbuf_finish(&sbuf);
    sbuf_delete(&sbuf);
@@ -319,8 +322,11 @@ vm_reserv_depopulate(vm_reserv_t rv, int index)
        index));
    KASSERT(rv->popcnt > 0,
        ("vm_reserv_depopulate: reserv %p's popcnt is corrupted", rv));
+    KASSERT(rv->domain >= 0 && rv->domain < vm_ndomains,
+        ("vm_reserv_depopulate: reserv %p's domain is corrupted %d",
+        rv, rv->domain));
    if (rv->inpartpopq) {
-        TAILQ_REMOVE(&vm_rvq_partpop, rv, partpopq);
+        TAILQ_REMOVE(&vm_rvq_partpop[rv->domain], rv, partpopq);
        rv->inpartpopq = FALSE;
    } else {
        KASSERT(rv->pages->psind == 1,
@@ -333,11 +339,12 @@ vm_reserv_depopulate(vm_reserv_t rv, int index)
    if (rv->popcnt == 0) {
        LIST_REMOVE(rv, objq);
        rv->object = NULL;
+        rv->domain = -1;
        vm_phys_free_pages(rv->pages, VM_LEVEL_0_ORDER);
        vm_reserv_freed++;
    } else {
        rv->inpartpopq = TRUE;
-        TAILQ_INSERT_TAIL(&vm_rvq_partpop, rv, partpopq);
+        TAILQ_INSERT_TAIL(&vm_rvq_partpop[rv->domain], rv, partpopq);
    }
 }
 
@@ -382,15 +389,18 @@ vm_reserv_populate(vm_reserv_t rv, int index)
        ("vm_reserv_populate: reserv %p is already full", rv));
    KASSERT(rv->pages->psind == 0,
        ("vm_reserv_populate: reserv %p is already promoted", rv));
+    KASSERT(rv->domain >= 0 && rv->domain < vm_ndomains,
+        ("vm_reserv_populate: reserv %p's domain is corrupted %d",
+        rv, rv->domain));
    if (rv->inpartpopq) {
-        TAILQ_REMOVE(&vm_rvq_partpop, rv, partpopq);
+        TAILQ_REMOVE(&vm_rvq_partpop[rv->domain], rv, partpopq);
        rv->inpartpopq = FALSE;
    }
    popmap_set(rv->popmap, index);
    rv->popcnt++;
    if (rv->popcnt < VM_LEVEL_0_NPAGES) {
        rv->inpartpopq = TRUE;
-        TAILQ_INSERT_TAIL(&vm_rvq_partpop, rv, partpopq);
+        TAILQ_INSERT_TAIL(&vm_rvq_partpop[rv->domain], rv, partpopq);
    } else
        rv->pages->psind = 1;
 }
@@ -411,9 +421,9 @@ vm_reserv_populate(vm_reserv_t rv, int index)
 * The object and free page queue must be locked.
 */
 vm_page_t
-vm_reserv_alloc_contig(vm_object_t object, vm_pindex_t pindex, u_long npages,
-    vm_paddr_t low, vm_paddr_t high, u_long alignment, vm_paddr_t boundary,
-    vm_page_t mpred)
+vm_reserv_alloc_contig(vm_object_t object, vm_pindex_t pindex, int domain,
+    u_long npages, vm_paddr_t low, vm_paddr_t high, u_long alignment,
+    vm_paddr_t boundary, vm_page_t mpred)
 {
    vm_paddr_t pa, size;
    vm_page_t m, m_ret, msucc;
@@ -533,7 +543,7 @@ vm_reserv_alloc_contig(vm_object_t object, vm_pindex_t
     * specified index may not be the first page within the first new
     * reservation.
     */
-    m = vm_phys_alloc_contig(allocpages, low, high, ulmax(alignment,
+    m = vm_phys_alloc_contig(domain, allocpages, low, high, ulmax(alignment,
        VM_LEVEL_0_SIZE), boundary > VM_LEVEL_0_SIZE ? boundary : 0);
    if (m == NULL)
        return (NULL);
@@ -556,6 +566,7 @@ vm_reserv_alloc_contig(vm_object_t object, vm_pindex_t
        LIST_INSERT_HEAD(&object->rvq, rv, objq);
        rv->object = object;
        rv->pindex = first;
+        rv->domain = vm_phys_domidx(m);
        KASSERT(rv->popcnt == 0,
            ("vm_reserv_alloc_contig: reserv %p's popcnt is corrupted",
            rv));
@@ -611,7 +622,8 @@ found:
 * The object and free page queue must be locked.
 */
 vm_page_t
-vm_reserv_alloc_page(vm_object_t object, vm_pindex_t pindex, vm_page_t mpred)
+vm_reserv_alloc_page(vm_object_t object, vm_pindex_t pindex, int domain,
+    vm_page_t mpred)
 {
    vm_page_t m, msucc;
    vm_pindex_t first, leftcap, rightcap;
@@ -690,7 +702,7 @@ vm_reserv_alloc_page(vm_object_t object, vm_pindex_t p
    /*
     * Allocate and populate the new reservation.
     */
-    m = vm_phys_alloc_pages(VM_FREEPOOL_DEFAULT, VM_LEVEL_0_ORDER);
+    m = vm_phys_alloc_pages(domain, VM_FREEPOOL_DEFAULT, VM_LEVEL_0_ORDER);
    if (m == NULL)
        return (NULL);
    rv = vm_reserv_from_page(m);
@@ -701,6 +713,7 @@ vm_reserv_alloc_page(vm_object_t object, vm_pindex_t p
    LIST_INSERT_HEAD(&object->rvq, rv, objq);
    rv->object = object;
    rv->pindex = first;
+    rv->domain = vm_phys_domidx(m);
    KASSERT(rv->popcnt == 0,
        ("vm_reserv_alloc_page: reserv %p's popcnt is corrupted", rv));
    KASSERT(!rv->inpartpopq,
@@ -747,6 +760,7 @@ vm_reserv_break(vm_reserv_t rv, vm_page_t m)
        ("vm_reserv_break: reserv %p's inpartpopq is TRUE", rv));
    LIST_REMOVE(rv, objq);
    rv->object = NULL;
+    rv->domain = -1;
    if (m != NULL) {
        /*
         * Since the reservation is being broken, there is no harm in
@@ -816,7 +830,7 @@ vm_reserv_break_all(vm_object_t object)
        KASSERT(rv->object == object,
            ("vm_reserv_break_all: reserv %p is corrupted", rv));
        if (rv->inpartpopq) {
-            TAILQ_REMOVE(&vm_rvq_partpop, rv, partpopq);
+            TAILQ_REMOVE(&vm_rvq_partpop[rv->domain], rv, partpopq);
            rv->inpartpopq = FALSE;
        }
        vm_reserv_break(rv, NULL);
@@ -854,7 +868,7 @@ vm_reserv_init(void)
 {
    vm_paddr_t paddr;
    struct vm_phys_seg *seg;
-    int segind;
+    int i, segind;
 
    /*
     * Initialize the reservation array.  Specifically, initialize the
@@ -869,6 +883,8 @@ vm_reserv_init(void)
            paddr += VM_LEVEL_0_SIZE;
        }
    }
+    for (i = 0; i < MAXMEMDOM; i++)
+        TAILQ_INIT(&vm_rvq_partpop[i]);
 }
 
 /*
@@ -926,7 +942,10 @@ vm_reserv_reclaim(vm_reserv_t rv)
    mtx_assert(&vm_page_queue_free_mtx, MA_OWNED);
    KASSERT(rv->inpartpopq,
        ("vm_reserv_reclaim: reserv %p's inpartpopq is FALSE", rv));
-    TAILQ_REMOVE(&vm_rvq_partpop, rv, partpopq);
+    KASSERT(rv->domain >= 0 && rv->domain < vm_ndomains,
+        ("vm_reserv_reclaim: reserv %p's domain is corrupted %d",
+        rv, rv->domain));
+    TAILQ_REMOVE(&vm_rvq_partpop[rv->domain], rv, partpopq);
    rv->inpartpopq = FALSE;
    vm_reserv_break(rv, NULL);
    vm_reserv_reclaimed++;
@@ -940,12 +959,12 @@ vm_reserv_reclaim(vm_reserv_t rv)
 * The free page queue lock must be held.
 */
 boolean_t
-vm_reserv_reclaim_inactive(void)
+vm_reserv_reclaim_inactive(int domain)
 {
    vm_reserv_t rv;
 
    mtx_assert(&vm_page_queue_free_mtx, MA_OWNED);
-    if ((rv = TAILQ_FIRST(&vm_rvq_partpop)) != NULL) {
+    if ((rv = TAILQ_FIRST(&vm_rvq_partpop[domain])) != NULL) {
        vm_reserv_reclaim(rv);
        return (TRUE);
    }
@@ -961,8 +980,8 @@ vm_reserv_reclaim_inactive(void)
 * The free page queue lock must be held.
 */
 boolean_t
-vm_reserv_reclaim_contig(u_long npages, vm_paddr_t low, vm_paddr_t high,
-    u_long alignment, vm_paddr_t boundary)
+vm_reserv_reclaim_contig(int domain, u_long npages, vm_paddr_t low,
+    vm_paddr_t high, u_long alignment, vm_paddr_t boundary)
 {
    vm_paddr_t pa, size;

*** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
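Each allocator entry point touched by r325753 above follows the same shape: the existing function becomes a thin wrapper that strips the sleep flags from the request, walks the domains in policy order, and re-arms the sleep flags only for the final domain, so at most one attempt can block. Here is a minimal, compilable user-space sketch of that control flow; the iterator below is a hypothetical stand-in for the kernel's vm_domain_iterator_run()/vm_domain_iterator_isdone() pair, and the flag bits are illustrative values, not the kernel's.

#include <stdio.h>
#include <stddef.h>

/* Illustrative stand-ins; not the kernel's definitions. */
#define VM_ALLOC_WAITOK		0x01
#define VM_ALLOC_WAITFAIL	0x02

static int vm_ndomains = 2;

struct vm_domain_iterator {
	int cursor;	/* next domain to hand out */
	int remain;	/* domains left to visit */
};

static void
iterator_init(struct vm_domain_iterator *vi)
{
	vi->cursor = 0;
	vi->remain = vm_ndomains;
}

/* Returns 0 and a domain while domains remain, nonzero when exhausted. */
static int
iterator_run(struct vm_domain_iterator *vi, int *domain)
{
	if (vi->remain == 0)
		return (1);
	*domain = vi->cursor++ % vm_ndomains;
	vi->remain--;
	return (0);
}

static int
iterator_isdone(struct vm_domain_iterator *vi)
{
	return (vi->remain == 0);
}

/* Fake per-domain allocator: domain 0 is "empty", domain 1 succeeds. */
static void *
alloc_domain(int domain, int req)
{
	(void)req;
	return (domain == 0 ? NULL : (void *)&vm_ndomains);
}

/* The wrapper shape used by vm_page_alloc_after() and friends above. */
static void *
alloc(int req)
{
	struct vm_domain_iterator vi;
	void *m = NULL;
	int domain, wait;

	iterator_init(&vi);
	wait = req & (VM_ALLOC_WAITFAIL | VM_ALLOC_WAITOK);
	req &= ~wait;
	while (iterator_run(&vi, &domain) == 0) {
		if (iterator_isdone(&vi))
			req |= wait;	/* last domain: sleeping allowed */
		m = alloc_domain(domain, req);
		if (m != NULL)
			break;
	}
	return (m);
}

int
main(void)
{
	printf("allocated: %p\n", alloc(VM_ALLOC_WAITOK));
	return (0);
}

Deferring the wait flags this way keeps cross-domain fallback cheap: the early domains fail fast while another domain may still have free pages, and only the final attempt is permitted to sleep or to apply the WAITFAIL semantics the caller asked for.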
From owner-svn-src-user@freebsd.org Mon Nov 13 03:41:52 2017
From: Jeff Roberson <jeff@FreeBSD.org>
Date: Mon, 13 Nov 2017 03:41:50 +0000 (UTC)
Subject: svn commit: r325754 - in user/jeff/numa/sys: kern sys vm

Author: jeff
Date: Mon Nov 13 03:41:50 2017
New Revision: 325754
URL: https://svnweb.freebsd.org/changeset/base/325754

Log:
  Eliminate kmem_arena to simplify the kmem_ api for forthcoming NUMA support

Modified:
  user/jeff/numa/sys/kern/kern_malloc.c
  user/jeff/numa/sys/kern/subr_vmem.c
  user/jeff/numa/sys/sys/vmem.h
  user/jeff/numa/sys/vm/memguard.c
  user/jeff/numa/sys/vm/uma.h
  user/jeff/numa/sys/vm/uma_core.c
  user/jeff/numa/sys/vm/vm_kern.c
  user/jeff/numa/sys/vm/vm_map.c
  user/jeff/numa/sys/vm/vm_object.c
  user/jeff/numa/sys/vm/vm_object.h

Modified: user/jeff/numa/sys/kern/kern_malloc.c
==============================================================================
--- user/jeff/numa/sys/kern/kern_malloc.c   Mon Nov 13 03:34:55 2017 (r325753)
+++ user/jeff/numa/sys/kern/kern_malloc.c   Mon Nov 13 03:41:50 2017 (r325754)
@@ -237,7 +237,7 @@ sysctl_kmem_map_size(SYSCTL_HANDLER_ARGS)
 {
    u_long size;
 
-    size = vmem_size(kmem_arena, VMEM_ALLOC);
+    size = vmem_size(kernel_arena, VMEM_ALLOC);
    return (sysctl_handle_long(oidp, &size, 0, req));
 }
 
@@ -246,7 +246,7 @@ sysctl_kmem_map_free(SYSCTL_HANDLER_ARGS)
 {
    u_long size;
 
-    size = vmem_size(kmem_arena, VMEM_FREE);
+    size = vmem_size(kernel_arena, VMEM_FREE);
    return (sysctl_handle_long(oidp, &size, 0, req));
 }
 
@@ -757,9 +757,8 @@ kmeminit(void)
 #else
    tmp = vm_kmem_size;
 #endif
-    vmem_init(kmem_arena, "kmem arena", kva_alloc(tmp), tmp, PAGE_SIZE,
-        0, 0);
-    vmem_set_reclaim(kmem_arena, kmem_reclaim);
+    vmem_set_limit(kernel_arena, tmp);
+    vmem_set_reclaim(kernel_arena, kmem_reclaim);
 
 #ifdef DEBUG_MEMGUARD
    /*
@@ -767,7 +766,7 @@ kmeminit(void)
     * replacement allocator used for detecting tamper-after-free
     * scenarios as they occur.  It is only used for debugging.
     */
-    memguard_init(kmem_arena);
+    memguard_init(kernel_arena);
 #endif
 }

Modified: user/jeff/numa/sys/kern/subr_vmem.c
==============================================================================
--- user/jeff/numa/sys/kern/subr_vmem.c Mon Nov 13 03:34:55 2017 (r325753)
+++ user/jeff/numa/sys/kern/subr_vmem.c Mon Nov 13 03:41:50 2017 (r325754)
@@ -135,6 +135,7 @@ struct vmem {
    int         vm_nbusytag;
    vmem_size_t     vm_inuse;
    vmem_size_t     vm_size;
+    vmem_size_t     vm_limit;
 
    /* Used on import. */
    vmem_import_t       *vm_importfn;
@@ -226,11 +227,11 @@ static uma_zone_t vmem_bt_zone;
 
 /* boot time arena storage. */
 static struct vmem kernel_arena_storage;
-static struct vmem kmem_arena_storage;
 static struct vmem buffer_arena_storage;
 static struct vmem transient_arena_storage;
+/* kernel and kmem arenas are aliased for backwards KPI compat. */
 vmem_t *kernel_arena = &kernel_arena_storage;
-vmem_t *kmem_arena = &kmem_arena_storage;
+vmem_t *kmem_arena = &kernel_arena_storage;
 vmem_t *buffer_arena = &buffer_arena_storage;
 vmem_t *transient_arena = &transient_arena_storage;
 
@@ -252,11 +253,11 @@ bt_fill(vmem_t *vm, int flags)
    VMEM_ASSERT_LOCKED(vm);
 
    /*
-     * Only allow the kmem arena to dip into reserve tags. It is the
+     * Only allow the kernel arena to dip into reserve tags. It is the
     * vmem where new tags come from.
     */
    flags &= BT_FLAGS;
-    if (vm != kmem_arena)
+    if (vm != kernel_arena)
        flags &= ~M_USE_RESERVE;
 
    /*
@@ -613,22 +614,22 @@ vmem_bt_alloc(uma_zone_t zone, vm_size_t bytes, uint8_
 {
    vmem_addr_t addr;
 
-    *pflag = UMA_SLAB_KMEM;
+    *pflag = UMA_SLAB_KERNEL;
 
    /*
     * Single thread boundary tag allocation so that the address space
     * and memory are added in one atomic operation.
     */
    mtx_lock(&vmem_bt_lock);
-    if (vmem_xalloc(kmem_arena, bytes, 0, 0, 0, VMEM_ADDR_MIN,
+    if (vmem_xalloc(kernel_arena, bytes, 0, 0, 0, VMEM_ADDR_MIN,
        VMEM_ADDR_MAX, M_NOWAIT | M_NOVM | M_USE_RESERVE | M_BESTFIT,
        &addr) == 0) {
-        if (kmem_back(kmem_object, addr, bytes,
+        if (kmem_back(kernel_object, addr, bytes,
            M_NOWAIT | M_USE_RESERVE) == 0) {
            mtx_unlock(&vmem_bt_lock);
            return ((void *)addr);
        }
-        vmem_xfree(kmem_arena, addr, bytes);
+        vmem_xfree(kernel_arena, addr, bytes);
        mtx_unlock(&vmem_bt_lock);
        /*
         * Out of memory, not address space.  This may not even be
@@ -832,6 +833,9 @@ vmem_import(vmem_t *vm, vmem_size_t size, vmem_size_t
    vmem_addr_t addr;
    int error;
 
+    if (vm->vm_limit != 0 && vm->vm_limit < vm->vm_size + size)
+        return ENOMEM;
+
    if (vm->vm_importfn == NULL)
        return EINVAL;
 
@@ -976,6 +980,15 @@ vmem_set_import(vmem_t *vm, vmem_import_t *importfn,
 }
 
 void
+vmem_set_limit(vmem_t *vm, vmem_size_t limit)
+{
+
+    VMEM_LOCK(vm);
+    vm->vm_limit = limit;
+    VMEM_UNLOCK(vm);
+}
+
+void
 vmem_set_reclaim(vmem_t *vm, vmem_reclaim_t *reclaimfn)
 {
 
@@ -1007,6 +1020,7 @@ vmem_init(vmem_t *vm, const char *name, vmem_addr_t ba
    vm->vm_quantum_shift = flsl(quantum) - 1;
    vm->vm_nbusytag = 0;
    vm->vm_size = 0;
+    vm->vm_limit = 0;
    vm->vm_inuse = 0;
    qc_init(vm, qcache_max);

Modified: user/jeff/numa/sys/sys/vmem.h
==============================================================================
--- user/jeff/numa/sys/sys/vmem.h   Mon Nov 13 03:34:55 2017 (r325753)
+++ user/jeff/numa/sys/sys/vmem.h   Mon Nov 13 03:41:50 2017 (r325754)
@@ -74,6 +74,12 @@ void vmem_set_import(vmem_t *vm, vmem_import_t *import
     vmem_release_t *releasefn, void *arg, vmem_size_t import_quantum);
 
 /*
+ * Set a limit on the total size of a vmem.
+ */
+
+void vmem_set_limit(vmem_t *vm, vmem_size_t limit);
+
+/*
 * Set a callback for reclaiming memory when space is exhausted:
 */
 void vmem_set_reclaim(vmem_t *vm, vmem_reclaim_t *reclaimfn);

Modified: user/jeff/numa/sys/vm/memguard.c
==============================================================================
--- user/jeff/numa/sys/vm/memguard.c    Mon Nov 13 03:34:55 2017 (r325753)
+++ user/jeff/numa/sys/vm/memguard.c    Mon Nov 13 03:41:50 2017 (r325754)
@@ -64,7 +64,7 @@ __FBSDID("$FreeBSD$");
 static SYSCTL_NODE(_vm, OID_AUTO, memguard, CTLFLAG_RW, NULL, "MemGuard data");
 
 /*
- * The vm_memguard_divisor variable controls how much of kmem_map should be
+ * The vm_memguard_divisor variable controls how much of kernel_arena should be
 * reserved for MemGuard.
 */
 static u_int vm_memguard_divisor;
@@ -155,7 +155,7 @@ SYSCTL_ULONG(_vm_memguard, OID_AUTO, frequency_hits, C
 
 /*
 * Return a fudged value to be used for vm_kmem_size for allocating
- * the kmem_map.  The memguard memory will be a submap.
+ * the kernel_arena.  The memguard memory will be a submap.
 */
 unsigned long
 memguard_fudge(unsigned long km_size, const struct vm_map *parent_map)
@@ -346,7 +346,7 @@ memguard_alloc(unsigned long req_size, int flags)
    addr = origaddr;
    if (do_guard)
        addr += PAGE_SIZE;
-    rv = kmem_back(kmem_object, addr, size_p, flags);
+    rv = kmem_back(kernel_object, addr, size_p, flags);
    if (rv != KERN_SUCCESS) {
        vmem_xfree(memguard_arena, origaddr, size_v);
        memguard_fail_pgs++;
@@ -416,7 +416,7 @@ memguard_free(void *ptr)
     * vm_map lock to serialize updates to memguard_wasted, since
     * we had the lock at increment.
     */
-    kmem_unback(kmem_object, addr, size);
+    kmem_unback(kernel_object, addr, size);
    if (sizev > size)
        addr -= PAGE_SIZE;
    vmem_xfree(memguard_arena, addr, sizev);

Modified: user/jeff/numa/sys/vm/uma.h
==============================================================================
--- user/jeff/numa/sys/vm/uma.h Mon Nov 13 03:34:55 2017 (r325753)
+++ user/jeff/numa/sys/vm/uma.h Mon Nov 13 03:41:50 2017 (r325754)
@@ -607,12 +607,11 @@ void uma_zone_set_freef(uma_zone_t zone, uma_free free
 * These flags are setable in the allocf and visible in the freef.
 */
 #define UMA_SLAB_BOOT  0x01        /* Slab alloced from boot pages */
-#define UMA_SLAB_KMEM  0x02        /* Slab alloced from kmem_map */
 #define UMA_SLAB_KERNEL    0x04        /* Slab alloced from kernel_map */
 #define UMA_SLAB_PRIV  0x08        /* Slab alloced from priv allocator */
 #define UMA_SLAB_OFFP  0x10        /* Slab is managed separately  */
 #define UMA_SLAB_MALLOC    0x20        /* Slab is a large malloc slab */
-/* 0x40 and 0x80 are available */
+/* 0x02, 0x40 and 0x80 are available */
 
 /*
 * Used to pre-fill a zone with some number of items

Modified: user/jeff/numa/sys/vm/uma_core.c
==============================================================================
--- user/jeff/numa/sys/vm/uma_core.c    Mon Nov 13 03:34:55 2017 (r325753)
+++ user/jeff/numa/sys/vm/uma_core.c    Mon Nov 13 03:41:50 2017 (r325754)
@@ -1077,8 +1077,8 @@ page_alloc(uma_zone_t zone, vm_size_t bytes, uint8_t *
 {
    void *p;    /* Returned page */
 
-    *pflag = UMA_SLAB_KMEM;
-    p = (void *) kmem_malloc(kmem_arena, bytes, wait);
+    *pflag = UMA_SLAB_KERNEL;
+    p = (void *) kmem_malloc(kernel_arena, bytes, wait);
 
    return (p);
 }
@@ -1159,9 +1159,7 @@ page_free(void *mem, vm_size_t size, uint8_t flags)
 {
    struct vmem *vmem;
 
-    if (flags & UMA_SLAB_KMEM)
-        vmem = kmem_arena;
-    else if (flags & UMA_SLAB_KERNEL)
+    if (flags & UMA_SLAB_KERNEL)
        vmem = kernel_arena;
    else
        panic("UMA: page_free used with invalid flags %x", flags);

Modified: user/jeff/numa/sys/vm/vm_kern.c
==============================================================================
--- user/jeff/numa/sys/vm/vm_kern.c Mon Nov 13 03:34:55 2017 (r325753)
+++ user/jeff/numa/sys/vm/vm_kern.c Mon Nov 13 03:41:50 2017 (r325754)
@@ -162,11 +162,13 @@ vm_offset_t
 kmem_alloc_attr(vmem_t *vmem, vm_size_t size, int flags, vm_paddr_t low,
     vm_paddr_t high, vm_memattr_t memattr)
 {
-    vm_object_t object = vmem == kmem_arena ? kmem_object : kernel_object;
+    vm_object_t object = kernel_object;
    vm_offset_t addr, i, offset;
    vm_page_t m;
    int pflags, tries;
 
+    KASSERT(vmem == kernel_arena,
+        ("kmem_alloc_attr: Only kernel_arena is supported."));
    size = round_page(size);
    if (vmem_alloc(vmem, size, M_BESTFIT | flags, &addr))
        return (0);
@@ -218,12 +220,14 @@ kmem_alloc_contig(struct vmem *vmem, vm_size_t size, i
     vm_paddr_t high, u_long alignment, vm_paddr_t boundary,
     vm_memattr_t memattr)
 {
-    vm_object_t object = vmem == kmem_arena ?  kmem_object : kernel_object;
+    vm_object_t object = kernel_object;
    vm_offset_t addr, offset, tmp;
    vm_page_t end_m, m;
    u_long npages;
    int pflags, tries;
 
+    KASSERT(vmem == kernel_arena,
+        ("kmem_alloc_contig: Only kernel_arena is supported."));
    size = round_page(size);
    if (vmem_alloc(vmem, size, flags | M_BESTFIT, &addr))
        return (0);
@@ -312,12 +316,13 @@ kmem_malloc(struct vmem *vmem, vm_size_t size, int fla
    vm_offset_t addr;
    int rv;
 
+    KASSERT(vmem == kernel_arena,
+        ("kmem_malloc: Only kernel_arena is supported."));
    size = round_page(size);
    if (vmem_alloc(vmem, size, flags | M_BESTFIT, &addr))
        return (0);
 
-    rv = kmem_back((vmem == kmem_arena) ? kmem_object : kernel_object,
-        addr, size, flags);
+    rv = kmem_back(kernel_object, addr, size, flags);
    if (rv != KERN_SUCCESS) {
        vmem_free(vmem, addr, size);
        return (0);
@@ -337,8 +342,8 @@ kmem_back(vm_object_t object, vm_offset_t addr, vm_siz
    vm_page_t m, mpred;
    int pflags;
 
-    KASSERT(object == kmem_object || object == kernel_object,
-        ("kmem_back: only supports kernel objects."));
+    KASSERT(object == kernel_object,
+        ("kmem_back: only supports kernel object."));
 
    offset = addr - VM_MIN_KERNEL_ADDRESS;
    pflags = malloc2vm_flags(flags) | VM_ALLOC_NOBUSY | VM_ALLOC_WIRED;
@@ -394,8 +399,8 @@ kmem_unback(vm_object_t object, vm_offset_t addr, vm_s
    vm_page_t m, next;
    vm_offset_t end, offset;
 
-    KASSERT(object == kmem_object || object == kernel_object,
-        ("kmem_unback: only supports kernel objects."));
+    KASSERT(object == kernel_object,
+        ("kmem_unback: only supports kernel object."));
 
    pmap_remove(kernel_pmap, addr, addr + size);
    offset = addr - VM_MIN_KERNEL_ADDRESS;
@@ -420,9 +425,10 @@ void
 kmem_free(struct vmem *vmem, vm_offset_t addr, vm_size_t size)
 {
 
+    KASSERT(vmem == kernel_arena,
+        ("kmem_free: Only kernel_arena is supported."));
    size = round_page(size);
-    kmem_unback((vmem == kmem_arena) ?  kmem_object : kernel_object,
-        addr, size);
+    kmem_unback(kernel_object, addr, size);
    vmem_free(vmem, addr, size);
 }

Modified: user/jeff/numa/sys/vm/vm_map.c
==============================================================================
--- user/jeff/numa/sys/vm/vm_map.c  Mon Nov 13 03:34:55 2017 (r325753)
+++ user/jeff/numa/sys/vm/vm_map.c  Mon Nov 13 03:41:50 2017 (r325754)
@@ -1187,9 +1187,9 @@ vm_map_insert(vm_map_t map, vm_object_t object, vm_oof
    vm_inherit_t inheritance;
 
    VM_MAP_ASSERT_LOCKED(map);
-    KASSERT((object != kmem_object && object != kernel_object) ||
+    KASSERT(object != kernel_object ||
        (cow & MAP_COPY_ON_WRITE) == 0,
-        ("vm_map_insert: kmem or kernel object and COW"));
+        ("vm_map_insert: kernel object and COW"));
    KASSERT(object == NULL || (cow & MAP_NOFAULT) == 0,
        ("vm_map_insert: paradoxical MAP_NOFAULT request"));
    KASSERT((prot & ~max) == 0,
@@ -2988,7 +2988,7 @@ vm_map_entry_delete(vm_map_t map, vm_map_entry_t entry
        VM_OBJECT_WLOCK(object);
        if (object->ref_count != 1 && ((object->flags & (OBJ_NOSPLIT |
            OBJ_ONEMAPPING)) == OBJ_ONEMAPPING ||
-            object == kernel_object || object == kmem_object)) {
+            object == kernel_object)) {
            vm_object_collapse(object);
 
            /*

Modified: user/jeff/numa/sys/vm/vm_object.c
==============================================================================
--- user/jeff/numa/sys/vm/vm_object.c   Mon Nov 13 03:34:55 2017 (r325753)
+++ user/jeff/numa/sys/vm/vm_object.c   Mon Nov 13 03:41:50 2017 (r325754)
@@ -142,7 +142,6 @@ struct object_q vm_object_list;
 struct mtx vm_object_list_mtx;  /* lock for object list and count */
 
 struct vm_object kernel_object_store;
-struct vm_object kmem_object_store;
 
 static SYSCTL_NODE(_vm_stats, OID_AUTO, object, CTLFLAG_RD, 0,
     "VM object stats");
@@ -290,14 +289,6 @@ vm_object_init(void)
 #if VM_NRESERVLEVEL > 0
    kernel_object->flags |= OBJ_COLORED;
    kernel_object->pg_color = (u_short)atop(VM_MIN_KERNEL_ADDRESS);
-#endif
-
-    rw_init(&kmem_object->lock, "kmem vm object");
-    _vm_object_allocate(OBJT_PHYS, atop(VM_MAX_KERNEL_ADDRESS -
-        VM_MIN_KERNEL_ADDRESS), kmem_object);
-#if VM_NRESERVLEVEL > 0
-    kmem_object->flags |= OBJ_COLORED;
-    kmem_object->pg_color = (u_short)atop(VM_MIN_KERNEL_ADDRESS);
 #endif
 
    /*

Modified: user/jeff/numa/sys/vm/vm_object.h
==============================================================================
--- user/jeff/numa/sys/vm/vm_object.h   Mon Nov 13 03:34:55 2017 (r325753)
+++ user/jeff/numa/sys/vm/vm_object.h   Mon Nov 13 03:41:50 2017 (r325754)
@@ -225,10 +225,10 @@ extern struct object_q vm_object_list;  /* list of allo
 extern struct mtx vm_object_list_mtx;   /* lock for object list and count */
 
 extern struct vm_object kernel_object_store;
-extern struct vm_object kmem_object_store;
+/* kernel and kmem are aliased for backwards KPI compat. */
 #define kernel_object   (&kernel_object_store)
-#define kmem_object (&kmem_object_store)
+#define kmem_object (&kernel_object_store)
 
 #define VM_OBJECT_ASSERT_LOCKED(object)                 \
    rw_assert(&(object)->lock, RA_LOCKED)
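The two load-bearing hunks of r325754 are the subr_vmem.c and vm_object.h changes just shown: kmem_arena and kmem_object lose their separate storage and become aliases of the kernel arena and kernel object, and the old KVA carve-out is replaced by a size cap installed with vmem_set_limit(). The compilable sketch below condenses struct vmem to only the two fields this commit cares about; everything else here is a hypothetical stand-in for illustration, not the real vmem API.

#include <assert.h>
#include <stdio.h>

typedef unsigned long vmem_size_t;

/* Reduced stand-in for struct vmem with the field r325754 adds. */
typedef struct vmem {
	vmem_size_t vm_size;	/* address space imported so far */
	vmem_size_t vm_limit;	/* 0 = unlimited; set via vmem_set_limit() */
} vmem_t;

static vmem_t kernel_arena_storage;

/* The aliasing trick from subr_vmem.c: two names, one arena. */
vmem_t *kernel_arena = &kernel_arena_storage;
vmem_t *kmem_arena = &kernel_arena_storage;	/* backwards KPI compat */

/* Mirrors the limit check this commit adds to vmem_import(). */
static int
vmem_would_exceed(vmem_t *vm, vmem_size_t size)
{
	return (vm->vm_limit != 0 && vm->vm_limit < vm->vm_size + size);
}

int
main(void)
{
	kernel_arena->vm_limit = 1024;
	kmem_arena->vm_size = 1000;	/* old name, same storage */
	assert(vmem_would_exceed(kernel_arena, 64));
	printf("aliased arenas share accounting: %d\n",
	    kernel_arena->vm_size == kmem_arena->vm_size);
	return (0);
}

Because both pointers name one arena, legacy callers that pass kmem_arena or kmem_object keep compiling unchanged, while the limit preserves the old vm_kmem_size cap without a second arena fragmenting kernel KVA.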
  Part of r325754

Modified:
  user/jeff/numa/sys/kern/kern_malloc.c
  user/jeff/numa/sys/kern/subr_vmem.c
  user/jeff/numa/sys/vm/uma_core.c
  user/jeff/numa/sys/vm/uma_int.h

Modified: user/jeff/numa/sys/kern/kern_malloc.c
==============================================================================
--- user/jeff/numa/sys/kern/kern_malloc.c	Mon Nov 13 23:21:17 2017	(r325783)
+++ user/jeff/numa/sys/kern/kern_malloc.c	Mon Nov 13 23:33:07 2017	(r325784)
@@ -237,16 +237,22 @@
 sysctl_kmem_map_size(SYSCTL_HANDLER_ARGS)
 {
 	u_long size;

-	size = vmem_size(kernel_arena, VMEM_ALLOC);
+	size = uma_size();
 	return (sysctl_handle_long(oidp, &size, 0, req));
 }

 static int
 sysctl_kmem_map_free(SYSCTL_HANDLER_ARGS)
 {
-	u_long size;
+	u_long size, limit;

-	size = vmem_size(kernel_arena, VMEM_FREE);
+	/* The sysctl is unsigned, implement as a saturation value. */
+	size = uma_size();
+	limit = uma_limit();
+	if (size > limit)
+		size = 0;
+	else
+		size = limit - size;
 	return (sysctl_handle_long(oidp, &size, 0, req));
 }
@@ -667,19 +673,6 @@ reallocf(void *addr, unsigned long size, struct malloc
 	return (mem);
 }

-/*
- * Wake the uma reclamation pagedaemon thread when we exhaust KVA.  It
- * will call the lowmem handler and uma_reclaim() callbacks in a
- * context that is safe.
- */
-static void
-kmem_reclaim(vmem_t *vm, int flags)
-{
-
-	uma_reclaim_wakeup();
-	pagedaemon_wakeup();
-}
-
 #ifndef __sparc64__
 CTASSERT(VM_KMEM_SIZE_SCALE >= 1);
 #endif
@@ -757,8 +750,7 @@ kmeminit(void)
 #else
 	tmp = vm_kmem_size;
 #endif
-	vmem_set_limit(kernel_arena, tmp);
-	vmem_set_reclaim(kernel_arena, kmem_reclaim);
+	uma_set_limit(tmp);

 #ifdef DEBUG_MEMGUARD
 	/*

Modified: user/jeff/numa/sys/kern/subr_vmem.c
==============================================================================
--- user/jeff/numa/sys/kern/subr_vmem.c	Mon Nov 13 23:21:17 2017	(r325783)
+++ user/jeff/numa/sys/kern/subr_vmem.c	Mon Nov 13 23:33:07 2017	(r325784)
@@ -833,9 +833,6 @@ vmem_import(vmem_t *vm, vmem_size_t size, vmem_size_t
 	vmem_addr_t addr;
 	int error;

-	if (vm->vm_limit != 0 && vm->vm_limit < vm->vm_size + size)
-		return ENOMEM;
-
 	if (vm->vm_importfn == NULL)
 		return EINVAL;

@@ -846,6 +843,9 @@ vmem_import(vmem_t *vm, vmem_size_t size, vmem_size_t
 	if (align != vm->vm_quantum_mask + 1)
 		size = (align * 2) + size;
 	size = roundup(size, vm->vm_import_quantum);
+
+	if (vm->vm_limit != 0 && vm->vm_limit < vm->vm_size + size)
+		return ENOMEM;

 	/*
 	 * Hide MAXALLOC tags so we're guaranteed to be able to add this
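Two details in the hunks above are easy to miss. The free-space sysctl is unsigned, so once "free" is computed as limit minus use it must saturate at zero rather than wrap. And vmem_import() only knows the true cost of an import after the request has been padded for alignment and rounded up to vm_import_quantum, so the limit check moves below the roundup; checking first could let the arena overshoot its limit by up to a quantum plus padding. A standalone sketch of both calculations; the helper names are invented here and only mirror the diff.

#include <stdio.h>

/* Saturating free-space report, as in sysctl_kmem_map_free(). */
static unsigned long
free_saturated(unsigned long used, unsigned long limit)
{
	return (used > limit ? 0 : limit - used);
}

/*
 * Limit check after padding and roundup, as in vmem_import().  A
 * nonzero limit refuses the import when the rounded-up size would
 * push the arena past it.
 */
static int
import_exceeds_limit(unsigned long cur_size, unsigned long size,
    unsigned long align_pad, unsigned long quantum, unsigned long limit)
{
	size += align_pad;
	size = ((size + quantum - 1) / quantum) * quantum;	/* roundup() */
	return (limit != 0 && limit < cur_size + size);
}

int
main(void)
{
	printf("%lu\n", free_saturated(3, 8));		/* 5 */
	printf("%lu\n", free_saturated(10, 8));		/* 0, not a wrap */
	/* 6 + pad 2 rounds to 8; 4 + 8 = 12 exceeds the limit of 10. */
	printf("%d\n", import_exceeds_limit(4, 6, 2, 4, 10));	/* 1 */
	return (0);
}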
Modified: user/jeff/numa/sys/vm/uma_core.c
==============================================================================
--- user/jeff/numa/sys/vm/uma_core.c	Mon Nov 13 23:21:17 2017	(r325783)
+++ user/jeff/numa/sys/vm/uma_core.c	Mon Nov 13 23:33:07 2017	(r325784)
@@ -145,6 +145,10 @@
 static struct mtx uma_boot_pages_mtx;
 static struct sx uma_drain_lock;

+/* kmem soft limit. */
+static unsigned long uma_kmem_limit;
+static volatile unsigned long uma_kmem_total;
+
 /* Is the VM done starting up? */
 static int booted = 0;
 #define	UMA_STARTUP	1
@@ -283,6 +287,22 @@ static int zone_warnings = 1;
 SYSCTL_INT(_vm, OID_AUTO, zone_warnings, CTLFLAG_RWTUN, &zone_warnings, 0,
     "Warn when UMA zones becomes full");

+/* Adjust bytes under management by UMA. */
+static inline void
+uma_total_dec(unsigned long size)
+{
+
+	atomic_subtract_long(&uma_kmem_total, size);
+}
+
+static inline void
+uma_total_inc(unsigned long size)
+{
+
+	if (atomic_fetchadd_long(&uma_kmem_total, size) > uma_kmem_limit)
+		uma_reclaim_wakeup();
+}
+
 /*
  * This routine checks to see whether or not it's safe to enable buckets.
  */
@@ -829,6 +849,7 @@ keg_free_slab(uma_keg_t keg, uma_slab_t slab, int star
 	if (keg->uk_flags & UMA_ZONE_OFFPAGE)
 		zone_free_item(keg->uk_slabzone, slab, NULL, SKIP_NONE);
 	keg->uk_freef(mem, PAGE_SIZE * keg->uk_ppera, flags);
+	uma_total_dec(PAGE_SIZE * keg->uk_ppera);
 }

 /*
@@ -933,6 +954,7 @@ keg_alloc_slab(uma_keg_t keg, uma_zone_t zone, int wai
 {
 	uma_alloc allocf;
 	uma_slab_t slab;
+	unsigned long size;
 	uint8_t *mem;
 	uint8_t flags;
 	int i;
@@ -943,6 +965,7 @@ keg_alloc_slab(uma_keg_t keg, uma_zone_t zone, int wai
 	allocf = keg->uk_allocf;
 	KEG_UNLOCK(keg);

+	size = keg->uk_ppera * PAGE_SIZE;
 	if (keg->uk_flags & UMA_ZONE_OFFPAGE) {
 		slab = zone_alloc_item(keg->uk_slabzone, NULL, wait);
@@ -966,13 +989,14 @@ keg_alloc_slab(uma_keg_t keg, uma_zone_t zone, int wai
 		wait |= M_NODUMP;

 	/* zone is passed for legacy reasons. */
-	mem = allocf(zone, keg->uk_ppera * PAGE_SIZE, &flags, wait);
+	mem = allocf(zone, size, &flags, wait);
 	if (mem == NULL) {
 		if (keg->uk_flags & UMA_ZONE_OFFPAGE)
 			zone_free_item(keg->uk_slabzone, slab, NULL,
 			    SKIP_NONE);
 		slab = NULL;
 		goto out;
 	}
+	uma_total_inc(size);

 	/* Point the slab into the allocated memory */
 	if (!(keg->uk_flags & UMA_ZONE_OFFPAGE))
@@ -3128,14 +3152,14 @@ uma_reclaim(void)
 	sx_xunlock(&uma_drain_lock);
 }

-static int uma_reclaim_needed;
+static volatile int uma_reclaim_needed;

 void
 uma_reclaim_wakeup(void)
 {

-	uma_reclaim_needed = 1;
-	wakeup(&uma_reclaim_needed);
+	if (atomic_fetchadd_int(&uma_reclaim_needed, 1) == 0)
+		wakeup(uma_reclaim);
 }

 void
@@ -3144,14 +3168,13 @@ uma_reclaim_worker(void *arg __unused)

 	sx_xlock(&uma_drain_lock);
 	for (;;) {
-		sx_sleep(&uma_reclaim_needed, &uma_drain_lock, PVM,
-		    "umarcl", 0);
+		sx_sleep(uma_reclaim, &uma_drain_lock, PVM, "umarcl", 0);
 		if (uma_reclaim_needed) {
-			uma_reclaim_needed = 0;
 			sx_xunlock(&uma_drain_lock);
 			EVENTHANDLER_INVOKE(vm_lowmem, VM_LOW_KMEM);
 			sx_xlock(&uma_drain_lock);
 			uma_reclaim_locked(true);
+			atomic_set_int(&uma_reclaim_needed, 0);
 		}
 	}
 }
@@ -3215,6 +3238,27 @@ uma_zero_item(void *item, uma_zone_t zone)
 			bzero(zpcpu_get_cpu(item, i), zone->uz_size);
 	} else
 		bzero(item, zone->uz_size);
+}
+
+unsigned long
+uma_limit(void)
+{
+
+	return uma_kmem_limit;
+}
+
+void
+uma_set_limit(unsigned long limit)
+{
+
+	uma_kmem_limit = limit;
+}
+
+unsigned long
+uma_size(void)
+{
+
+	return uma_kmem_total;
 }

 void

Modified: user/jeff/numa/sys/vm/uma_int.h
==============================================================================
--- user/jeff/numa/sys/vm/uma_int.h	Mon Nov 13 23:21:17 2017	(r325783)
+++ user/jeff/numa/sys/vm/uma_int.h	Mon Nov 13 23:33:07 2017	(r325784)
@@ -423,6 +423,13 @@ vsetslab(vm_offset_t va, uma_slab_t slab)
 void *uma_small_alloc(uma_zone_t zone, vm_size_t bytes, uint8_t *pflag,
     int wait);
 void uma_small_free(void *mem, vm_size_t size, uint8_t flags);
+
+/* Set a global soft limit on UMA managed memory. */
+void uma_set_limit(unsigned long limit);
+unsigned long uma_limit(void);
+
+/* Return the amount of memory managed by UMA. */
+unsigned long uma_size(void);

 #endif /* _KERNEL */
 #endif /* VM_UMA_INT_H */
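The uma_core.c changes above replace the vmem-level limit with per-slab accounting: every slab allocation and free adjusts a global byte count with lock-free atomics, crossing the soft limit wakes the reclaim worker instead of failing the allocation, and the counter-based wakeup coalesces a burst of limit crossings into a single reclaim pass (the worker clears the flag only after reclaiming, so crossings that land mid-pass are absorbed by it). A userland sketch of the pattern, assuming C11 atomics in place of the kernel's atomic(9) ops and eliding the real wakeup/sleep machinery; all names are illustrative.

#include <stdatomic.h>

static unsigned long soft_limit;	/* cf. uma_kmem_limit */
static _Atomic unsigned long total;	/* cf. uma_kmem_total */
static _Atomic int reclaim_needed;	/* cf. uma_reclaim_needed */

static void
reclaim_wakeup(void)
{
	/*
	 * Only the 0 -> nonzero transition would issue the wakeup, so
	 * repeated crossings cost one atomic add each, not N wakeups.
	 */
	if (atomic_fetch_add(&reclaim_needed, 1) == 0) {
		/* wakeup(uma_reclaim) goes here in the kernel. */
	}
}

static void
total_inc(unsigned long size)
{
	/*
	 * fetch_add returns the old value; the limit test is on the
	 * count before this allocation, mirroring uma_total_inc().
	 */
	if (atomic_fetch_add(&total, size) > soft_limit)
		reclaim_wakeup();
}

static void
total_dec(unsigned long size)
{
	atomic_fetch_sub(&total, size);
}

int
main(void)
{
	soft_limit = 4096;
	total_inc(8192);	/* old total 0 <= limit: no wakeup yet */
	total_inc(64);		/* old total 8192 > limit: wakeup fires */
	total_dec(8256);
	return (0);
}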
From owner-svn-src-user@freebsd.org  Thu Nov 16 10:47:22 2017
Message-Id: <201711161047.vAGAlL5H012854@repo.freebsd.org>
From: Peter Holm <pho@FreeBSD.org>
Date: Thu, 16 Nov 2017 10:47:21 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-user@freebsd.org
Subject: svn commit: r325889 - user/pho/stress2/misc

Author: pho
Date: Thu Nov 16 10:47:21 2017
New Revision: 325889
URL: https://svnweb.freebsd.org/changeset/base/325889

Log:
  Fix misunderstandings.

  Sponsored by:	Dell EMC Isilon

Modified:
  user/pho/stress2/misc/stack_guard_page.sh

Modified: user/pho/stress2/misc/stack_guard_page.sh
==============================================================================
--- user/pho/stress2/misc/stack_guard_page.sh	Thu Nov 16 10:15:17 2017	(r325888)
+++ user/pho/stress2/misc/stack_guard_page.sh	Thu Nov 16 10:47:21 2017	(r325889)
@@ -28,9 +28,8 @@
 # $FreeBSD$
 #

-# Setting a negative guard page size will cause "Abort trap"
-# Reported by Shawn Webb <shawn.webb@hardenedbsd.org>
-# Fixed in r320560.
+# Test with stack_guard_page set between 1 and 512.
+# A negative value is considered invalid.

 [ `sysctl -n security.bsd.stack_guard_page` -eq 0 ] && exit 0

@@ -41,7 +40,7 @@
 trap "sysctl security.bsd.stack_guard_page=$old" EXIT
 start=`date +%s`
 while [ $((`date +%s` - start)) -lt 60 ]; do
-	sysctl security.bsd.stack_guard_page=`jot -r 1 -1 512` > \
+	sysctl security.bsd.stack_guard_page=`jot -r 1 1 512` > \
 	    /dev/null 2>&1
 	sleep 1
 done