From owner-svn-src-stable@freebsd.org  Sat Mar 18 05:48:27 2017
Delivered-To: svn-src-stable@mailman.ysv.freebsd.org
Message-Id: <201703180548.v2I5mQFH055622@repo.freebsd.org>
From: Alan Cox <alc@FreeBSD.org>
Date: Sat, 18 Mar 2017 05:48:26 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org
Subject: svn commit: r315474 - stable/11/sys/vm
X-SVN-Group: stable-11
List-Id: SVN commit messages for all the -stable branches of the src tree
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: alc
Date: Sat Mar 18 05:48:26 2017
New Revision: 315474
URL: https://svnweb.freebsd.org/changeset/base/315474

Log:
  MFC r313186
    Over the years, the code and comments in vm_page_startup() have diverged
    in one respect.  When determining how many page structures to allocate,
    contrary to what the comments say, the code does not account for the
    overhead of a page structure per page of physical memory.  This revision
    changes the code to match the comments.

Modified:
  stable/11/sys/vm/vm_page.c

Directory Properties:
  stable/11/   (props changed)

Modified: stable/11/sys/vm/vm_page.c
==============================================================================
--- stable/11/sys/vm/vm_page.c	Sat Mar 18 05:38:10 2017	(r315473)
+++ stable/11/sys/vm/vm_page.c	Sat Mar 18 05:48:26 2017	(r315474)
@@ -406,17 +406,16 @@ vm_page_domain_init(struct vm_domain *vm
 /*
  * vm_page_startup:
  *
- *	Initializes the resident memory module.
- *
- *	Allocates memory for the page cells, and
- *	for the object/offset-to-page hash table headers.
- *	Each page cell is initialized and placed on the free list.
+ *	Initializes the resident memory module.  Allocates physical memory for
+ *	bootstrapping UMA and some data structures that are used to manage
+ *	physical pages.  Initializes these structures, and populates the free
+ *	page queues.
  */
 vm_offset_t
 vm_page_startup(vm_offset_t vaddr)
 {
 	vm_offset_t mapped;
-	vm_paddr_t page_range;
+	vm_paddr_t high_avail, low_avail, page_range, size;
 	vm_paddr_t new_end;
 	int i;
 	vm_paddr_t pa;
@@ -424,7 +423,6 @@ vm_page_startup(vm_offset_t vaddr)
 	char *list, *listend;
 	vm_paddr_t end;
 	vm_paddr_t biggestsize;
-	vm_paddr_t low_water, high_water;
 	int biggestone;
 	int pages_per_zone;
 
@@ -436,27 +434,12 @@ vm_page_startup(vm_offset_t vaddr)
 		phys_avail[i] = round_page(phys_avail[i]);
 		phys_avail[i + 1] = trunc_page(phys_avail[i + 1]);
 	}
-
-	low_water = phys_avail[0];
-	high_water = phys_avail[1];
-
-	for (i = 0; i < vm_phys_nsegs; i++) {
-		if (vm_phys_segs[i].start < low_water)
-			low_water = vm_phys_segs[i].start;
-		if (vm_phys_segs[i].end > high_water)
-			high_water = vm_phys_segs[i].end;
-	}
 	for (i = 0; phys_avail[i + 1]; i += 2) {
-		vm_paddr_t size = phys_avail[i + 1] - phys_avail[i];
-
+		size = phys_avail[i + 1] - phys_avail[i];
 		if (size > biggestsize) {
 			biggestone = i;
 			biggestsize = size;
 		}
-		if (phys_avail[i] < low_water)
-			low_water = phys_avail[i];
-		if (phys_avail[i + 1] > high_water)
-			high_water = phys_avail[i + 1];
 	}
 
 	end = phys_avail[biggestone+1];
@@ -471,7 +454,7 @@ vm_page_startup(vm_offset_t vaddr)
 		vm_page_domain_init(&vm_dom[i]);
 
 	/*
-	 * Almost all of the pages needed for boot strapping UMA are used
+	 * Almost all of the pages needed for bootstrapping UMA are used
 	 * for zone structures, so if the number of CPUs results in those
 	 * structures taking more than one page each, we set aside more pages
 	 * in proportion to the zone structure size.
@@ -522,6 +505,16 @@ vm_page_startup(vm_offset_t vaddr)
 	    new_end + vm_page_dump_size, VM_PROT_READ | VM_PROT_WRITE);
 	bzero((void *)vm_page_dump, vm_page_dump_size);
 #endif
+#if defined(__aarch64__) || defined(__amd64__) || defined(__mips__)
+	/*
+	 * Include the UMA bootstrap pages and vm_page_dump in a crash dump.
+	 * When pmap_map() uses the direct map, they are not automatically
+	 * included.
+	 */
+	for (pa = new_end; pa < end; pa += PAGE_SIZE)
+		dump_add_page(pa);
+#endif
+	phys_avail[biggestone + 1] = new_end;
 #ifdef __amd64__
 	/*
 	 * Request that the physical pages underlying the message buffer be
@@ -537,20 +530,48 @@ vm_page_startup(vm_offset_t vaddr)
 #endif
 	/*
 	 * Compute the number of pages of memory that will be available for
-	 * use (taking into account the overhead of a page structure per
-	 * page).
+	 * use, taking into account the overhead of a page structure per page.
+	 * In other words, solve
+	 *	"available physical memory" - round_page(page_range *
+	 *	    sizeof(struct vm_page)) = page_range * PAGE_SIZE
+	 * for page_range.
 	 */
-	first_page = low_water / PAGE_SIZE;
-#ifdef VM_PHYSSEG_SPARSE
-	page_range = 0;
+	low_avail = phys_avail[0];
+	high_avail = phys_avail[1];
 	for (i = 0; i < vm_phys_nsegs; i++) {
-		page_range += atop(vm_phys_segs[i].end -
-		    vm_phys_segs[i].start);
+		if (vm_phys_segs[i].start < low_avail)
+			low_avail = vm_phys_segs[i].start;
+		if (vm_phys_segs[i].end > high_avail)
+			high_avail = vm_phys_segs[i].end;
+	}
+	/* Skip the first chunk.  It is already accounted for. */
+	for (i = 2; phys_avail[i + 1] != 0; i += 2) {
+		if (phys_avail[i] < low_avail)
+			low_avail = phys_avail[i];
+		if (phys_avail[i + 1] > high_avail)
+			high_avail = phys_avail[i + 1];
 	}
+	first_page = low_avail / PAGE_SIZE;
+#ifdef VM_PHYSSEG_SPARSE
+	size = 0;
+	for (i = 0; i < vm_phys_nsegs; i++)
+		size += vm_phys_segs[i].end - vm_phys_segs[i].start;
 	for (i = 0; phys_avail[i + 1] != 0; i += 2)
-		page_range += atop(phys_avail[i + 1] - phys_avail[i]);
+		size += phys_avail[i + 1] - phys_avail[i];
+	page_range = size / (PAGE_SIZE + sizeof(struct vm_page));
 #elif defined(VM_PHYSSEG_DENSE)
-	page_range = high_water / PAGE_SIZE - first_page;
+	/*
+	 * In the VM_PHYSSEG_DENSE case, the number of pages can account for
+	 * the overhead of a page structure per page only if vm_page_array is
+	 * allocated from the last physical memory chunk.  Otherwise, we must
+	 * allocate page structures representing the physical memory
+	 * underlying vm_page_array, even though they will not be used.
+	 */
+	if (new_end == high_avail)
+		page_range = (high_avail - low_avail) / (PAGE_SIZE +
+		    sizeof(struct vm_page));
+	else
+		page_range = high_avail / PAGE_SIZE - first_page;
 #else
 #error "Either VM_PHYSSEG_DENSE or VM_PHYSSEG_SPARSE must be defined."
 #endif
@@ -558,12 +579,13 @@ vm_page_startup(vm_offset_t vaddr)
 
 	/*
 	 * Reserve an unmapped guard page to trap access to vm_page_array[-1].
+	 * However, because this page is allocated from KVM, out-of-bounds
+	 * accesses using the direct map will not be trapped.
 	 */
 	vaddr += PAGE_SIZE;
 
 	/*
-	 * Initialize the mem entry structures now, and put them in the free
-	 * queue.
+	 * Allocate physical memory for the page structures, and map it.
 	 */
 	new_end = trunc_page(end - page_range * sizeof(struct vm_page));
 	mapped = pmap_map(&vaddr, new_end, end,
@@ -571,19 +593,18 @@ vm_page_startup(vm_offset_t vaddr)
 	vm_page_array = (vm_page_t) mapped;
 #if VM_NRESERVLEVEL > 0
 	/*
-	 * Allocate memory for the reservation management system's data
-	 * structures.
+	 * Allocate physical memory for the reservation management system's
+	 * data structures, and map it.
 	 */
-	new_end = vm_reserv_startup(&vaddr, new_end, high_water);
+	if (high_avail == end)
+		high_avail = new_end;
+	new_end = vm_reserv_startup(&vaddr, new_end, high_avail);
 #endif
 #if defined(__aarch64__) || defined(__amd64__) || defined(__mips__)
 	/*
-	 * pmap_map on arm64, amd64, and mips can come out of the direct-map,
-	 * not kvm like i386, so the pages must be tracked for a crashdump to
-	 * include this data. This includes the vm_page_array and the early
-	 * UMA bootstrap pages.
+	 * Include vm_page_array and vm_reserv_array in a crash dump.
 	 */
-	for (pa = new_end; pa < phys_avail[biggestone + 1]; pa += PAGE_SIZE)
+	for (pa = new_end; pa < end; pa += PAGE_SIZE)
 		dump_add_page(pa);
 #endif
 	phys_avail[biggestone + 1] = new_end;
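
A worked illustration of the equation the new comment introduces: ignoring the
round_page() term, solving

	"available physical memory" - page_range * sizeof(struct vm_page)
	    = page_range * PAGE_SIZE

for page_range gives page_range = available / (PAGE_SIZE + sizeof(struct
vm_page)), which is the expression the patch uses in both the VM_PHYSSEG_SPARSE
and VM_PHYSSEG_DENSE cases.  The following standalone userland sketch is not
code from the commit; the 4 KB page size and the 104-byte sizeof(struct
vm_page) are assumed round figures, not values taken from the stable/11
headers.

#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE	4096	/* assumed 4 KB pages */
#define VM_PAGE_SZ	104	/* assumed sizeof(struct vm_page); varies by arch and version */

int
main(void)
{
	uint64_t avail = 8ULL << 30;	/* example: 8 GB of available physical memory */

	/* Solve avail - page_range * VM_PAGE_SZ = page_range * PAGE_SIZE for page_range. */
	uint64_t page_range = avail / (PAGE_SIZE + VM_PAGE_SZ);

	printf("managed pages:       %ju\n", (uintmax_t)page_range);
	printf("vm_page array bytes: %ju\n", (uintmax_t)(page_range * VM_PAGE_SZ));
	printf("usable memory bytes: %ju\n", (uintmax_t)(page_range * PAGE_SIZE));
	return (0);
}

With these assumed figures, roughly 2.5% of physical memory (104 / (4096 + 104))
goes to page structures, which is the overhead the old code did not subtract
when sizing the free page queues.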