Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 9 Jun 2010 23:41:57 +0530
From:      "Jayachandran C." <c.jayachandran@gmail.com>
To:        Alan Cox <alc@cs.rice.edu>
Cc:        Kostik Belousov <kostikbel@gmail.com>, "Jayachandran C." <jchandra@freebsd.org>, mips@freebsd.org
Subject:   Re: svn commit: r208589 - head/sys/mips/mips
Message-ID:  <AANLkTimRND9G9udHWzhN06wTJcRCN-OEjPPXctOoyj9_@mail.gmail.com>
In-Reply-To: <AANLkTimWh77REpi3ZD0BOihZit5qKNYNbtAx5PWQRYBX@mail.gmail.com>
References:  <AANLkTimIa3jmBPMhWIOcY6DenGpZ2ZYmqwDTWspVx0-u@mail.gmail.com> <AANLkTil2gE1niUWCHnsTlQvibhxBh7QYwD0TTWo0rj5c@mail.gmail.com> <AANLkTinA2D5iTDGPbflHVzLyAZW-ZewjJkUWWL8FVskr@mail.gmail.com> <4C07E07B.9060802@cs.rice.edu> <AANLkTimjyPc_AXKP1yaJaF1BN7CAGBeNikVzcp9OCb4P@mail.gmail.com> <4C09345F.9040300@cs.rice.edu> <AANLkTinmFOZY3OlaoKStxlNIRBt2G2I4ILkQ1P0CjozG@mail.gmail.com> <4C0D2BEA.6060103@cs.rice.edu> <AANLkTikZxx_30H9geHvZYkYd0sE-wiuZljEd0PAi14ca@mail.gmail.com> <4C0D3F40.2070101@cs.rice.edu> <20100607202844.GU83316@deviant.kiev.zoral.com.ua> <4C0D64B7.7060604@cs.rice.edu> <AANLkTilBxdXxXrWC1cAT0wX9ubmFrvaAdk4feG6PwDYQ@mail.gmail.com> <4C0DE424.9080601@cs.rice.edu> <AANLkTinzIUOykgwtHlJ2vDwYS9as3ha_BYiy_qRd5h2Q@mail.gmail.com> <AANLkTimWh77REpi3ZD0BOihZit5qKNYNbtAx5PWQRYBX@mail.gmail.com>

index | next in thread | previous in thread | raw e-mail

[-- Attachment #1 --]
On Wed, Jun 9, 2010 at 11:20 AM, Jayachandran C.
<c.jayachandran@gmail.com> wrote:
> On Wed, Jun 9, 2010 at 3:01 AM, Jayachandran C.
> <c.jayachandran@gmail.com> wrote:
>> On Tue, Jun 8, 2010 at 12:03 PM, Alan Cox <alc@cs.rice.edu> wrote:
>>>
>>> VM_FREEPOOL_DIRECT is used by at least amd64 and ia64 for page table pages
>>> and small kernel memory allocations.  Unlike mips, these machines don't have
>>> MMU support for a direct map.  Their direct maps are just a range of
>>> mappings in the regular (kernel) page table.  So, unlike mips, accesses
>>> through their direct map may still miss in the TLB and require a page table
>>> walk.  VM_FREEPOOL_* is a way to increase the physical locality (or
>>> clustering) of page allocations, so that, for example, page table page
>>> accesses by the pmap on amd64 are less likely to miss in the TLB.  However,
>>> it doesn't place a hard restriction on the range of physical addresses that
>>> will be used, which you need for mips.
>>>
>>> The impact of this clustering can be significant.  For example, on amd64 we
>>> use 2MB page mappings to implement the direct map.  However, old Opterons
>>> only had 8 data TLB entries for 2MB page mappings.  For a uniprocessor
>>> kernel running on such an Opteron, I measured an 18% reduction in system
>>> time during a buildworld with the introduction of VM_FREEPOOL_DIRECT.  (See
>>> the commit logs for vm/vm_phys.c and the comment that precedes the
>>> VM_NFREEORDER definition on amd64.)
>>>
>>> Until such time as superpage support is ported to mips from the amd64/i386
>>> pmaps, I don't think there is a point in having more than one VM_FREEPOOL_*
>>> on mips.  And then, the point would be to reduce fragmentation of the
>>> physical memory that could be caused by small allocations, such as page
>>> table pages.
>>
>> Thanks for the detailed explanation.
>>
>> Also, after looking at the code again,  I think vm_phys_alloc_contig()
>> can optimized not to look into segments which lie outside the area of
>> interest. The patch is:
> [...]
>> This change, along with the vmparam.h changes for HIGHMEM, I think we
>> should be able to use  vm_phys_alloc_contig() for page table pages (or
>> have I again missed something fundamental?).
>
> That patch was obviously wrong - many segments can map to a freelist
> as the comment right above my change noted.
>
> But the idea of eliminating freelists for which all the segments are
> outside (low,high) may still be useful, will look at this some more.

I have attached a patch (also at
http://people.freebsd.org/~jchandra/pmap-with-HIGHMEM-freelist.patch)
which reverts most of the changes I did to convert the page table page
allocation to use UMA zone, and replaces it with an implementation
using vm_phys_alloc_contig() and vm_contig_grow_cache(). This creates
a new HIGHMEM freelist for mips for memory outside the KSEG0 area, and
makes a few changes in vm_phys_alloc_contig() to skip freelists for
which all the segments fall outside the address range requested.

With this the buildworld perf on MIPS is similar to what I got with
the older code with zones.

If this approach is okay, I will do another round of
testing(buildworld passes, but I haven't really tested the case where
grow_cache is called).  If the changes are not okay, I will add
another page allocator which takes freelist as argument as you had
suggested earlier, instead of the vm_phys_alloc_contig() changes.

Thanks,
JC.

[-- Attachment #2 --]
Index: sys/mips/include/vmparam.h
===================================================================
--- sys/mips/include/vmparam.h	(revision 208890)
+++ sys/mips/include/vmparam.h	(working copy)
@@ -103,8 +103,9 @@
 #define	VM_MAXUSER_ADDRESS	((vm_offset_t)0x80000000)
 #define	VM_MAX_MMAP_ADDR	VM_MAXUSER_ADDRESS
 
-#define	VM_MIN_KERNEL_ADDRESS		((vm_offset_t)0xC0000000)
-#define	VM_MAX_KERNEL_ADDRESS		((vm_offset_t)0xFFFFC000)
+#define	VM_MIN_KERNEL_ADDRESS	((vm_offset_t)0xC0000000)
+#define	VM_MAX_KERNEL_ADDRESS	((vm_offset_t)0xFFFFC000)
+#define	VM_HIGHMEM_ADDRESS	((vm_paddr_t)0x20000000)
 #if 0
 #define	KERNBASE		(VM_MIN_KERNEL_ADDRESS)
 #else
@@ -168,13 +169,15 @@
 #define	VM_FREEPOOL_DIRECT	1
 
 /*
- * we support 1 free list:
+ * we support 2 free lists:
  *
- *	- DEFAULT for all systems
+ *	- DEFAULT for direct mapped (KSEG0) pages
+ *	- HIGHMEM for other pages 
  */
 
-#define	VM_NFREELIST		1
-#define	VM_FREELIST_DEFAULT	0
+#define	VM_NFREELIST		2
+#define	VM_FREELIST_DEFAULT	1
+#define	VM_FREELIST_HIGHMEM	0
 
 /*
  * The largest allocation size is 1MB.
Index: sys/mips/mips/pmap.c
===================================================================
--- sys/mips/mips/pmap.c	(revision 208890)
+++ sys/mips/mips/pmap.c	(working copy)
@@ -184,8 +184,6 @@
 static int init_pte_prot(vm_offset_t va, vm_page_t m, vm_prot_t prot);
 static void pmap_TLB_invalidate_kernel(vm_offset_t);
 static void pmap_TLB_update_kernel(vm_offset_t, pt_entry_t);
-static vm_page_t pmap_alloc_pte_page(pmap_t, unsigned int, int, vm_offset_t *);
-static void pmap_release_pte_page(vm_page_t);
 
 #ifdef SMP
 static void pmap_invalidate_page_action(void *arg);
@@ -193,10 +191,6 @@
 static void pmap_update_page_action(void *arg);
 #endif
 
-static void pmap_ptpgzone_dtor(void *mem, int size, void *arg);
-static void *pmap_ptpgzone_allocf(uma_zone_t, int, u_int8_t *, int);
-static uma_zone_t ptpgzone;
-
 struct local_sysmaps {
 	struct mtx lock;
 	vm_offset_t base;
@@ -329,7 +323,7 @@
 }
 
 /*
- *	Bootstrap the system enough to run with virtual memory.  This
+ * Bootstrap the system enough to run with virtual memory.  This
  * assumes that the phys_avail array has been initialized.
  */
 void
@@ -535,10 +529,6 @@
 	pv_entry_max = PMAP_SHPGPERPROC * maxproc + cnt.v_page_count;
 	pv_entry_high_water = 9 * (pv_entry_max / 10);
 	uma_zone_set_obj(pvzone, &pvzone_obj, pv_entry_max);
-
-	ptpgzone = uma_zcreate("PT ENTRY", PAGE_SIZE, NULL, pmap_ptpgzone_dtor,
-	    NULL, NULL, PAGE_SIZE - 1, UMA_ZONE_NOFREE | UMA_ZONE_ZINIT);
-	uma_zone_set_allocf(ptpgzone, pmap_ptpgzone_allocf);
 }
 
 /***************************************************
@@ -885,12 +875,8 @@
 	/*
 	 * If the page is finally unwired, simply free it.
 	 */
+	vm_page_free_zero(m);
 	atomic_subtract_int(&cnt.v_wire_count, 1);
-	PMAP_UNLOCK(pmap);
-	vm_page_unlock_queues();
-	pmap_release_pte_page(m);
-	vm_page_lock_queues();
-	PMAP_LOCK(pmap);
 	return (1);
 }
 
@@ -949,96 +935,36 @@
 	bzero(&pmap->pm_stats, sizeof pmap->pm_stats);
 }
 
+
 static void
-pmap_ptpgzone_dtor(void *mem, int size, void *arg)
+pmap_grow_page_cache(int wait)
 {
-#ifdef INVARIANTS
-	static char zeropage[PAGE_SIZE];
-
-	KASSERT(size == PAGE_SIZE,
-		("pmap_ptpgzone_dtor: invalid size %d", size));
-	KASSERT(bcmp(mem, zeropage, PAGE_SIZE) == 0,
-		("pmap_ptpgzone_dtor: freeing a non-zeroed page"));
-#endif
+	printf("[%s] wait %x\n", __func__, wait);
+	vm_contig_grow_cache(3, 0, MIPS_KSEG0_LARGEST_PHYS);
 }
 
-static void *
-pmap_ptpgzone_allocf(uma_zone_t zone, int bytes, u_int8_t *flags, int wait)
-{
-	vm_page_t m;
-	vm_paddr_t paddr;
-	int tries;
-	
-	KASSERT(bytes == PAGE_SIZE,
-		("pmap_ptpgzone_allocf: invalid allocation size %d", bytes));
 
-	*flags = UMA_SLAB_PRIV;
-	tries = 0;
-retry:
-	m = vm_phys_alloc_contig(1, 0, MIPS_KSEG0_LARGEST_PHYS,
-	    PAGE_SIZE, PAGE_SIZE);
-	if (m == NULL) {
-                if (tries < ((wait & M_NOWAIT) != 0 ? 1 : 3)) {
-			vm_contig_grow_cache(tries, 0, MIPS_KSEG0_LARGEST_PHYS);
-			tries++;
-			goto retry;
-		} else
-			return (NULL);
-	}
-
-	paddr = VM_PAGE_TO_PHYS(m);
-	return ((void *)MIPS_PHYS_TO_KSEG0(paddr));
-}	
-
 static vm_page_t
-pmap_alloc_pte_page(pmap_t pmap, unsigned int index, int wait, vm_offset_t *vap)
+pmap_alloc_pte_page(unsigned int index, int wait)
 {
-	vm_paddr_t paddr;
-	void *va;
 	vm_page_t m;
-	int locked;
 
-	locked = mtx_owned(&pmap->pm_mtx);
-	if (locked) {
-		mtx_assert(&vm_page_queue_mtx, MA_OWNED);
-		PMAP_UNLOCK(pmap);
-		vm_page_unlock_queues();
-	}
-	va = uma_zalloc(ptpgzone, wait);
-	if (locked) {
-		vm_page_lock_queues();
-		PMAP_LOCK(pmap);
-	}
-	if (va == NULL)
+	m = vm_phys_alloc_contig(1, 0, MIPS_KSEG0_LARGEST_PHYS,
+	    PAGE_SIZE, 0);
+	if (m == NULL)
 		return (NULL);
 
-	paddr = MIPS_KSEG0_TO_PHYS(va);
-	m = PHYS_TO_VM_PAGE(paddr);
-	
-	if (!locked)
-		vm_page_lock_queues();
+	if ((m->flags & PG_ZERO) == 0)
+		pmap_zero_page(m);
+
 	m->pindex = index;
 	m->valid = VM_PAGE_BITS_ALL;
-	m->wire_count = 1;
-	if (!locked)
-		vm_page_unlock_queues();
-
 	atomic_add_int(&cnt.v_wire_count, 1);
-	*vap = (vm_offset_t)va;
+	m->wire_count = 1;
 	return (m);
 }
 
-static void
-pmap_release_pte_page(vm_page_t m)
-{
-	void *va;
-	vm_paddr_t paddr;
 
-	paddr = VM_PAGE_TO_PHYS(m);
-	va = (void *)MIPS_PHYS_TO_KSEG0(paddr);
-	uma_zfree(ptpgzone, va);
-}
-
 /*
  * Initialize a preallocated and zeroed pmap structure,
  * such as one in a vmspace structure.
@@ -1055,10 +981,10 @@
 	/*
 	 * allocate the page directory page
 	 */
-	ptdpg = pmap_alloc_pte_page(pmap, NUSERPGTBLS, M_WAITOK, &ptdva);
-	if (ptdpg == NULL)
-		return (0);
+	while ((ptdpg = pmap_alloc_pte_page(NUSERPGTBLS, M_WAITOK)) == NULL)
+	       pmap_grow_page_cache(M_WAITOK);
 
+	ptdva = MIPS_PHYS_TO_KSEG0(VM_PAGE_TO_PHYS(ptdpg));
 	pmap->pm_segtab = (pd_entry_t *)ptdva;
 	pmap->pm_active = 0;
 	pmap->pm_ptphint = NULL;
@@ -1089,15 +1015,28 @@
 	/*
 	 * Find or fabricate a new pagetable page
 	 */
-	m = pmap_alloc_pte_page(pmap, ptepindex, flags, &pteva);
-	if (m == NULL)
+	if ((m = pmap_alloc_pte_page(ptepindex, flags)) == NULL) {
+		if (flags & M_WAITOK) {
+			PMAP_UNLOCK(pmap);
+			vm_page_unlock_queues();
+			pmap_grow_page_cache(flags);
+			vm_page_lock_queues();
+			PMAP_LOCK(pmap);
+		}
+
+		/*
+		 * Indicate the need to retry.	While waiting, the page
+		 * table page may have been allocated.
+		 */
 		return (NULL);
+	}
 
 	/*
 	 * Map the pagetable page into the process address space, if it
 	 * isn't already there.
 	 */
 
+	pteva = MIPS_PHYS_TO_KSEG0(VM_PAGE_TO_PHYS(m));
 	pmap->pm_stats.resident_count++;
 	pmap->pm_segtab[ptepindex] = (pd_entry_t)pteva;
 
@@ -1193,7 +1132,7 @@
 
 	ptdpg->wire_count--;
 	atomic_subtract_int(&cnt.v_wire_count, 1);
-	pmap_release_pte_page(ptdpg);
+	vm_page_free_zero(ptdpg);
 	PMAP_LOCK_DESTROY(pmap);
 }
 
@@ -1203,7 +1142,6 @@
 void
 pmap_growkernel(vm_offset_t addr)
 {
-	vm_offset_t pageva;
 	vm_page_t nkpg;
 	pt_entry_t *pte;
 	int i;
@@ -1238,13 +1176,13 @@
 		/*
 		 * This index is bogus, but out of the way
 		 */
-		nkpg = pmap_alloc_pte_page(kernel_pmap, nkpt, M_NOWAIT, &pageva);
+		nkpg = pmap_alloc_pte_page(nkpt, M_NOWAIT);
 
 		if (!nkpg)
 			panic("pmap_growkernel: no memory to grow kernel");
 
 		nkpt++;
-		pte = (pt_entry_t *)pageva;
+		pte = (pt_entry_t *)MIPS_PHYS_TO_KSEG0(VM_PAGE_TO_PHYS(nkpg));
 		segtab_pde(kernel_segmap, kernel_vm_end) = (pd_entry_t)pte;
 
 		/*
Index: sys/vm/vm_phys.c
===================================================================
--- sys/vm/vm_phys.c	(revision 208890)
+++ sys/vm/vm_phys.c	(working copy)
@@ -66,6 +66,7 @@
 	vm_paddr_t	end;
 	vm_page_t	first_page;
 	struct vm_freelist (*free_queues)[VM_NFREEPOOL][VM_NFREEORDER];
+	int		flind;
 };
 
 static struct vm_phys_seg vm_phys_segs[VM_PHYSSEG_MAX];
@@ -188,6 +189,7 @@
 	seg = &vm_phys_segs[vm_phys_nsegs++];
 	seg->start = start;
 	seg->end = end;
+	seg->flind = flind;
 #ifdef VM_PHYSSEG_SPARSE
 	seg->first_page = &vm_page_array[pages];
 #else
@@ -595,7 +597,8 @@
 	vm_object_t m_object;
 	vm_paddr_t pa, pa_last, size;
 	vm_page_t deferred_vdrop_list, m, m_ret;
-	int flind, i, oind, order, pind;
+	int flind, i, oind, order, pind, segind;
+	int fl_good[VM_NFREELIST];
 
 	size = npages << PAGE_SHIFT;
 	KASSERT(size != 0,
@@ -607,11 +610,22 @@
 	deferred_vdrop_list = NULL;
 	/* Compute the queue that is the best fit for npages. */
 	for (order = 0; (1 << order) < npages; order++);
+	/* Compute the freelists that should be searched */
+	for (flind = 0; flind < vm_nfreelists; flind++)
+		fl_good[flind] = 0;
+	for (segind = 0; segind < vm_phys_nsegs; segind++) {
+		seg = &vm_phys_segs[segind];
+		if (low < seg->end && high > seg->start)
+			fl_good[seg->flind]++;
+	}
 	mtx_lock(&vm_page_queue_free_mtx);
 #if VM_NRESERVLEVEL > 0
 retry:
 #endif
 	for (flind = 0; flind < vm_nfreelists; flind++) {
+		if (fl_good[flind] == 0)
+			continue;
+
 		for (oind = min(order, VM_NFREEORDER - 1); oind < VM_NFREEORDER; oind++) {
 			for (pind = 0; pind < VM_NFREEPOOL; pind++) {
 				fl = vm_phys_free_queues[flind][pind];
home | help

Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?AANLkTimRND9G9udHWzhN06wTJcRCN-OEjPPXctOoyj9_>