Date: Fri, 23 Oct 2009 10:22:47 -0500 (CDT)
From: Mark Tinguely <tinguely@casselton.net>
To: freebsd-arm@freebsd.org, ray@dlink.ua, tinguely@casselton.net
Subject: Re: [ARM+NFS] panic while copying across NFS
Message-ID: <200910231522.n9NFMlE3002301@casselton.net>
In-Reply-To: <20091023155825.381728f4.ray@dlink.ua>

> Hi Mark!
> With your patch works fine.
>
> # dd if=/swap.file of=/mnt/swap.file bs=1M
> 1024+0 records in
> 1024+0 records out
> 1073741824 bytes transferred in 231.294150 secs (4642322 bytes/sec)
>
> But still slow. Maybe someone know why slow? (Marvell 88F5182 rev A2)

Here is what I think is the complete update to the revisions 181296 and
195779 cache fixes.

 1) vm_machdep.c: remove the dangling allocations so they do not
    unnecessarily turn off the cache in the future. (This is the patch
    that worked for you; 2 and 3 are two more.)
 2) busdma_machdep.c: remove the same amount as was shadow mapped.
 3) pmap.c: PVF_REF is used to invalidate the cache and flush the TLB.
    PVF_REF is set by a trap when the page is really used. Kernel pages
    should assume it is immediately used. In the ARMv5 pmap, we should
    manage every RAM physical page.

Without profiling the kernel, it would be tough to say where the
performance issues are originating (device driver, fs code, or machine
level).

Ideas about the machine level code:

 I think freeing the memory from the level page table descriptors for
 general use should improve things. More usable RAM is always a good
 thing. There is some code in trap and other places that looks to see if
 the level 1 pde is for this memory space or shared memory space. We can
 keep a few level pde around for forks. Downside: a fork could fail to
 get the 16K contig buffer, which it can on other archs too. This is a
 pretty big change.

 There are tests/fixes (switch/pmap) for the low vector page that can be
 removed with a define statement for high vector kernels. In fact, if we
 are not sharing the level 1 pd, this is set only in pmap initialization.
 Simple change "#ifdef LOW_VECTOR", minor savings.

 Are we cleaning caches too much?

ARMv6/7 will be a big game changer. Should we put a ton of effort into
ARMv5, put the effort into optimizing, or do both?

Index: arm/arm/vm_machdep.c
===================================================================
--- arm/arm/vm_machdep.c	(revision 198246)
+++ arm/arm/vm_machdep.c	(working copy)
@@ -169,6 +169,9 @@ sf_buf_free(struct sf_buf *sf)
 	if (sf->ref_count == 0) {
 		TAILQ_INSERT_TAIL(&sf_buf_freelist, sf, free_entry);
 		nsfbufsused--;
+		pmap_kremove(sf->kva);
+		sf->m = NULL;
+		LIST_REMOVE(sf, list_entry);
 		if (sf_buf_alloc_want > 0)
 			wakeup_one(&sf_buf_freelist);
 	}
@@ -449,9 +452,12 @@ arm_unmap_nocache(void *addr, vm_size_t size)
 
 	size = round_page(size);
 	i = (raddr - arm_nocache_startaddr) / (PAGE_SIZE);
-	for (; size > 0; size -= PAGE_SIZE, i++)
+	for (; size > 0; size -= PAGE_SIZE, i++) {
 		arm_nocache_allocated[i / BITS_PER_INT] &= ~(1 << (i %
 		    BITS_PER_INT));
+		pmap_kremove(raddr);
+		raddr += PAGE_SIZE;
+	}
 }
 
 #ifdef ARM_USE_SMALL_ALLOC
Index: arm/arm/busdma_machdep.c
===================================================================
--- arm/arm/busdma_machdep.c	(revision 198246)
+++ arm/arm/busdma_machdep.c	(working copy)
@@ -649,7 +649,8 @@ bus_dmamem_free(bus_dma_tag_t dmat, void *vaddr, b
 		KASSERT(map->allocbuffer == vaddr,
 		    ("Trying to freeing the wrong DMA buffer"));
 		vaddr = map->origbuffer;
-		arm_unmap_nocache(map->allocbuffer, dmat->maxsize);
+		arm_unmap_nocache(map->allocbuffer,
+		    dmat->maxsize + ((vm_offset_t)vaddr & PAGE_MASK));
 	}
 	if (dmat->maxsize <= PAGE_SIZE &&
 	    dmat->alignment < dmat->maxsize &&
Index: arm/arm/pmap.c
===================================================================
--- arm/arm/pmap.c	(revision 198246)
+++ arm/arm/pmap.c	(working copy)
@@ -1643,7 +1643,7 @@ pmap_enter_pv(struct vm_page *pg, struct pv_entry
 		/* PMAP_ASSERT_LOCKED(pmap_kernel()); */
 		pve->pv_pmap = pmap_kernel();
 		pve->pv_va = pg->md.pv_kva;
-		pve->pv_flags = PVF_WRITE | PVF_UNMAN;
+		pve->pv_flags = PVF_WRITE | PVF_UNMAN | PVF_REF;
 		pg->md.pv_kva = 0;
 
 		TAILQ_INSERT_HEAD(&pg->md.pv_list, pve, pv_list);
@@ -2870,7 +2870,7 @@ pmap_kenter_internal(vm_offset_t va, vm_offset_t p
 			vm_page_lock_queues();
 			PMAP_LOCK(pmap_kernel());
 			pmap_enter_pv(m, pve, pmap_kernel(), va,
-			    PVF_WRITE | PVF_UNMAN);
+			    PVF_WRITE | PVF_UNMAN | PVF_REF);
 			pmap_fix_cache(m, pmap_kernel(), va);
 			PMAP_UNLOCK(pmap_kernel());
 		} else {
@@ -3538,7 +3538,7 @@ do_l2b_alloc:
 
 		if (!TAILQ_EMPTY(&m->md.pv_list) || m->md.pv_kva) {
 			KASSERT(pve != NULL, ("No pv"));
-			nflags |= PVF_UNMAN;
+			nflags |= PVF_UNMAN | PVF_REF;
 			pmap_enter_pv(m, pve, pmap, va, nflags);
 		} else
 			m->md.pv_kva = va;
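
To illustrate the size change in busdma_machdep.c, here is a minimal user-space sketch in C. It is not the kernel code: the bitmap, NPAGES, and the helpers pages_spanned, mark_pages, clear_pages, and count_allocated are made up for illustration, and it assumes that the mapping side covers round_page(size + offset within the page) bytes. Under that assumption, clearing only round_page(size) can leave the last page marked as allocated whenever the buffer does not start on a page boundary.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE    4096u
#define PAGE_MASK    (PAGE_SIZE - 1)
#define BITS_PER_INT (sizeof(unsigned int) * CHAR_BIT)
#define NPAGES       64u	/* hypothetical size of the no-cache window */

/* One bit per page of the simulated no-cache window. */
static unsigned int allocated[NPAGES / (sizeof(unsigned int) * CHAR_BIT) + 1];

/* Round a byte count up to a whole number of pages. */
static size_t
round_page(size_t x)
{
	return ((x + PAGE_MASK) & ~(size_t)PAGE_MASK);
}

/* Pages touched by a buffer of `size` bytes that starts at `addr`. */
static size_t
pages_spanned(uintptr_t addr, size_t size)
{
	return (round_page(size + (addr & PAGE_MASK)) / PAGE_SIZE);
}

/* Set the allocation bit for `npages` pages starting at index `first`. */
static void
mark_pages(size_t first, size_t npages)
{
	for (size_t i = first; i < first + npages; i++) {
		assert(i < NPAGES);
		allocated[i / BITS_PER_INT] |= 1u << (i % BITS_PER_INT);
	}
}

/* Clear the allocation bit for `npages` pages starting at index `first`. */
static void
clear_pages(size_t first, size_t npages)
{
	for (size_t i = first; i < first + npages; i++) {
		assert(i < NPAGES);
		allocated[i / BITS_PER_INT] &= ~(1u << (i % BITS_PER_INT));
	}
}

/* Count pages that are still marked as allocated. */
static size_t
count_allocated(void)
{
	size_t n = 0;

	for (size_t i = 0; i < NPAGES; i++)
		if (allocated[i / BITS_PER_INT] & (1u << (i % BITS_PER_INT)))
			n++;
	return (n);
}
```

For a 4096-byte buffer that starts 0x800 bytes into a page, pages_spanned() reports 2 pages while round_page(4096) / PAGE_SIZE is 1, so clearing by size alone leaves one bit set. Passing the size plus the page offset clears both.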
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200910231522.n9NFMlE3002301>