Date: Wed, 7 Sep 2016 16:22:05 +0000 (UTC)
From: Andrew Turner <andrew@FreeBSD.org>
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r305545 - head/sys/arm64/arm64
Message-ID: <201609071622.u87GM5UZ064244@repo.freebsd.org>

Author: andrew
Date: Wed Sep 7 16:22:05 2016
New Revision: 305545
URL: https://svnweb.freebsd.org/changeset/base/305545

Log:
  Only call cpu_icache_sync_range when inserting an executable page. If the
  page is non-executable the contents of the i-cache are unimportant so this
  call is just adding unneeded overhead when inserting pages.

  While doing research using gem5 with an O3 pipeline and 1k/32k/1M
  iTLB/L1 iCache/L2 Bjoern Zeeb (bz@) observed a fairly high rate of calls
  into arm64_icache_sync_range() from pmap_enter() along with a high number
  of instruction fetches and iTLB/iCache hits.

  Limiting the calls to arm64_icache_sync_range() to only executable pages,
  we observe the iTLB and iCache Hit going down by about 43%. These numbers
  are quite misleading when looked at alone as at the same time instructions
  retired were reduced by 19.2% and instruction fetches were reduced by 38.8%.
  Overall this reduced the runtime of the test program by 22.4%.

  On Juno hardware, in steady-state, running the same test, using the cycle
  count to determine runtime, we do see a reduction of up to 28.9% in runtime.

  While these numbers certainly depend on the program executed, we expect an
  overall performance improvement.

  Reported by:	bz
  Obtained from:	ABT Systems Ltd
  MFC after:	1 week
  Sponsored by:	The FreeBSD Foundation

Modified:
  head/sys/arm64/arm64/pmap.c

Modified: head/sys/arm64/arm64/pmap.c
==============================================================================
--- head/sys/arm64/arm64/pmap.c	Wed Sep 7 16:19:20 2016	(r305544)
+++ head/sys/arm64/arm64/pmap.c	Wed Sep 7 16:22:05 2016	(r305545)
@@ -2939,8 +2939,9 @@ validate:
 	pmap_invalidate_page(pmap, va);
 
 	if (pmap != pmap_kernel()) {
-		if (pmap == &curproc->p_vmspace->vm_pmap)
-		    cpu_icache_sync_range(va, PAGE_SIZE);
+		if (pmap == &curproc->p_vmspace->vm_pmap &&
+		    (prot & VM_PROT_EXECUTE) != 0)
+			cpu_icache_sync_range(va, PAGE_SIZE);
 
 		if ((mpte == NULL || mpte->wire_count == NL3PG) &&
 		    pmap_superpages_enabled() &&
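
The condition changed by the diff above can be modelled outside the kernel. The following is only a user-space sketch, not the pmap code itself: every `sk_`/`SK_` name is hypothetical, the two boolean parameters stand in for the `pmap != pmap_kernel()` and `pmap == &curproc->p_vmspace->vm_pmap` checks, and the sync routine merely counts calls instead of touching any cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's VM_PROT_* flags and PAGE_SIZE. */
#define SK_PROT_READ    0x01u
#define SK_PROT_WRITE   0x02u
#define SK_PROT_EXECUTE 0x04u
#define SK_PAGE_SIZE    4096u

/* Number of simulated i-cache syncs performed so far. */
static unsigned long sk_icache_syncs;

/* Stand-in for cpu_icache_sync_range(): counts calls, does no cache work. */
static void
sk_icache_sync_range(uintptr_t va, size_t len)
{
	assert(len > 0);
	(void)va;
	sk_icache_syncs++;
}

/*
 * Model of the condition after r305545: sync only for a non-kernel pmap
 * that belongs to the current process AND only if the mapping is executable.
 */
static bool
sk_needs_icache_sync(bool is_kernel_pmap, bool is_current_pmap, unsigned prot)
{
	return (!is_kernel_pmap && is_current_pmap &&
	    (prot & SK_PROT_EXECUTE) != 0);
}

/* Model of the relevant step when a page mapping is entered. */
static void
sk_enter_page(bool is_kernel_pmap, bool is_current_pmap, uintptr_t va,
    unsigned prot)
{
	if (sk_needs_icache_sync(is_kernel_pmap, is_current_pmap, prot))
		sk_icache_sync_range(va, SK_PAGE_SIZE);
}
```

In this model, entering a read/write data page into the current process's pmap leaves `sk_icache_syncs` unchanged, while entering a read/execute page increments it once. That is the overhead the commit removes for non-executable pages.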