From owner-svn-src-head@freebsd.org  Wed May 25 23:06:53 2016
Return-Path:
Delivered-To: svn-src-head@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id AA951B4A0C1;
 Wed, 25 May 2016 23:06:53 +0000 (UTC) (envelope-from jkim@FreeBSD.org)
Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 69A011C3A;
 Wed, 25 May 2016 23:06:53 +0000 (UTC) (envelope-from jkim@FreeBSD.org)
Received: from repo.freebsd.org ([127.0.1.37])
 by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id u4PN6qsY033021;
 Wed, 25 May 2016 23:06:52 GMT (envelope-from jkim@FreeBSD.org)
Received: (from jkim@localhost)
 by repo.freebsd.org (8.15.2/8.15.2/Submit) id u4PN6qAE033020;
 Wed, 25 May 2016 23:06:52 GMT (envelope-from jkim@FreeBSD.org)
Message-Id: <201605252306.u4PN6qAE033020@repo.freebsd.org>
X-Authentication-Warning: repo.freebsd.org: jkim set sender to jkim@FreeBSD.org using -f
From: Jung-uk Kim
Date: Wed, 25 May 2016 23:06:52 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r300700 - head/sys/amd64/amd64
X-SVN-Group: head
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: svn-src-head@freebsd.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: SVN commit messages for the src tree for head/-current
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 25 May 2016 23:06:53 -0000

Author: jkim
Date: Wed May 25 23:06:52 2016
New Revision: 300700
URL: https://svnweb.freebsd.org/changeset/base/300700

Log:
  Both Clang and GCC cannot generate efficient reserve_pv_entries().
  http://docs.freebsd.org/cgi/mid.cgi?552BFEB2.8040407

  Re-implement it entirely in inline assembly not to let compilers do silly
  spilling to memory.  For non-POPCNT case, use newly added bit_count(3).

  Reported by:	alc
  Reviewed by:	alc, kib
  Differential Revision:	https://reviews.freebsd.org/D6541

Modified:
  head/sys/amd64/amd64/pmap.c

Modified: head/sys/amd64/amd64/pmap.c
==============================================================================
--- head/sys/amd64/amd64/pmap.c	Wed May 25 22:16:11 2016	(r300699)
+++ head/sys/amd64/amd64/pmap.c	Wed May 25 23:06:52 2016	(r300700)
@@ -104,6 +104,7 @@ __FBSDID("$FreeBSD$");
 #include "opt_vm.h"
 
 #include
+#include
 #include
 #include
 #include
@@ -585,7 +586,7 @@ static caddr_t crashdumpmap;
 static void	free_pv_chunk(struct pv_chunk *pc);
 static void	free_pv_entry(pmap_t pmap, pv_entry_t pv);
 static pv_entry_t get_pv_entry(pmap_t pmap, struct rwlock **lockp);
-static int	popcnt_pc_map_elem_pq(uint64_t elem);
+static int	popcnt_pc_map_pq(uint64_t *map);
 static vm_page_t reclaim_pv_chunk(pmap_t locked_pmap, struct rwlock **lockp);
 static void	reserve_pv_entries(pmap_t pmap, int needed,
 		    struct rwlock **lockp);
@@ -3126,7 +3127,7 @@ retry:
 }
 
 /*
- * Returns the number of one bits within the given PV chunk map element.
+ * Returns the number of one bits within the given PV chunk map.
  *
  * The erratas for Intel processors state that "POPCNT Instruction May
  * Take Longer to Execute Than Expected".
It is believed that the
@@ -3142,12 +3143,15 @@ retry:
  * 6th Gen Core: SKL029
  */
 static int
-popcnt_pc_map_elem_pq(uint64_t elem)
+popcnt_pc_map_pq(uint64_t *map)
 {
-	u_long result;
+	u_long result, tmp;
 
-	__asm __volatile("xorl %k0,%k0;popcntq %1,%0"
-	    : "=&r" (result) : "rm" (elem));
+	__asm __volatile("xorl %k0,%k0;popcntq %2,%0;"
+	    "xorl %k1,%k1;popcntq %3,%1;addl %k1,%k0;"
+	    "xorl %k1,%k1;popcntq %4,%1;addl %k1,%k0"
+	    : "=&r" (result), "=&r" (tmp)
+	    : "m" (map[0]), "m" (map[1]), "m" (map[2]));
 	return (result);
 }
 
@@ -3179,17 +3183,12 @@ retry:
 	avail = 0;
 	TAILQ_FOREACH(pc, &pmap->pm_pvchunk, pc_list) {
 #ifndef __POPCNT__
-		if ((cpu_feature2 & CPUID2_POPCNT) == 0) {
-			free = bitcount64(pc->pc_map[0]);
-			free += bitcount64(pc->pc_map[1]);
-			free += bitcount64(pc->pc_map[2]);
-		} else
+		if ((cpu_feature2 & CPUID2_POPCNT) == 0)
+			bit_count((bitstr_t *)pc->pc_map, 0,
+			    sizeof(pc->pc_map) * NBBY, &free);
+		else
 #endif
-		{
-			free = popcnt_pc_map_elem_pq(pc->pc_map[0]);
-			free += popcnt_pc_map_elem_pq(pc->pc_map[1]);
-			free += popcnt_pc_map_elem_pq(pc->pc_map[2]);
-		}
+			free = popcnt_pc_map_pq(pc->pc_map);
 		if (free == 0)
 			break;
 		avail += free;
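
For readers who want to try the counting logic outside the kernel, here is a
minimal userland sketch of what the two paths in the diff compute: the number
of one bits across a three-word chunk map. It is not the committed code. The
names NPCM, swar_popcount64(), count_free_portable(), count_free_builtin()
and count_free_checked() are hypothetical, the portable path uses a generic
SWAR bit count instead of bit_count(3), and the fast path uses the GCC/Clang
builtin __builtin_popcountll() instead of the inline assembly above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: three 64-bit words per map, mirroring pc_map[0..2] in the diff. */
#define	NPCM	3

/* Portable SWAR population count of a single 64-bit word. */
static int
swar_popcount64(uint64_t x)
{

	x = x - ((x >> 1) & 0x5555555555555555ULL);
	x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
	x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
	return ((int)((x * 0x0101010101010101ULL) >> 56));
}

/* Stand-in for the non-POPCNT path: sum the set bits of every map word. */
static int
count_free_portable(const uint64_t map[NPCM])
{
	int i, n;

	n = 0;
	for (i = 0; i < NPCM; i++)
		n += swar_popcount64(map[i]);
	return (n);
}

/* Stand-in for the POPCNT path, leaving instruction selection to the compiler. */
static int
count_free_builtin(const uint64_t map[NPCM])
{

	return (__builtin_popcountll(map[0]) + __builtin_popcountll(map[1]) +
	    __builtin_popcountll(map[2]));
}

/* Returns the count after checking that both paths agree. */
static int
count_free_checked(const uint64_t map[NPCM])
{
	int n;

	n = count_free_portable(map);
	assert(n == count_free_builtin(map));
	return (n);
}
```

Calling count_free_checked() on a map is a quick way to confirm that a
fallback and a fast path return the same count before swapping one for the
other, which is the property the #ifndef __POPCNT__ branch in the diff
relies on.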