Date: Fri, 20 Mar 2015 15:02:16 +0200 From: Konstantin Belousov <kostikbel@gmail.com> To: John Baldwin <jhb@FreeBSD.org> Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org Subject: Re: svn commit: r280279 - head/sys/sys Message-ID: <20150320130216.GS2379@kib.kiev.ua> In-Reply-To: <201503201027.t2KAR6Ze053047@svn.freebsd.org> References: <201503201027.t2KAR6Ze053047@svn.freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, Mar 20, 2015 at 10:27:06AM +0000, John Baldwin wrote: > Author: jhb > Date: Fri Mar 20 10:27:06 2015 > New Revision: 280279 > URL: https://svnweb.freebsd.org/changeset/base/280279 > > Log: > Expand the bitcount* API to support 64-bit integers, plain ints and longs > and create a "hidden" API that can be used in other system headers without > adding namespace pollution. > - If the POPCNT instruction is enabled at compile time, use > __builtin_popcount*() to implement __bitcount*(), otherwise fall back > to software implementations. Are you aware of the Haswell errata HSD146 ? I see the described behaviour on machines back to SandyBridge, but not on Nehalems. HSD146. POPCNT Instruction May Take Longer to Execute Than Expected Problem: POPCNT instruction execution with a 32 or 64 bit operand may be delayed until previous non-dependent instructions have executed. Jilles noted that gcc head and 4.9.2 already provides a workaround by xoring the dst register. I have some patch for amd64 pmap, see the end of the message. > - Use the existing bitcount16() and bitcount32() from <sys/systm.h> to > implement the non-POPCNT __bitcount16() and __bitcount32() in > <sys/types.h>. Why is it in sys/types.h ? > - For the non-POPCNT __bitcount64(), use a similar SWAR method on 64-bit > systems. For 32-bit systems, use two __bitcount32() operations on the > two halves. > - Use __bitcount32() to provide a __bitcount() that operates on plain ints. > - Use either __bitcount32() or __bitcount64() to provide a > __bitcountl() that operates on longs. > - Add public bitcount*() wrappers for __bitcount*() for use in the kernel > in <sys/libkern.h>. > - Use __builtinl() instead of __builtin_popcountl() in BIT_COUNT(). > > Discussed with: bde diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c index 6a4077c..f6fbc33 100644 --- a/sys/amd64/amd64/pmap.c +++ b/sys/amd64/amd64/pmap.c @@ -413,6 +417,7 @@ static void free_pv_chunk(struct pv_chunk *pc); static void free_pv_entry(pmap_t pmap, pv_entry_t pv); static pv_entry_t get_pv_entry(pmap_t pmap, struct rwlock **lockp); static int popcnt_pc_map_elem(uint64_t elem); +static int popcnt_pc_map_elem_pq(uint64_t elem); static vm_page_t reclaim_pv_chunk(pmap_t locked_pmap, struct rwlock **lockp); static void reserve_pv_entries(pmap_t pmap, int needed, struct rwlock **lockp); @@ -2997,6 +3020,29 @@ popcnt_pc_map_elem(uint64_t elem) } /* + * The erratas for Intel processors state that "POPCNT Instruction May + * Take Longer to Execute Than Expected". It is believed that the + * issue is the spurious dependency on the destination register. + * Provide a hint to the register rename logic that the destination + * value is overwritten, by clearing it, as suggested in the + * optimization manual. It should be cheap for unaffected processors + * as well. + * + * Reference numbers for erratas are + * 4th Gen Core: HSD146 + * 5th Gen Core: BDM85 + */ +static int +popcnt_pc_map_elem_pq(uint64_t elem) +{ + u_long result; + + __asm __volatile("xorl %k0,%k0;popcntq %1,%0" + : "=&r" (result) : "rm" (elem)); + return (result); +} + +/* * Ensure that the number of spare PV entries in the specified pmap meets or * exceeds the given count, "needed". * @@ -3029,9 +3075,9 @@ retry: free += popcnt_pc_map_elem(pc->pc_map[1]); free += popcnt_pc_map_elem(pc->pc_map[2]); } else { - free = popcntq(pc->pc_map[0]); - free += popcntq(pc->pc_map[1]); - free += popcntq(pc->pc_map[2]); + free = popcnt_pc_map_elem_pq(pc->pc_map[0]); + free += popcnt_pc_map_elem_pq(pc->pc_map[1]); + free += popcnt_pc_map_elem_pq(pc->pc_map[2]); } if (free == 0) break;
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20150320130216.GS2379>