Date: Sat, 21 Mar 2015 09:42:51 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, John Baldwin <jhb@freebsd.org>
Subject: Re: svn commit: r280279 - head/sys/sys
Message-ID: <20150321085923.U1046@besplex.bde.org>
In-Reply-To: <20150320130216.GS2379@kib.kiev.ua>
References: <201503201027.t2KAR6Ze053047@svn.freebsd.org> <20150320130216.GS2379@kib.kiev.ua>

On Fri, 20 Mar 2015, Konstantin Belousov wrote:

> On Fri, Mar 20, 2015 at 10:27:06AM +0000, John Baldwin wrote:
>> Author: jhb
>> Date: Fri Mar 20 10:27:06 2015
>> New Revision: 280279
>> URL: https://svnweb.freebsd.org/changeset/base/280279
>>
>> Log:
>>   Expand the bitcount* API to support 64-bit integers, plain ints and longs
>>   and create a "hidden" API that can be used in other system headers without
>>   adding namespace pollution.
>>   - If the POPCNT instruction is enabled at compile time, use
>>     __builtin_popcount*() to implement __bitcount*(), otherwise fall back
>>     to software implementations.
> Are you aware of the Haswell errata HSD146 ?  I see the described behaviour

I wasn't.

> on machines back to SandyBridge, but not on Nehalems.
> HSD146.   POPCNT Instruction May Take Longer to Execute Than Expected
> Problem: POPCNT instruction execution with a 32 or 64 bit operand may be
> delayed until previous non-dependent instructions have executed.

If it only affects performance, then it is up to the compiler to fix it.

> Jilles noted that gcc head and 4.9.2 already provides a workaround by
> xoring the dst register.  I have some patch for amd64 pmap, see the end
> of the message.

IIRC, the patch never uses asm, but intentionally uses the popcount
builtin to avoid complications.

>>   - Use the existing bitcount16() and bitcount32() from <sys/systm.h> to
>>     implement the non-POPCNT __bitcount16() and __bitcount32() in
>>     <sys/types.h>.
> Why is it in sys/types.h ?

To make it easier to use, while minimizing namespace pollution and
inefficiencies.  Like the functions used to implement ntohl(), except the
implementation is MI so it doesn't need to be in <machine>.  (The functions
used to implement ntohl() are in machine/endian.h.  sys/types.h always
includes that, so it makes little difference to pollution and inefficiency
that the implementation is not more directly in machine/_types.h.)
bitcount is simpler and not burdened by compatibility, so it doesn't need
a separate header.

Bruce
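
For reference, the compile-time selection described in the commit log can be sketched roughly as below. This is not the code from r280279; the sketch_* names are made up for illustration, and it assumes a GCC-compatible compiler that defines __POPCNT__ when the instruction is enabled (e.g. with -mpopcnt) and provides the __builtin_popcount*() builtins. The software fallback is the usual parallel bit-summing method, which may differ in detail from the bitcount32() in <sys/systm.h>.

```c
#include <stdint.h>

/*
 * Hypothetical names (sketch_*); the real API uses __bitcount*().
 * Software fallback: sum bits in parallel within 2-, 4- and 8-bit
 * fields, then add the four byte counts with a multiply.
 */
static inline unsigned
sketch_bitcount32_sw(uint32_t x)
{
	x = x - ((x >> 1) & 0x55555555u);
	x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
	x = (x + (x >> 4)) & 0x0f0f0f0fu;
	return ((uint32_t)(x * 0x01010101u) >> 24);
}

static inline unsigned
sketch_bitcount32(uint32_t x)
{
#ifdef __POPCNT__
	/* POPCNT enabled at compile time: let the compiler emit it. */
	return ((unsigned)__builtin_popcount(x));
#else
	return (sketch_bitcount32_sw(x));
#endif
}

static inline unsigned
sketch_bitcount64(uint64_t x)
{
#ifdef __POPCNT__
	return ((unsigned)__builtin_popcountll(x));
#else
	/* Count each 32-bit half separately and add the results. */
	return (sketch_bitcount32_sw((uint32_t)x) +
	    sketch_bitcount32_sw((uint32_t)(x >> 32)));
#endif
}
```

Using the builtin instead of inline asm leaves any workaround for the HSD146 dependency (such as xoring the destination register first) to the compiler, which is the point made above.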