Date:      Sat, 21 Mar 2015 09:42:51 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, John Baldwin <jhb@freebsd.org>
Subject:   Re: svn commit: r280279 - head/sys/sys
Message-ID:  <20150321085923.U1046@besplex.bde.org>
In-Reply-To: <20150320130216.GS2379@kib.kiev.ua>
References:  <201503201027.t2KAR6Ze053047@svn.freebsd.org> <20150320130216.GS2379@kib.kiev.ua>

On Fri, 20 Mar 2015, Konstantin Belousov wrote:

> On Fri, Mar 20, 2015 at 10:27:06AM +0000, John Baldwin wrote:
>> Author: jhb
>> Date: Fri Mar 20 10:27:06 2015
>> New Revision: 280279
>> URL: https://svnweb.freebsd.org/changeset/base/280279
>>
>> Log:
>>   Expand the bitcount* API to support 64-bit integers, plain ints and longs
>>   and create a "hidden" API that can be used in other system headers without
>>   adding namespace pollution.
>>   - If the POPCNT instruction is enabled at compile time, use
>>     __builtin_popcount*() to implement __bitcount*(), otherwise fall back
>>     to software implementations.

> Are you aware of the Haswell errata HSD146 ?  I see the described behaviour

I wasn't.

> on machines back to SandyBridge, but not on Nehalems.
> HSD146.   POPCNT Instruction May Take Longer to Execute Than Expected
> Problem: POPCNT instruction execution with a 32 or 64 bit operand may be
> delayed until previous non-dependent instructions have executed.

If it only affects performance, then it is up to the compiler to fix it.

> Jilles noted that gcc head and 4.9.2 already provides a workaround by
> xoring the dst register.  I have some patch for amd64 pmap, see the end
> of the message.

IIRC, the patch never uses asm, but intentionally uses the popcount
builtin to avoid complications.
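
A minimal sketch of the builtin-with-fallback approach described in the
commit log, assuming GCC or clang.  The name sketch_bitcount32() is
hypothetical and this is not the committed code; __POPCNT__ is the macro
those compilers predefine when POPCNT is enabled (e.g. with -mpopcnt).
Since only the builtin is used, any register-clearing workaround for the
erratum is left to the compiler:

```c
#include <stdint.h>

/* Hypothetical name for illustration; not the function from r280279. */
static inline int
sketch_bitcount32(uint32_t x)
{
#ifdef __POPCNT__
	/* POPCNT enabled at compile time: let the compiler emit it. */
	return (__builtin_popcount(x));
#else
	/* Software fallback: sum the bits in progressively wider fields. */
	x = x - ((x >> 1) & 0x55555555);		/* 2-bit sums */
	x = (x & 0x33333333) + ((x >> 2) & 0x33333333);	/* 4-bit sums */
	x = (x + (x >> 4)) & 0x0f0f0f0f;		/* 8-bit sums */
	return ((int)((x * 0x01010101) >> 24));		/* add the 4 bytes */
#endif
}
```

Both branches return the same value for every input, so callers need not
know which one was compiled in.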

>>   - Use the existing bitcount16() and bitcount32() from <sys/systm.h> to
>>     implement the non-POPCNT __bitcount16() and __bitcount32() in
>>     <sys/types.h>.
> Why is it in sys/types.h ?

To make it easier to use, while minimizing namespace pollution and
inefficiencies.  Like the functions used to implement ntohl(), except
the implementation is MI so it doesn't need to be in <machine>.
(The functions used to implement ntohl() are in machine/endian.h.
sys/types.h always includes that, so it makes little difference to
pollution and inefficiency that the implementation is not more directly
in machine/_types.h.  bitcount is simpler and not burdened by
compatibility, so it doesn't need a separate header.)
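
The "hidden" API pattern can be sketched roughly as follows.  This is
only an illustration under assumed names (__sketch_bitcount16 and
sketch_bitcount16 are hypothetical, and the bit-clearing loop merely
stands in for a real implementation); the two halves would live in
different headers, as the comments note:

```c
#include <stdint.h>

/*
 * Would live in the low-level types header.  Only a reserved
 * double-underscore name is defined, so applications that include
 * the header see no new identifier in their own namespace.
 */
static inline int
__sketch_bitcount16(uint16_t x)
{
	int n;

	/* Clear the lowest set bit until none remain. */
	for (n = 0; x != 0; x &= x - 1)
		n++;
	return (n);
}

/*
 * Would live in a kernel-only header.  The unprefixed spelling is
 * visible only to code that includes this second header.
 */
#define	sketch_bitcount16(x)	__sketch_bitcount16((uint16_t)(x))
```

Other system headers can then call the double-underscore version
directly without pulling in the unprefixed name.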

Bruce


