From owner-freebsd-hackers@FreeBSD.ORG Sun Jun 14 01:59:56 2015 Return-Path: Delivered-To: freebsd-hackers@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C8ED7CB9; Sun, 14 Jun 2015 01:59:56 +0000 (UTC) (envelope-from erichsfreebsdlist@alogt.com) Received: from alogt.com (alogt.com [69.36.191.58]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id A4DE8280; Sun, 14 Jun 2015 01:59:56 +0000 (UTC) (envelope-from erichsfreebsdlist@alogt.com) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=alogt.com; s=default; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References:In-Reply-To:Message-ID:Subject:Cc:To:From:Date; bh=AyC/SFwruEWP3Gurwe/qr3PTl5RjTjrmYSY1LYZle1Q=; b=psYYYbw+T1IthvgGa+2jJeyHSo62hzQinjJg8er77tb8Uj9ESWEsxN6x5nBIoRQssPyst4XDuC+dmmRahTeSO48xlZyOkqhTPoIKdqxyHIPrPR0lEwo4u3Rg2Bt16isYwO4gvLAt23B6HP9mcmisvTxijEYZc/ul6QbZ3qiynJs=; Received: from [114.121.135.137] (port=21877 helo=B85M-HD3-0.alogt.com) by sl-508-2.slc.westdc.net with esmtpsa (TLSv1.2:AES128-GCM-SHA256:128) (Exim 4.85) (envelope-from ) id 1Z3xDA-001vzo-Ei; Sat, 13 Jun 2015 19:59:49 -0600 Date: Sun, 14 Jun 2015 09:59:41 +0800 From: Erich Dollansky To: Richard Yao Cc: Hans Petter Selasky , Ian Lepore , "freebsd-hackers@freebsd.org" Subject: Re: allow ffs & co. a binary search Message-ID: <20150614095941.667035a6@B85M-HD3-0.alogt.com> In-Reply-To: <557ADF46.6030004@gentoo.org> References: <20150607081315.7c0f09fb@B85M-HD3-0.alogt.com> <5579A016.4040800@gentoo.org> <1434034373.1415.3.camel@freebsd.org> <5579A235.80609@gentoo.org> <557A3CF3.3050603@selasky.org> <557AD8B0.6000600@gentoo.org> <557ADF46.6030004@gentoo.org> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - sl-508-2.slc.westdc.net X-AntiAbuse: Original Domain - freebsd.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - alogt.com X-Get-Message-Sender-Via: sl-508-2.slc.westdc.net: authenticated_id: erichsfreebsdlist@alogt.com X-Source: X-Source-Args: X-Source-Dir: X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Jun 2015 01:59:56 -0000 Hi All, what should we do now? There are several approaches now on the table. I would implement this use the built-in methods of the compiler when available and use the Stanford functions in other cases and and keep the current solution as a fall back I would use macros to determine which solution should be used. What are your thoughts? Erich On Fri, 12 Jun 2015 09:31:50 -0400 Richard Yao wrote: > On 06/12/2015 09:03 AM, Richard Yao wrote: > > On 06/11/2015 09:59 PM, Hans Petter Selasky wrote: > >> On 06/11/15 16:59, Richard Yao wrote: > >>> On 06/11/2015 10:52 AM, Ian Lepore wrote: > >>>> On Thu, 2015-06-11 at 10:49 -0400, Richard Yao wrote: > >>>>> On 06/06/2015 08:13 PM, Erich Dollansky wrote: > >>>> [...] > >>>>> > >>>>> The fastest portable way of calculating highest bit set is to > >>>>> use a debrujin sequence: > >>>>> > >>>>> https://graphics.stanford.edu/~seander/bithacks.html#IntegerLogDeBruijn > >>>>> > >>>>> The binary approach is inferior, especially on Pentium 4 > >>>>> processors. > >>>> > >>>> And of course it's crucial that we optimize for Pentium 4 > >>>> processors in 2015. > >>> > >>> The debrujin approach is fast regardless of the architecture. > >>> Anyway, this is a solved problem. There is no need to reinvent > >>> the wheel when the portable solutions are listed on that page at > >>> Stanford. > >>> > >> > >> Hi Richard, > >> > >> ffs() can be implemented like this simply: > >> > >> int ffs_A(uint32_t v) > >> { > >> static const int MultiplyDeBruijnBitPosition[32] = > >> { > >> 0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30, > >> 8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31 > >> }; > >> v = ~(v - 1) & v; > >> return (MultiplyDeBruijnBitPosition[(uint32_t)(v * 0x07C4ACDDU) >> > >> 27]); } > > > > There is a way to do that on the stanford page too: > > > > https://graphics.stanford.edu/~seander/bithacks.html#ZerosOnRightMultLookup > > My apologies. I inadvertently linked to the wrong variant of that in > my initial email. It has been a while since I visited that page and I > was juggling a few things when I wrote that email, so it was easy to > confuse the two. > > Anyway, it would be possible to do something like: > > #if defined(__GNUC__) || (defined(__clang__) && > __has_builtin(__builtin_ffs)) > # define ffs(n) __builtin_ffs(n) > #else > /* > * From Sean Eron Anderson's Bit Twiddling Hacks > * > https://graphics.stanford.edu/~seander/bithacks.html#ZerosOnRightMultLookup > */ > static inline int > ffs(uint32_t v) > { > static const int MultiplyDeBruijnBitPosition[32] = > { > 0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30, > 8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31 > }; > v = ~(v - 1) & v; > return (MultiplyDeBruijnBitPosition[(uint32_t)(v * 0x07C4ACDDU) >> > 27]); } > #endif > > If there were an autotools-like check for `__has_builtin()`, then an > explicit list of compilers known to support `__builtin_ffs()` would > not be necessary, although a list of compilers that lack support for > `__has_builtin()` (such as GCC) would still be necessary. Those > building with compilers that do not support `__builtin_ffs()` could > fallback to the DeBrujin approach. >