From owner-freebsd-current@FreeBSD.ORG Sat May 7 10:47:54 2011
Date: Sat, 7 May 2011 13:47:48 +0300
From: Kostik Belousov
To: Andriy Gapon
Cc: current@freebsd.org
Subject: Re: bitcount32: replace lengthy comment with SWAR reference
Message-ID: <20110507104748.GL48734@deviant.kiev.zoral.com.ua>
References: <4DC517B3.8050502@FreeBSD.org>
In-Reply-To: <4DC517B3.8050502@FreeBSD.org>
List-Id: Discussions about the use of FreeBSD-current

On Sat, May 07, 2011 at 12:58:11PM +0300, Andriy Gapon wrote:
>
> I think that the SWAR reference should be more concise and should lead an
> interested reader to more information on the topic (and more up-to-date as well).
>
The SWAR acronym seems to be a Wikipedia invention. Without the abbreviation,
the trimmed-down comment looks better, IMHO.

> bitcount32: replace lengthy comment with SWAR reference
>
> diff --git a/sys/sys/systm.h b/sys/sys/systm.h
> index 48bb33c..5218988 100644
> --- a/sys/sys/systm.h
> +++ b/sys/sys/systm.h
> @@ -383,44 +383,8 @@ int alloc_unrl(struct unrhdr *uh);
>  void free_unr(struct unrhdr *uh, u_int item);
>
>  /*
> - * This is about as magic as it gets.  fortune(1) has got similar code
> - * for reversing bits in a word.  Who thinks up this stuff??
> - *
> - * Yes, it does appear to be consistently faster than:
> - * while (i = ffs(m)) {
> - *     m >>= i;
> - *     bits++;
> - * }
> - * and
> - * while (lsb = (m & -m)) {    // This is magic too
> - *     m &= ~lsb;              // or: m ^= lsb
> - *     bits++;
> - * }
> - * Both of these latter forms do some very strange things on gcc-3.1 with
> - * -mcpu=pentiumpro and/or -march=pentiumpro and/or -O or -O2.
> - * There is probably an SSE or MMX popcnt instruction.
> - *
> - * I wonder if this should be in libkern?
> - *
> - * XXX Stop the presses!  Another one:
> - * static __inline u_int32_t
> - * popcnt1(u_int32_t v)
> - * {
> - *     v -= ((v >> 1) & 0x55555555);
> - *     v = (v & 0x33333333) + ((v >> 2) & 0x33333333);
> - *     v = (v + (v >> 4)) & 0x0F0F0F0F;
> - *     return (v * 0x01010101) >> 24;
> - * }
> - * The downside is that it has a multiply.  With a pentium3 with
> - * -mcpu=pentiumpro and -march=pentiumpro then gcc-3.1 will use
> - * an imull, and in that case it is faster.  In most other cases
> - * it appears slightly slower.
> - *
> - * Another variant (also from fortune):
> - * #define BITCOUNT(x) (((BX_(x)+(BX_(x)>>4)) & 0x0F0F0F0F) % 255)
> - * #define BX_(x)      ((x) - (((x)>>1)&0x77777777) \
> - *                          - (((x)>>2)&0x33333333) \
> - *                          - (((x)>>3)&0x11111111))
> + * Population count algorithm using SWAR approach
> + * - "SIMD Within A Register".
>   */
>  static __inline uint32_t
>  bitcount32(uint32_t x)
>
> --
> Andriy Gapon
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
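For anyone who wants to check the arithmetic behind the popcnt1 variant in the removed comment, here is a minimal userland sketch. The names popcnt_swar, popcnt_naive and popcnt_selfcheck are made up for illustration and are not in systm.h; the sketch follows the popcnt1 steps and is not claimed to match what bitcount32 itself does. The comments on each step are my reading of the code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * SWAR population count following the popcnt1 steps from the removed
 * comment.  The name popcnt_swar is hypothetical, not from systm.h.
 */
static inline uint32_t
popcnt_swar(uint32_t v)
{
	/* Each 2-bit field now holds the number of set bits in that pair. */
	v -= (v >> 1) & 0x55555555u;
	/* Add neighbouring pairs: each 4-bit field holds a count of 0..4. */
	v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u);
	/* Add neighbouring nibbles: each byte holds a count of 0..8. */
	v = (v + (v >> 4)) & 0x0F0F0F0Fu;
	/* The multiply accumulates all four byte counts in the top byte. */
	return (v * 0x01010101u) >> 24;
}

/* Reference implementation: look at one bit per iteration. */
static inline uint32_t
popcnt_naive(uint32_t v)
{
	uint32_t bits = 0;

	while (v != 0) {
		bits += v & 1u;
		v >>= 1;
	}
	return bits;
}

/* Compare both versions on fixed values and on a pseudo-random sweep. */
static inline void
popcnt_selfcheck(void)
{
	uint32_t x = 0x12345678u;
	int i;

	assert(popcnt_swar(0u) == 0u);
	assert(popcnt_swar(0xFFFFFFFFu) == 32u);
	for (i = 0; i < 100000; i++) {
		/* Simple linear congruential step, only to vary the input. */
		x = x * 1664525u + 1013904223u;
		assert(popcnt_swar(x) == popcnt_naive(x));
	}
}
```

Calling popcnt_selfcheck() from a small test program's main() should return silently if the two functions agree. It says nothing about speed, which is what the old comment was mostly arguing about.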