Date: Sat, 7 May 2011 13:47:48 +0300
From: Kostik Belousov <kostikbel@gmail.com>
To: Andriy Gapon <avg@freebsd.org>
Cc: current@freebsd.org
Subject: Re: bitcount32: replace lengthy comment with SWAR reference
Message-ID: <20110507104748.GL48734@deviant.kiev.zoral.com.ua>
In-Reply-To: <4DC517B3.8050502@FreeBSD.org>
References: <4DC517B3.8050502@FreeBSD.org>

On Sat, May 07, 2011 at 12:58:11PM +0300, Andriy Gapon wrote:
>
> I think that the SWAR reference should be more concise and should lead an
> interested reader to more information on the topic (and more up-to-date as well).
>
The SWAR acronym seems to be a Wikipedia invention. Without the abbreviation,
the trimmed down comment looks better, IMHO.

> bitcount32: replace lengthy comment with SWAR reference
>
> diff --git a/sys/sys/systm.h b/sys/sys/systm.h
> index 48bb33c..5218988 100644
> --- a/sys/sys/systm.h
> +++ b/sys/sys/systm.h
> @@ -383,44 +383,8 @@ int alloc_unrl(struct unrhdr *uh);
>  void free_unr(struct unrhdr *uh, u_int item);
>
>  /*
> - * This is about as magic as it gets.  fortune(1) has got similar code
> - * for reversing bits in a word.  Who thinks up this stuff??
> - *
> - * Yes, it does appear to be consistently faster than:
> - * while (i = ffs(m)) {
> - *     m >>= i;
> - *     bits++;
> - * }
> - * and
> - * while (lsb = (m & -m)) {    // This is magic too
> - *     m &= ~lsb;              // or: m ^= lsb
> - *     bits++;
> - * }
> - * Both of these latter forms do some very strange things on gcc-3.1 with
> - * -mcpu=pentiumpro and/or -march=pentiumpro and/or -O or -O2.
> - * There is probably an SSE or MMX popcnt instruction.
> - *
> - * I wonder if this should be in libkern?
> - *
> - * XXX Stop the presses!  Another one:
> - * static __inline u_int32_t
> - * popcnt1(u_int32_t v)
> - * {
> - *     v -= ((v >> 1) & 0x55555555);
> - *     v = (v & 0x33333333) + ((v >> 2) & 0x33333333);
> - *     v = (v + (v >> 4)) & 0x0F0F0F0F;
> - *     return (v * 0x01010101) >> 24;
> - * }
> - * The downside is that it has a multiply.  With a pentium3 with
> - * -mcpu=pentiumpro and -march=pentiumpro then gcc-3.1 will use
> - * an imull, and in that case it is faster.  In most other cases
> - * it appears slightly slower.
> - *
> - * Another variant (also from fortune):
> - * #define BITCOUNT(x) (((BX_(x)+(BX_(x)>>4)) & 0x0F0F0F0F) % 255)
> - * #define BX_(x)      ((x) - (((x)>>1)&0x77777777)             \
> - *                          - (((x)>>2)&0x33333333)             \
> - *                          - (((x)>>3)&0x11111111))
> + * Population count algorithm using SWAR approach
> + * - "SIMD Within A Register".
>   */
>  static __inline uint32_t
>  bitcount32(uint32_t x)
>
> --
> Andriy Gapon
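
For readers who have not seen the technique the comment refers to, here is a
minimal sketch in C. It reuses the popcnt1() variant quoted in the removed
comment (renamed popcnt_swar here) and adds a simple loop for comparison.
Both function names are made up for this illustration, and this is not
necessarily the body of bitcount32() in systm.h, which the patch does not show.

```c
#include <stdint.h>

/*
 * SWAR population count: treat the 32-bit word as packed 2-, 4- and
 * 8-bit fields and sum the bits of all fields in parallel.
 */
static inline uint32_t
popcnt_swar(uint32_t v)
{
    v -= (v >> 1) & 0x55555555u;                      /* 16 two-bit counts */
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u); /* 8 four-bit counts */
    v = (v + (v >> 4)) & 0x0F0F0F0Fu;                 /* 4 per-byte counts */
    return (v * 0x01010101u) >> 24;                   /* sum the four bytes */
}

/*
 * Reference version for comparison: clear the lowest set bit on each
 * iteration, so the loop runs once per set bit.
 */
static inline uint32_t
popcnt_loop(uint32_t m)
{
    uint32_t bits = 0;

    while (m != 0) {
        m &= m - 1;   /* drop the lowest set bit */
        bits++;
    }
    return bits;
}
```

The SWAR form has no branches and a fixed instruction count; the removed
comment notes its one multiply as the downside.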