From owner-freebsd-arch@FreeBSD.ORG Thu Nov 15 22:54:37 2012 Return-Path: Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 003B9ED6 for ; Thu, 15 Nov 2012 22:54:36 +0000 (UTC) (envelope-from jkim@FreeBSD.org) Received: from hammer.pct.niksun.com (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 94AF38FC12; Thu, 15 Nov 2012 22:54:36 +0000 (UTC) Message-ID: <50A57279.3040607@FreeBSD.org> Date: Thu, 15 Nov 2012 17:53:45 -0500 From: Jung-uk Kim User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:16.0) Gecko/20121031 Thunderbird/16.0.2 MIME-Version: 1.0 To: Bruce Evans Subject: Re: [RFC] Generic population count function References: <50A43B52.8030102@FreeBSD.org> <20121115181009.D42162@besplex.bde.org> In-Reply-To: <20121115181009.D42162@besplex.bde.org> X-Enigmail-Version: 1.4.5 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-arch@FreeBSD.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Nov 2012 22:54:37 -0000 -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-11-15 03:32:50 -0500, Bruce Evans wrote: > The implementation of the ffs() family has similar non-problems. > No one cares because they are rarely used, but... cperciva@ wrote > the lookup table part of the following set of implementations of > ffs() and fls(). A lookup table is the best general method, though > it can be beaten by unportable methodw in some cases. I added more > methods and a test framework to this (test framework not always > included). The methods are: - gcc builtin - cpufunc inline Talking about cpufunc ffsl(), I found something strange about clang: % cat ffs.c #include #include int ffs_cpufunc(long x) /* copied from cpufunc.h as is */ { return (x == 0 ? x : (int)bsfq((u_long)x) + 1); } int ffs_builtin(long x) { return (__builtin_ffsl(x)); } % clang -O2 -c ffs.c % objdump -d ffs.o ffs.o: file format elf64-x86-64-freebsd Disassembly of section .text: 0000000000000000 : 0: 48 85 ff test %rdi,%rdi 3: 74 21 je 26 5: 48 89 7c 24 f8 mov %rdi,-0x8(%rsp) a: 48 0f bc 4c 24 f8 bsf -0x8(%rsp),%rcx 10: 48 c1 e1 20 shl $0x20,%rcx 14: 48 b8 00 00 00 00 01 mov $0x100000000,%rax 1b: 00 00 00 1e: 48 01 c8 add %rcx,%rax 21: 48 c1 e8 20 shr $0x20,%rax 25: c3 retq 26: 31 c0 xor %eax,%eax 28: c3 retq 29: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) 0000000000000030 : 30: 48 0f bc cf bsf %rdi,%rcx 34: ff c1 inc %ecx 36: 31 c0 xor %eax,%eax 38: 48 85 ff test %rdi,%rdi 3b: 0f 45 c1 cmovne %ecx,%eax 3e: c3 retq Note clang generates very inefficient code for ffsl() from cpufunc.h. It is using memory, setting an unused bit, etc. :-( Both GCC 4.2 and 4.8 generate "good code", of course: % gcc -O2 -c ffs.c % objdump -d ffs.o ffs.o: file format elf64-x86-64-freebsd Disassembly of section .text: 0000000000000000 : 0: 31 c0 xor %eax,%eax 2: 48 85 ff test %rdi,%rdi 5: 74 07 je e 7: 48 0f bc c7 bsf %rdi,%rax b: 83 c0 01 add $0x1,%eax e: f3 c3 repz retq 0000000000000010 : 10: 48 0f bc ff bsf %rdi,%rdi 14: 48 c7 c0 ff ff ff ff mov $0xffffffffffffffff,%rax 1b: 48 0f 44 f8 cmove %rax,%rdi 1f: 48 83 c7 01 add $0x1,%rdi 23: 89 f8 mov %edi,%eax 25: c3 retq % gcc48 -O2 -c ffs.c % objdump -d ffs.o ffs.o: file format elf64-x86-64-freebsd Disassembly of section .text: 0000000000000000 : 0: 31 c0 xor %eax,%eax 2: 48 85 ff test %rdi,%rdi 5: 75 09 jne 10 7: f3 c3 repz retq 9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) 10: 48 0f bc c7 bsf %rdi,%rax 14: 83 c0 01 add $0x1,%eax 17: c3 retq 18: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1) 1f: 00 0000000000000020 : 20: 48 0f bc c7 bsf %rdi,%rax 24: 48 c7 c2 ff ff ff ff mov $0xffffffffffffffff,%rdx 2b: 48 0f 44 c2 cmove %rdx,%rax 2f: 48 83 c0 01 add $0x1,%rax 33: c3 retq FYI, Jung-uk Kim -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) Comment: Using GnuPG with Mozilla - http://www.enigmail.net/ iEYEARECAAYFAlClcnkACgkQmlay1b9qnVPQYACgjEYpPDgLfkskpm3+2vRNjCuJ n5AAn30JxrFS9kLoejh+4BZFD/MnfAvT =4yS7 -----END PGP SIGNATURE-----