Date: Sun, 7 Jun 2015 10:43:03 -0700 From: Davide Italiano <davide@freebsd.org> To: Garrett Cooper <yaneurabeya@gmail.com> Cc: Konstantin Belousov <kostikbel@gmail.com>, Hans Petter Selasky <hps@selasky.org>, "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>, Erich Dollansky <erichsfreebsdlist@alogt.com> Subject: Re: allow ffs & co. a binary search Message-ID: <CACYV=-GQ=WHyv3-6nghsyvZpsHyxBjgGtM8wprjBBi730t%2BeCg@mail.gmail.com> In-Reply-To: <A9391ACC-A611-492C-B2F0-DDD701054D7A@gmail.com> References: <20150607081315.7c0f09fb@B85M-HD3-0.alogt.com> <5573EA5E.40806@selasky.org> <20150607195245.62dc191f@B85M-HD3-0.alogt.com> <20150607135453.GH2499@kib.kiev.ua> <CACYV=-Gdyr7kts_=-u8FWCXNjygL4GxSdnvepz2D34XLv%2BJ-jw@mail.gmail.com> <CACYV=-GXmBxx3aXLPU1BjQvN3JvkhFVKHmLi6y2qUBaDdTdUKA@mail.gmail.com> <A9391ACC-A611-492C-B2F0-DDD701054D7A@gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, Jun 7, 2015 at 10:23 AM, Garrett Cooper <yaneurabeya@gmail.com> wro= te: > On Jun 7, 2015, at 9:44, Davide Italiano <davide@freebsd.org> wrote: > >> On Sun, Jun 7, 2015 at 9:37 AM, Davide Italiano <davide@freebsd.org> wro= te: >>> On Sun, Jun 7, 2015 at 6:54 AM, Konstantin Belousov <kostikbel@gmail.co= m> wrote: >>>> On Sun, Jun 07, 2015 at 07:52:45PM +0800, Erich Dollansky wrote: >>>>> What I saw is that all CPUs except ARM uses the software version [of = ffs]. >>>> >>>> Without quantifiers, this statement is not true. i386 libc function ff= s(3) >>>> uses bsfl instruction to do the job. Compilers know about ffs(3) and = friends >>>> as well, so e.g. gcc 5.1.0 generates the following code for the given >>>> fragment: >>>> return (ffs(x) + 1); >>>> is translated to >>>> 0: 0f bc c7 bsf %edi,%eax >>>> 3: ba ff ff ff ff mov $0xffffffff,%edx >>>> 8: 0f 44 c2 cmove %edx,%eax >>>> b: 83 c0 02 add $0x2,%eax >>>> (arg in %edi, result in %eax). >>>> >>>> I wrote a patch for amd64 libc long time ago to convert ffs/fls etc to= use >>>> of the bitstring instruction, but Bruce Evans argued that this would b= e >>>> excessive. Your patch is excessive for the similar reasons. >>>> >>>> My guess is that significantly clever compiler would recognize a patte= rn >>>> used by native ffs implementation and automatically use bitstring >>>> instructions. E.g., this already happens with popcnt and recent >>>> gcc/clang, I am just lazy to verify ffs. >>> >>> Clang trunk to the best of my knowledgde hasn't a way to recognize >>> ffs() pattern. >>> http://llvm.org/docs/doxygen/html/LoopIdiomRecognize_8cpp_source.html >>> I can't comment about gcc as long as I'm not familiar with the implemen= tation. >> >> Also, FWIW, for the fragment provided by Kostik, clang seems to >> generate more instructions than gcc does, >> I'll bring this upstream. >> >> 0: 0f bc c7 bsf %edi,%eax >> 3: b9 20 00 00 00 mov $0x20,%ecx >> 8: 0f 45 c8 cmovne %eax,%ecx >> b: 83 c1 02 add $0x2,%ecx >> e: 85 ff test %edi,%edi >> 10: b8 01 00 00 00 mov $0x1,%eax >> 15: 0f 45 c1 cmovne %ecx,%eax > > Hi Davide, > Could you please post a) the clang/gcc compiler versions and b) t= he compilation options used when producing the snippet? clang -O0 for insta= nce produces a lot of code, whereas -O2 produces considerably less (to the = point that there have been a lot of complaints at $work about being able to= debug clang-compiled programs because things have all been optimized out). > Thanks! > -NGie About the version, as previously mentioned, it's trunk (as per today), r239= 228. About the flags, it's -O3 -fomit-frame-pointer. I'm able to obtain the same code Kostik got w/ gcc and the same flag, so I definitely feel like clang does something different, which result in more instructions, and I'll investigate further on this. Thanks, --=20 Davide "There are no solved problems; there are only problems that are more or less solved" -- Henri Poincare
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CACYV=-GQ=WHyv3-6nghsyvZpsHyxBjgGtM8wprjBBi730t%2BeCg>