From owner-freebsd-hackers@FreeBSD.ORG Sun Jun 7 17:43:04 2015 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 50EF1EB for ; Sun, 7 Jun 2015 17:43:04 +0000 (UTC) (envelope-from davide.italiano@gmail.com) Received: from mail-qk0-x236.google.com (mail-qk0-x236.google.com [IPv6:2607:f8b0:400d:c09::236]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 078E915A5 for ; Sun, 7 Jun 2015 17:43:04 +0000 (UTC) (envelope-from davide.italiano@gmail.com) Received: by qkx62 with SMTP id 62so67244188qkx.3 for ; Sun, 07 Jun 2015 10:43:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type:content-transfer-encoding; bh=VJ2LToVLSThIcfR8iC9D+cKK3SmOlBQu63UsvS9pUiw=; b=RdMts/vdH6wBWuA1du8jSjP4KEOAlDdafX8eH0ocwLL7zT+lRbiDvrBbtMFqyCvtVZ /f1cLO/LQpsOdTKd/f62gPn0lvKIl2z0OzPwgbtghoLkbzQrLkLlHCW1/dP822J9jihx mjevJkGQR3qKUwgdjXS8YfgEilDQuttetdxlOEDvKpiM7zLuzyHvn+7LRirPvu/TS1mA mOxqxBbuan8ZVWW35/FihYKjLHsCXSdjA/l01UcdYJjqCCA8xWnPfEpyCoNv2AAjsVV/ ngTneqEiH8SX9Zmd/n7SmERMOPW/GxAsN8y2HVTbcNgwYDXUOONbeuG6goXeL02los0m Kzxw== MIME-Version: 1.0 X-Received: by 10.55.50.141 with SMTP id y135mr25213721qky.91.1433698983136; Sun, 07 Jun 2015 10:43:03 -0700 (PDT) Sender: davide.italiano@gmail.com Received: by 10.96.106.234 with HTTP; Sun, 7 Jun 2015 10:43:03 -0700 (PDT) In-Reply-To: References: <20150607081315.7c0f09fb@B85M-HD3-0.alogt.com> <5573EA5E.40806@selasky.org> <20150607195245.62dc191f@B85M-HD3-0.alogt.com> <20150607135453.GH2499@kib.kiev.ua> Date: Sun, 7 Jun 2015 10:43:03 -0700 X-Google-Sender-Auth: lvyheMGgorn0Z2l9OJmSHqqiTLk Message-ID: Subject: Re: allow ffs & co. a binary search From: Davide Italiano To: Garrett Cooper Cc: Konstantin Belousov , Hans Petter Selasky , "freebsd-hackers@freebsd.org" , Erich Dollansky Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 07 Jun 2015 17:43:04 -0000 On Sun, Jun 7, 2015 at 10:23 AM, Garrett Cooper wro= te: > On Jun 7, 2015, at 9:44, Davide Italiano wrote: > >> On Sun, Jun 7, 2015 at 9:37 AM, Davide Italiano wro= te: >>> On Sun, Jun 7, 2015 at 6:54 AM, Konstantin Belousov wrote: >>>> On Sun, Jun 07, 2015 at 07:52:45PM +0800, Erich Dollansky wrote: >>>>> What I saw is that all CPUs except ARM uses the software version [of = ffs]. >>>> >>>> Without quantifiers, this statement is not true. i386 libc function ff= s(3) >>>> uses bsfl instruction to do the job. Compilers know about ffs(3) and = friends >>>> as well, so e.g. gcc 5.1.0 generates the following code for the given >>>> fragment: >>>> return (ffs(x) + 1); >>>> is translated to >>>> 0: 0f bc c7 bsf %edi,%eax >>>> 3: ba ff ff ff ff mov $0xffffffff,%edx >>>> 8: 0f 44 c2 cmove %edx,%eax >>>> b: 83 c0 02 add $0x2,%eax >>>> (arg in %edi, result in %eax). >>>> >>>> I wrote a patch for amd64 libc long time ago to convert ffs/fls etc to= use >>>> of the bitstring instruction, but Bruce Evans argued that this would b= e >>>> excessive. Your patch is excessive for the similar reasons. >>>> >>>> My guess is that significantly clever compiler would recognize a patte= rn >>>> used by native ffs implementation and automatically use bitstring >>>> instructions. E.g., this already happens with popcnt and recent >>>> gcc/clang, I am just lazy to verify ffs. >>> >>> Clang trunk to the best of my knowledgde hasn't a way to recognize >>> ffs() pattern. >>> http://llvm.org/docs/doxygen/html/LoopIdiomRecognize_8cpp_source.html >>> I can't comment about gcc as long as I'm not familiar with the implemen= tation. >> >> Also, FWIW, for the fragment provided by Kostik, clang seems to >> generate more instructions than gcc does, >> I'll bring this upstream. >> >> 0: 0f bc c7 bsf %edi,%eax >> 3: b9 20 00 00 00 mov $0x20,%ecx >> 8: 0f 45 c8 cmovne %eax,%ecx >> b: 83 c1 02 add $0x2,%ecx >> e: 85 ff test %edi,%edi >> 10: b8 01 00 00 00 mov $0x1,%eax >> 15: 0f 45 c1 cmovne %ecx,%eax > > Hi Davide, > Could you please post a) the clang/gcc compiler versions and b) t= he compilation options used when producing the snippet? clang -O0 for insta= nce produces a lot of code, whereas -O2 produces considerably less (to the = point that there have been a lot of complaints at $work about being able to= debug clang-compiled programs because things have all been optimized out). > Thanks! > -NGie About the version, as previously mentioned, it's trunk (as per today), r239= 228. About the flags, it's -O3 -fomit-frame-pointer. I'm able to obtain the same code Kostik got w/ gcc and the same flag, so I definitely feel like clang does something different, which result in more instructions, and I'll investigate further on this. Thanks, --=20 Davide "There are no solved problems; there are only problems that are more or less solved" -- Henri Poincare