Date: Fri, 23 Oct 2015 21:02:17 +1100 (EST) From: Bruce Evans <brde@optusnet.com.au> To: "Conrad E. Meyer" <cem@freebsd.org> Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r289765 - in head/sys: conf libkern sys Message-ID: <20151023200654.T1687@besplex.bde.org> In-Reply-To: <201510222028.t9MKSbRu032869@repo.freebsd.org> References: <201510222028.t9MKSbRu032869@repo.freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, 22 Oct 2015, Conrad E. Meyer wrote: > Log: > Add libkern ffsll() for parity with flsll() > > Sponsored by: EMC / Isilon Storage Division This fills a much needed gap, like the one filled by flsll(). It uses the long long abomination. uintmax_t has been Standard since 1999, but is still not supported by the ffs() family. It further expands the design error in ffs(). (ffs() only supports ints. Its behaviour on negative values is undocumented, but needs documentation more than for non-negative values because even in the 2's complement case, the sign bit can be anywhere). The bad BSD documented is for ffs() is duplicated in POSIX in at least old versions. I think this requires ffs() to operate on values, so it cannot work in the 1's complement case. The -0 has all bits 1 but value 0. ffs(-0) by value must be -1, and ABIs are probably allowed to pass -0 as +0 so ffs(-0) is probably -1 even if ffs() is by bits.) Due to this design error, it is not obvious that even u_char is directly supported by the ffs() family. It is supported because POSIX requires 8-bit chars so chars get promoted to non-negative ints when passed to ffs(). This doesn't happen for u_int, but you can promote u_int manually to long long to get correct and pessimal code. This doesn't work for u_foo that is already of larger rank than long long. > Added: head/sys/libkern/ffsll.c > ============================================================================== > --- /dev/null 00:00:00 1970 (empty, because file is newly added) > +++ head/sys/libkern/ffsll.c Thu Oct 22 20:28:37 2015 (r289765) > + ... > +/* > + * Find First Set bit > + */ > +int > +ffsll(long long mask) > +{ > + int bit; > + > + if (mask == 0) > + return (0); > + for (bit = 1; !(mask & 1); bit++) > + mask = (unsigned long long)mask >> 1; > + return (bit); > +} This has the usual implementation for negative values. They are cast to the correct unsigned type to get defined behaviour for the shift. The cast itself gives implementation-defined behaviour. Even when the compiler was gcc so that it was documented in man pages and info pages, the documentation for this was hard to find. Whatever it is, it gives the undocumented behaviour of the function. This implementation is pessimal unless the compiler can convert it to special instructions. On i386, 2 32-bit ffs()'s are more portable and much faster, since each one uses a special instruction and it is not possible to do much better. (But fls64() can be done better in userland using the FPU or perhaps SSE.) clang optimizes similar simple loops for popcount() but doesn't optimize the ones used in libc for the ffs() family. If it did optimize these loops, 2 32-bit ffs()'s with asms inside them wouldn't be so good, but for bswap64() we arranged for compilers to good combining for even more complicated variations. (For bswap*, clang converts the complicated C expression giving the rearrangement to minimal bswap instruction(s), but we hard-code these instructions in asm because gcc can't do this in the old versions in FreeBSD, and then arrange so that clang can combine the asms efficiently on i386.) Efficiency of the ffs() family is rarely important. bitset.h uses ffsl() but bitstring.h uses a simple loop. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20151023200654.T1687>