Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 11 May 2019 20:52:44 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Doug Moore <dougm@freebsd.org>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org,  svn-src-head@freebsd.org
Subject:   Re: svn commit: r347484 - head/sys/kern
Message-ID:  <20190511202741.H1251@besplex.bde.org>
In-Reply-To: <201905110909.x4B99A5L006389@repo.freebsd.org>
References:  <201905110909.x4B99A5L006389@repo.freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sat, 11 May 2019, Doug Moore wrote:

> Log:
>  When bitpos can't be implemented with an inline ffs* instruction,
>  change the binary search so that it does not depend on a single bit
>  only being set in the bitmask. Use bitpos more generally, and avoid
>  some clearing of bits to accommodate its current behavior.

Inline ffs*() is badly implemented in general (it is mostly not implemented
in the kernel).  This makes it almost useless here, and (non-inline) ffs*()
(not used here) often slower than direct methods.

> Modified: head/sys/kern/subr_blist.c
> ==============================================================================
> --- head/sys/kern/subr_blist.c	Sat May 11 04:18:06 2019	(r347483)
> +++ head/sys/kern/subr_blist.c	Sat May 11 09:09:10 2019	(r347484)
> ...
> +static inline int
> +bitpos(u_daddr_t mask)
> +{
> +
> 	switch (sizeof(mask)) {
> #ifdef HAVE_INLINE_FFSLL
> 	case sizeof(long long):
> 		return (ffsll(mask) - 1);
> #endif

This uses the long long abomination.  Only amd64 has inline ffsll().

> +#ifdef HAVE_INLINE_FFS
> +	case sizeof(int):
> +		return (ffs(mask) - 1);
> +#endif

This is unreachable, since sizeof(int) is 4 on all supported arches, and
sizeof(mask) is 8 on all arches.

> 	default:
> -		lo = 0;
> -		hi = BLIST_BMAP_RADIX;
> -		while (lo + 1 < hi) {
> -			mid = (lo + hi) >> 1;
> -			if ((mask >> mid) != 0)
> -				lo = mid;
> -			else
> -				hi = mid;
> -		}
> -		return (lo);
> +		return (generic_bitpos(mask));
> 	}
> }

So the default cause is used except on amd64.  I think the direct code
for it is not as slow as the generic (libkern) ffs*(), except the generic
ffs*() is simpler so the compiler might inline it internals to
__builtin_ffs*().  Not that -ffreestanding prevents inlining extern ffs*().

Since 64/8 is 2 times 4 and sizeof(int) >= 4 in POSIX, it is trivial to
implement ffs64() using 2 inline ffs()'s, and that is exactly what clang
does on i386 to implement __builtin_ffsl().  Unfortunately, most arches
also don't implement inline ffs().  Only amd64, i386 and sparc64 implement
it.

subr_blist.c is also complilable outside of the kernel.  I don't like that.
The kernel macros HAVE_INLINE_* don't exist outside of the kernel, but
ffs*() would work better outside of the kernel if it is used blindly since
inlining of it isn't prevented by -ffreestanding.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20190511202741.H1251>