From owner-svn-src-all@FreeBSD.ORG  Fri Dec  3 07:08:06 2010
Return-Path: <owner-svn-src-all@FreeBSD.ORG>
Delivered-To: svn-src-all@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 60F79106566C;
	Fri,  3 Dec 2010 07:08:06 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail05.syd.optusnet.com.au (mail05.syd.optusnet.com.au
	[211.29.132.186])
	by mx1.freebsd.org (Postfix) with ESMTP id D916C8FC16;
	Fri,  3 Dec 2010 07:08:05 +0000 (UTC)
Received: from c211-30-187-94.carlnfd1.nsw.optusnet.com.au
	(c211-30-187-94.carlnfd1.nsw.optusnet.com.au [211.30.187.94])
	by mail05.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	oB3782mp005589
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Fri, 3 Dec 2010 18:08:03 +1100
Date: Fri, 3 Dec 2010 18:08:02 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Nathan Whitehorn <nwhitehorn@FreeBSD.org>
In-Reply-To: <201012021510.oB2FARBU012912@svn.freebsd.org>
Message-ID: <20101203165036.P1131@besplex.bde.org>
References: <201012021510.oB2FARBU012912@svn.freebsd.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org,
	src-committers@FreeBSD.org
Subject: Re: svn commit: r216122 - head/sys/powerpc/include
X-BeenThere: svn-src-all@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "SVN commit messages for the entire src tree \(except for &quot;
	user&quot; and &quot; projects&quot; \)" <svn-src-all.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/svn-src-all>,
	<mailto:svn-src-all-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/svn-src-all>
List-Post: <mailto:svn-src-all@freebsd.org>
List-Help: <mailto:svn-src-all-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/svn-src-all>,
	<mailto:svn-src-all-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 03 Dec 2010 07:08:06 -0000

On Thu, 2 Dec 2010, Nathan Whitehorn wrote:

> Log:
>  Define bswap macros for constants to allow the compiler to pre-compute
>  byte-swapped versions of compile-time constants. This allows use of
>  bswap() and htole*() in initializers, which is required to cross-build
>  btxld.

The constant case is MI so it shouldn't be repeated in many (now many + 1)
MD files.  I got cognet@ to clean this up a bit, but he wasn't able to
commit the main part (due to unresolved namespace problems IIRC).

>  Obtained from:	sparc64

amd64 is a better base.

> Modified: head/sys/powerpc/include/endian.h
> ==============================================================================
> --- head/sys/powerpc/include/endian.h	Thu Dec  2 13:40:21 2010	(r216121)
> +++ head/sys/powerpc/include/endian.h	Thu Dec  2 15:10:27 2010	(r216122)
> @@ -27,7 +27,6 @@
>  * SUCH DAMAGE.
>  *
>  *	@(#)endian.h	8.1 (Berkeley) 6/10/93
> - *	$NetBSD: endian.h,v 1.7 1999/08/21 05:53:51 simonb Exp $
>  * $FreeBSD$
>  */
>
> @@ -79,17 +78,36 @@
> #define	BYTE_ORDER	_BYTE_ORDER
> #endif
>
> -#ifdef __CC_SUPPORTS___INLINE

__inline is still used, but is no longer ifdefed.  Probably best.  Most
of the __CC_SUPPORTS macros are just usless obfuscations since they
are only used in a few places, and `inline' (but not __inline__) is
now Standard.  E.g., even amd64 and i386 don't use
__CC_SUPPORTS___INLINE in endian.h.  After this commit, only ia64's
endian.h uses __CC_SUPPORTS___INLINE.  The __CC_SUPPORTS_INLINE and
__CC_SUPPORTS___INLINE__ macros are even more useless than
__CC_SUPPORTS___INLINE (the former since plain inline is Standard
and the latter since __inline__ is a style bug which is used mainly
in contrib'ed code that is very unlikely to ifdef its mistakes.

> +#if defined(__GNUCLIKE_BUILTIN_CONSTANT_P)
> +#define	__is_constant(x)	__builtin_constant_p(x)
> +#else
> +#define	__is_constant(x)	0
> +#endif

This won't actually work if __GNUCLIKE_BUILTIN_CONSTANT_P is not defined,
since it says that everything is variable in that case, so it will generate
inline function calls in static initializers and fail messily.  amd64 and
i386 use a better ifdef which results in _BYTEORDER_FUNC_DEFINED being
defined.  This should result in upper layers don't the right thing,
which must include handling the const case so that the interfaces can
be used in static initializers, but in fact upper layers are more broken
than here:

     - of course upper layers don't handle the constant case
     - the only upper layers headers that test _BYTEORDER_FUNC_DEFINED are
       sys/param.h and netinet/in.h.  Since these are not sys/endian.h,
       and since the MD endian.h is included directly in other places
       than these (mainly sys/types.h), there are namespace problems.
     - in sys/param.h and netinet/in.h, _BYTEORDER_FUNC_DEFINED only affects
       the definitions of hton*() and ntohs*().  So most MD byteorder
       functions are left undefined.

In MD layers, only amd64, i386, ia64, and previously powerpc, define
_BYTEORDER_FUNC_DEFINED.  These declarations are confusing but not
wrong.  The byteorder functions that are covered by this macro are
only __hton*() and __ntoh*().  When these are defined, upper layers
refrain from defining hton*() and ntoh*(), so that in the usual case
where the latter are already defined, they are not redefined (although
they should be since redefinition of macros is harmless unless there
is a bug that makes the definitions differ), and in the unusual case
where the underscored functions are not defined, the non-underscored
functions remain undefined too, so that extern functions are used for
the latter.  Untangling this will be difficult, although it is just
an obfuscation in the usual case.

> +
> +#define	__bswap16_const(x)	((((__uint16_t)(x) >> 8) & 0xff) |	\
> +	(((__uint16_t)(x) << 8) & 0xff00))
> +#define	__bswap32_const(x)	((((__uint32_t)(x) >> 24) & 0xff) |	\
> +	(((__uint32_t)(x) >> 8) & 0xff00) |				\
> +	(((__uint32_t)(x)<< 8) & 0xff0000) |				\
> +	(((__uint32_t)(x) << 24) & 0xff000000))
> +#define	__bswap64_const(x)	((((__uint64_t)(x) >> 56) & 0xff) |	\
> +	(((__uint64_t)(x) >> 40) & 0xff00) |				\
> +	(((__uint64_t)(x) >> 24) & 0xff0000) |				\
> +	(((__uint64_t)(x) >> 8) & 0xff000000) |				\
> +	(((__uint64_t)(x) << 8) & ((__uint64_t)0xff << 32)) |		\
> +	(((__uint64_t)(x) << 24) & ((__uint64_t)0xff << 40)) |		\
> +	(((__uint64_t)(x) << 40) & ((__uint64_t)0xff << 48)) |		\
> +	(((__uint64_t)(x) << 56) & ((__uint64_t)0xff << 56)))

All MI.  Even on mips, all these macros, and corresponding functions too,
are MI, despite endianness differences.  All the swap APIs are MI.  It
is only the hton*() and ntoh*() APIs that are MD, since they sometimes
don't swap.

Mips has different mistakes in the ifdef for the constant case:

% #ifndef __ASSEMBLER__
% #if defined(__GNUCLIKE_BUILTIN_CONSTANT_P) && defined(__OPTIMIZE__)
% #define	__is_constant(x)	__builtin_constant_p(x)
% #else
% #define	__is_constant(x)	0
% #endif

__ASSEMBLER__ is not needed for the macros, and shouldn't be needed to
hide the functions, since this file shouldn't be included in asm files.

Using __OPTIMIZE__ mainly breaks the case of static initializers when
compiled without optimizations.

>
> static __inline __uint16_t
> -__bswap16(__uint16_t _x)
> +__bswap16_var(__uint16_t _x)
> {
>
> 	return ((_x >> 8) | ((_x << 8) & 0xff00));
> }
>
> static __inline __uint32_t
> -__bswap32(__uint32_t _x)
> +__bswap32_var(__uint32_t _x)
> {
>
> 	return ((_x >> 24) | ((_x >> 8) & 0xff00) | ((_x << 8) & 0xff0000) |
> @@ -97,7 +115,7 @@ __bswap32(__uint32_t _x)
> }
>
> static __inline __uint64_t
> -__bswap64(__uint64_t _x)
> +__bswap64_var(__uint64_t _x)
> {
>
> 	return ((_x >> 56) | ((_x >> 40) & 0xff00) | ((_x >> 24) & 0xff0000) |
> @@ -106,20 +124,16 @@ __bswap64(__uint64_t _x)
> 	    ((_x << 40) & ((__uint64_t)0xff << 48)) | ((_x << 56)));
> }

All the functions are MI too, since they don't use asm.

All the functions use essentially the same as the "constant" or "optimized"
code, again since they don't use asm.  They should use the macros instead
of repeating them.  All they do better than the macros is supply safety
(avoid evalating args more than once).  But functions are not needed for
this -- old i386 macros were safe at the cost of using a gcc extension
(statement-expression) to be both safe and an expression.  i386 now only
uses 1 MD macro, for 32-bit swaps.  It doesn't support 64-bit swaps.
"constant"/"optimized" 16-bit swaps are apparently not needed, since i386
only has an inline function for them (the asm "optimization" for this case
and perhaps the constant "optimization" for this case were removed a year
or 2 ago).  The function versions also convert the arg in the correct way
(only once via the prototype), where the macros now have cast for each use
of the arg, and used to be broken since they were missing some casts.

IIRC, it was cleaning up the last 2 points that cognet@ mad most progress
in.

>
> +#define	__bswap16(x)	(__is_constant(x) ? __bswap16_const(x) : \
> +	__bswap16_var(x))
> +#define	__bswap32(x)	(__is_constant(x) ? __bswap32_const(x) : \
> +	__bswap32_var(x))
> +#define	__bswap64(x)	(__is_constant(x) ? __bswap64_const(x) : \
> +	__bswap64_var(x))
> +

Unless the functions are in asm, this should be (in MI code):

#define	__bswap16(x)	__bswap16_generic(x)

where __bswap_16_generic() is a macro, etc., except you probably wouldn't
even have another layer of macros for it.

sys/endian.h does all of this quite differently.  It doesn't attempt
any optimizations, except where it uses MD functions that might have them.

Bruce