Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 25 Mar 2012 16:31:29 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Dimitry Andric <dim@FreeBSD.org>
Cc:        svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org, tijl@FreeBSD.org, Bruce Evans <brde@optusnet.com.au>
Subject:   Re: svn commit: r233419 - head/sys/x86/include
Message-ID:  <20120325150057.H933@besplex.bde.org>
In-Reply-To: <4F6E3A7D.9020709@FreeBSD.org>
References:  <201203241007.q2OA7MtS024789@svn.freebsd.org> <20120325040224.Q2467@besplex.bde.org> <4F6E3A7D.9020709@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sat, 24 Mar 2012, Dimitry Andric wrote:

> On 2012-03-24 18:48, Bruce Evans wrote:
>> On Sat, 24 Mar 2012, Dimitry Andric wrote:
>>
>>> Log:
>>>  Fix the following clang warning in sys/dev/dcons/dcons.c, caused by the
>>>  recent changes in sys/x86/include/endian.h:
>>>
>>>    sys/dev/dcons/dcons.c:190:15: error: implicit conversion from '__uint32_t' (aka 'unsigned int') to '__uint16_t' (aka 'unsigned short') changes value from 1684238190 to 28526 [-Werror,-Wconstant-conversion]
>>>
>>>  	  buf->magic = ntohl(DCONS_MAGIC);
>>>  		       ^~~~~~~~~~~~~~~~~~
>>>    sys/sys/param.h:306:18: note: expanded from:
>>>    #define ntohl(x)        __ntohl(x)
>>>  			  ^
>>>    ./x86/endian.h:128:20: note: expanded from:
>>>    #define __ntohl(x)      __bswap32(x)
>>>  			  ^
>>>    ./x86/endian.h:78:20: note: expanded from:
>>>  	      __bswap32_gen((__uint32_t)(x)) : __bswap32_var(x))
>>>  			    ^
>>>    ./x86/endian.h:68:26: note: expanded from:
>>>  	  (((__uint32_t)__bswap16(x) << 16) | __bswap16((x) >> 16))
>>>  				  ^
>>>    ./x86/endian.h:75:53: note: expanded from:
>>>  	      __bswap16_gen((__uint16_t)(x)) : __bswap16_var(x)))
>>>  					       ~~~~~~~~~~~~~ ^
>>
>> This warning was discussed before things were committed.
>
> Then why was it committed without fixing the warning first?

Because testing (by tijl) showed that there was no warning.

Testing now (by me) shows the warning for dcons.

>> gcc gives a
>> similar warning, but according to tijl, it can't happen in normal use
>> with either gcc or clang because -Wno-system-headers suppresses warnings
>> in system headers.  -Wnosystem-headers is the default in the kernel,
>> where the above problem happens, so I don't see how it happens.
>
> I don't see that option anywhere in either the world or kernel build.

It is the default for gcc.  It seems to be the default for clang too.

> Why would you not want to warn about system headers?

We do want to warn about system headers.  That's why we add
-Wsystem-headers to CFLAGS in bsd.sys.mk if WARNS >= 1 (to change the
default).  But WARNS is not supported for kernels, and is not normally
set for modules.

>> However, bsd.sys.mk forces the non-default of -Wsystem-headers in
>> userland at WARNS >= 1, so I don't see why there isn't a problem in
>> userland if userland actually uses an expression like
>> ntohl(DCONS_MAGIC).
>
> It only occurs in certain cases, and I have not been able to exactly
> determine when.  It doesn't seem to be related to -Wsystem-headers.

-Wsystem-headers operates mysteriously.  It is reported to not work for
clang, but it works for me in _not_ suppressing the warning for
(x && y || z && t), yet not for the ntohl() warning.  What a system
header is undocumented.  It seems to be related to using the standard
search path: `-I.' to reach <sys> doesn't work, but '-I.' to reach x86
does work.  So I was able to test ./x86/endian.h as a "system header"
and put garbage in it, and got strange results:
- when ./x86/endian.h is a "system" header, clang never warns about
   the ntohl(), irrespective of -Wsystem-headers
- when ./x86/endian.h is not a "system" header, clang always warns about
   the ntohl()
- gcc never warns about the ntohl()
- when ./x86/endian.h is a "system" header, with -Wall both gcc and clang
   warns about (x && y || z && t) according to -Wsystem-headers
- when ./x86/endian.h is not a "system" header, with -Wall both gcc and
   clang always warn about (x && y || z && t)
- when ./x86/endian.h is a "system" header, with -Wimplicit-int clang
   warns about 'x(void);' according to -Wsystem-headers, but gcc never
   warns about it.
- when ./x86/endian.h is not a "system" header, with -Wall both gcc and
   clang always warn about 'x(void);'.

>>>  This is because the __bswapXX_gen() macros (for x86) call the regular
>>>  __bswapXX() macros.  Since the __bswapXX_gen() variants are only called
>>>  when their arguments are constant, there is no need to do that constancy
>>>  check recursively.  Also, it causes the above error with clang.
>>
>> Not true.  The __bswapXX_gen() are called with non-constant args from
>> __bswap64_var() in the i386 case.
>
> Sure, but there it doesn't matter at all.
>
>> As documented there, it is important

Of course it matters there.  It is the difference between gcc generating
2 32-bit bswap instructions for bswap64(var) and 25 lines of shifts and
mask instructions.

>>>  This is because the __bswapXX_gen() macros (for x86) call the regular
>>>  __bswapXX() macros.  Since the __bswapXX_gen() variants are only called
>>>  when their arguments are constant, there is no need to do that constancy
>>>  check recursively.  Also, it causes the above error with clang.
>>
>> Not true.  The __bswapXX_gen() are called with non-constant args from
>> __bswap64_var() in the i386 case.
>
> Sure, but there it doesn't matter at all.

Of course it matters.

>> As documented there, it is important
>> for optimizations that __bswap64_gen() does do the constancy check
>> recursively.  This was discussed before things were committed, and
>> again when jhb siggested breaking it similarly to this commit to avoid
>> a problem on ia64.
>
> Isn't it much easier, and less obtuse, to just jam in two bswaps for the
> i386?  I never saw the reason for making the bswap macros *more*
> complicated instead of less.  If you obfuscate them enough, no compiler
> will be able to optimize properly... :)

No, since the divison of 64 bits into 2 times 32-bits is very easy.
For constants it is much simpler than what it replaced, and the same
method is now used for variables.  Hard-coding 2 32-bit bswaps for
i386 would give a different method for i386, and if done in asm would
inhibit the compiler combining these bswaps.  Although we do have the
corresponding hard asm for bswap32() (gcc needs it but clang doesn't;
clang needs help only for bswap64()).

>>>  Fix it by calling __bswap16_gen() from __bswap32_gen(), and similarly,
>>>  __bswap32_gen() from  __bswap64_gen().
>>
>> This breaks it.  DIring development, I had a correct fix involving
>> casting down the arg.
>
> If it works correctly with clang, please post it.

This mail contains an intermediate version and discusses and fixes this
bug, but is out of date in many respects:

@ From bde@optusnet.com.au Fri Mar  2 22:52:44 2012 +1100
@ Date: Fri, 2 Mar 2012 22:52:41 +1100 (EST)
@ From: Bruce Evans <brde@optusnet.com.au>
@ X-X-Sender: bde@besplex.bde.org
@ To: Bruce Evans <brde@optusnet.com.au>
@ cc: Tijl Coosemans <tijl@freebsd.org>, Dimitry Andric <dim@freebsd.org>, 
@     src-committers@freebsd.org, svn-src-all@freebsd.org, 
@     svn-src-head@freebsd.org
@ Subject: Re: svn commit: r232266 - in head/sys: amd64/include i386/include
@  pc98/include x86/include
@ In-Reply-To: <20120302130435.J929@besplex.bde.org>
@ Message-ID: <20120302223355.Q2543@besplex.bde.org>
@ References: <201202281939.q1SJdtYr084858@svn.freebsd.org>
@  <20120229160522.W2514@besplex.bde.org> <20120229181608.S2887@besplex.bde.org>
@  <201203012346.20648.tijl@freebsd.org> <20120302130435.J929@besplex.bde.org>
@ MIME-Version: 1.0
@ Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
@ Status: RO
@ X-Status: 
@ X-Keywords: 
@ X-UID: 3391
@ 
@ On Fri, 2 Mar 2012, Bruce Evans wrote:
@ 
@ > On Thu, 1 Mar 2012, Tijl Coosemans wrote:
@ >
@ >> On Wednesday 29 February 2012 08:44:54 Bruce Evans wrote:
@ >>> ...
@ >>> So any macro version must use gcc features to be safe.  The following
@ >>> seems to work:
@ >>> 
@ >>> #define	__bswap16(x)	__extension__ ({ __uint16_t __x = x;
@ >>>  	(__uint16_t)(__x << 8 | __x >> 8); })
@ >>> 
@ >>> clang now produces "rolw $8,x" when x is volatile.  This seems to
@ >>> violate volatile semantics.  gcc produces load-rolw-store then.  Both
@ >>> produce "rolw $8,x" when x is not volatile.
@ >> 
@ >> I'll have a closer look at the patch tomorrow, but the Intel
@ >> documentation for the bswap instruction recommends to use xchg for
@ >> bswap16.
@ >
@ > That's what i386 used to do, using an asm (see for example the RELENG_4
@ > version), but the asm was replaced by C code and the compiler apparently
@ > knows better.  This might depend on -mtune.  I tested with the default,
@ > ...
@ 
@ Here is another version.  It has now been tested a bit at runtime.
@ I had to restore almost all of the complications, so the following
@ reduces to mostly style changes including comments about surprising
@ details.  It has 1 fix for clang (the comitted version doesn't
@ work for clang).
@ 
@ % Index: endian.h
@ % ===================================================================
@ % RCS file: /home/ncvs/src/sys/x86/include/endian.h,v
@ % retrieving revision 1.1
@ % diff -u -2 -r1.1 endian.h
@ % --- endian.h	28 Feb 2012 19:39:54 -0000	1.1
@ % +++ endian.h	2 Mar 2012 10:43:53 -0000
@ % @@ -37,8 +37,4 @@
@ %  #include <sys/_types.h>
@ % 
@ % -#ifdef __cplusplus
@ % -extern "C" {
@ % -#endif
@ % -
@ %  /*
@ %   * Define the order of 32-bit words in 64-bit words.
@ % @@ -68,25 +64,17 @@
@ %  #endif
@ % 
@ % -#if defined(__GNUCLIKE_ASM) && defined(__GNUCLIKE_BUILTIN_CONSTANT_P)
@ % -
@ % -#define	__bswap16_const(_x)	(__uint16_t)((_x) << 8 | (_x) >> 8)
@ % +#define	__bswap16_gen(x)	(__uint16_t)((x) << 8 | (x) >> 8)
@ % +#define	__bswap32_gen(x)		\
@ % +	(((__uint32_t)__bswap16(x) << 16) | __bswap16((x) >> 16))
@ % +#define	__bswap64_gen(x)		\
@ % +	(((__uint64_t)__bswap32(x) << 32) | __bswap32((x) >> 32))
@ % 
@ % -#define	__bswap16(_x)			\
@ % -	(__builtin_constant_p(_x) ?	\
@ % -	    __bswap16_const((__uint16_t)(_x)) : __bswap16_var(_x))
@ % -
@ % -#define	__bswap32_const(_x)		\
@ % -	(((__uint32_t)__bswap16(_x) << 16) | __bswap16((_x) >> 16))
@ % -
@ % -#define	__bswap32(_x)			\
@ % -	(__builtin_constant_p(_x) ?	\
@ % -	    __bswap32_const((__uint32_t)(_x)) : __bswap32_var(_x))
@ % +#if defined(__GNUCLIKE_ASM) && defined(__GNUCLIKE_BUILTIN_CONSTANT_P)
@ % 
@ % -#define	__bswap64_const(_x)		\
@ % -	(((__uint64_t)__bswap32(_x) << 32) | __bswap32((_x) >> 32))
@ % +/* The following mess is to avoid multiple evaluation of x. */
@ % 
@ % -#define	__bswap64(_x)			\
@ % -	(__builtin_constant_p(_x) ?	\
@ % -	    __bswap64_const((__uint64_t)(_x)) : __bswap64_var(_x))
@ % +#define	__bswap16(x)			\
@ % +	(__builtin_constant_p(x) ?	\
@ % +	    __bswap16_gen((__uint16_t)(x)) : __bswap16_var((__uint16_t)(x)))
@ 
@ For clang, x must be cast in the `var' clause too, since when x is
@ something like 0x12345678 (having been passed here without conversion
@ by __bswap32()), clang parses the `var' clause and warns about the
@ implicit truncation of 0x12345678 to 0x5678 in it.  This is a bug
@ in clang -- when warning about fine details, it is important not to
@ do it for non-problems.
@ 
@ % 
@ %  static __inline __uint16_t
@ % @@ -94,7 +82,19 @@
@ %  {
@ % 
@ % -	return (__bswap16_const(_x));
@ % +	return (__bswap16_gen(_x));
@ %  }
@ % 
@ % +/*
@ % + * The following messes are old optimizations for gcc.  The messes have
@ % + * been reduced significantly, but I don't know how to implement the
@ % + * preferred way of defining defaults above and overriding them cleanly
@ % + * here.  Even clang needs help for the 64-bit case, so to keep the
@ % + * ifdefs sane we use this for the 32-bit case with clang too.
@ % + */
@ % +
@ % +#define	__bswap32(x)			\
@ % +	(__builtin_constant_p(x) ?	\
@ % +	    __bswap32_gen((__uint32_t)(x)) : __bswap32_var(x))
@ % +
@ %  static __inline __uint32_t
@ %  __bswap32_var(__uint32_t _x)
@ % @@ -105,34 +105,37 @@
@ %  }
@ % 
@ % +#define	__bswap64(x)			\
@ % +	(__builtin_constant_p(x) ?	\
@ % +	    __bswap64_gen((__uint64_t)(x)) : __bswap64_var(x))
@ % +
@ %  static __inline __uint64_t
@ %  __bswap64_var(__uint64_t _x)
@ %  {
@ % -#ifdef	_LP64
@ % +
@ % +#ifdef _LP64
@ %  	__asm ("bswap %0" : "+r" (_x));
@ %  	return (_x);
@ %  #else
@ % -	return (__bswap64_const(_x));
@ % +	/*
@ % +	 * It is important for the optimizations that the following is not
@ % +	 * really generic, but expands to 2 __bswap32_var()'s.
@ % +	 */
@ % +	return (__bswap64_gen(_x));
@ %  #endif
@ %  }
@ % 
@ % -#define	__htonl(x)	__bswap32(x)
@ % -#define	__htons(x)	__bswap16(x)
@ % -#define	__ntohl(x)	__bswap32(x)
@ % -#define	__ntohs(x)	__bswap16(x)
@ % +#else /* !(__GNUCLIKE_ASM && __GNUCLIKE_BUILTIN_CONSTANT_P */
@ % 
@ % -#else /* !(__GNUCLIKE_ASM && __GNUCLIKE_BUILTIN_CONSTANT_P) */
@ % -
@ % -/*
@ % - * No optimizations are available for this compiler.  Fall back to
@ % - * non-optimized functions by defining the constant usually used to prevent
@ % - * redefinition.
@ % - */
@ % -#define	_BYTEORDER_FUNC_DEFINED
@ % +/* XXX these are broken, since they evaluate x more than once. */
@ % +#define	__bswap16(x)	(__bswap16_gen((__uint16_t)(x)))
@ % +#define	__bswap32(x)	(__bswap32_gen((__uint32_t)(x)))
@ % +#define	__bswap64(x)	(__bswap64_gen((__uint64_t)(x)))
@ % 
@ %  #endif /* __GNUCLIKE_ASM && __GNUCLIKE_BUILTIN_CONSTANT_P */
@ % 
@ % -#ifdef __cplusplus
@ % -}
@ % -#endif
@ % +#define	__htonl(x)	__bswap32(x)
@ % +#define	__htons(x)	__bswap16(x)
@ % +#define	__ntohl(x)	__bswap32(x)
@ % +#define	__ntohs(x)	__bswap16(x)
@ % 
@ %  #endif /* !_MACHINE_ENDIAN_H_ */
@ 
@ Here is a cut down version showing the clang bug:
@ 
@ % static inline int
@ % unused(short x)
@ % {
@ % 	return (x);
@ % }
@ % 
@ % int
@ % foo(void)
@ % {
@ % 	return (1 ? 0 : unused(0x12345678));
@ % }
@ 
@ % w.c:10:25: warning: implicit conversion from 'int' to 'short' changes value from 305419896 to 22136 [-Wconstant-conversion]
@ %         return (1 ? 0 : unused(0x12345678));
@ %                         ~~~~~~ ^~~~~~~~~~
@ % 1 warning generated.
@ 
@ This shows another bug: the hex constants are obfuscated in the error message
@ by printing them in decimal.
@ 
@ Bruce

The fix is the obvious one of casting args to __bswapNN_gen() to size NN,
so that 0x12345678 becomes 0x5678.

>>>  While here, add extra parentheses around the __bswap16_gen() macro
>>>  expansion, to prevent unexpected side effects.
>>
>> These parentheses were intentionally left out for clarity.
>> __bswap16_gen() is not user-serving, and all places in endian.h that
>> use it supply enough parentheses.
>
> Let's hope so, until a follow up commit breaks it again.

There are also a couple of '(x)'s that don't need parentheses.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120325150057.H933>