From owner-svn-src-all@FreeBSD.ORG Sun Mar 25 20:27:58 2012 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4FE7E106566B; Sun, 25 Mar 2012 20:27:58 +0000 (UTC) (envelope-from tijl@freebsd.org) Received: from mailrelay009.isp.belgacom.be (mailrelay009.isp.belgacom.be [195.238.6.176]) by mx1.freebsd.org (Postfix) with ESMTP id 9ECB48FC1A; Sun, 25 Mar 2012 20:27:56 +0000 (UTC) X-Belgacom-Dynamic: yes X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AjgFAL1+b09bsVzs/2dsb2JhbABCtQ2DF4EIggkBAQQBJy8jEAsOCi45HgYBiBcJt3eKYoZGBIgkiViUEYJpgVI Received: from 236.92-177-91.adsl-dyn.isp.belgacom.be (HELO kalimero.tijl.coosemans.org) ([91.177.92.236]) by relay.skynet.be with ESMTP; 25 Mar 2012 22:26:45 +0200 Received: from kalimero.tijl.coosemans.org (kalimero.tijl.coosemans.org [127.0.0.1]) by kalimero.tijl.coosemans.org (8.14.5/8.14.5) with ESMTP id q2PKQhXa004352; Sun, 25 Mar 2012 22:26:44 +0200 (CEST) (envelope-from tijl@freebsd.org) From: Tijl Coosemans To: Bruce Evans , Dimitry Andric Date: Sun, 25 Mar 2012 22:26:36 +0200 User-Agent: KMail/1.13.7 (FreeBSD/10.0-CURRENT; KDE/4.7.3; i386; ; ) References: <201203241007.q2OA7MtS024789@svn.freebsd.org> <4F6E3A7D.9020709@FreeBSD.org> <20120325150057.H933@besplex.bde.org> In-Reply-To: <20120325150057.H933@besplex.bde.org> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart2186372.ztlCrlZhdp"; protocol="application/pgp-signature"; micalg=pgp-sha256 Content-Transfer-Encoding: 7bit Message-Id: <201203252226.41883.tijl@freebsd.org> Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, John Baldwin Subject: Re: svn commit: r233419 - head/sys/x86/include X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 25 Mar 2012 20:27:58 -0000 --nextPart2186372.ztlCrlZhdp Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable On Sunday 25 March 2012 07:31:29 Bruce Evans wrote: > On Sat, 24 Mar 2012, Dimitry Andric wrote: >> On 2012-03-24 18:48, Bruce Evans wrote: >>> On Sat, 24 Mar 2012, Dimitry Andric wrote: >>>> Log: >>>> Fix the following clang warning in sys/dev/dcons/dcons.c, caused by t= he >>>> recent changes in sys/x86/include/endian.h: >>>> >>>> sys/dev/dcons/dcons.c:190:15: error: implicit conversion from '__ui= nt32_t' (aka 'unsigned int') to '__uint16_t' (aka 'unsigned short') changes= value from 1684238190 to 28526 [-Werror,-Wconstant-conversion] >>>> >>>> buf->magic =3D ntohl(DCONS_MAGIC); >>>> ^~~~~~~~~~~~~~~~~~ >>>> sys/sys/param.h:306:18: note: expanded from: >>>> #define ntohl(x) __ntohl(x) >>>> ^ >>>> ./x86/endian.h:128:20: note: expanded from: >>>> #define __ntohl(x) __bswap32(x) >>>> ^ >>>> ./x86/endian.h:78:20: note: expanded from: >>>> __bswap32_gen((__uint32_t)(x)) : __bswap32_var(x)) >>>> ^ >>>> ./x86/endian.h:68:26: note: expanded from: >>>> (((__uint32_t)__bswap16(x) << 16) | __bswap16((x) >> 16)) >>>> ^ >>>> ./x86/endian.h:75:53: note: expanded from: >>>> __bswap16_gen((__uint16_t)(x)) : __bswap16_var(x))) >>>> ~~~~~~~~~~~~~ ^ >>> >>> This warning was discussed before things were committed. >> >> Then why was it committed without fixing the warning first? >=20 > Because testing (by tijl) showed that there was no warning. To clarify, I didn't do a full clang build, just some test programs. >>> As documented there, it is important >>> for optimizations that __bswap64_gen() does do the constancy check >>> recursively. This was discussed before things were committed, and >>> again when jhb siggested breaking it similarly to this commit to avoid >>> a problem on ia64. >> >> Isn't it much easier, and less obtuse, to just jam in two bswaps for the >> i386? I never saw the reason for making the bswap macros *more* >> complicated instead of less. If you obfuscate them enough, no compiler >> will be able to optimize properly... :) >=20 > No, since the divison of 64 bits into 2 times 32-bits is very easy. > For constants it is much simpler than what it replaced, and the same > method is now used for variables. Hard-coding 2 32-bit bswaps for > i386 would give a different method for i386, and if done in asm would > inhibit the compiler combining these bswaps. Although we do have the > corresponding hard asm for bswap32() (gcc needs it but clang doesn't; > clang needs help only for bswap64()). Are you sure about clang not needing the asm? I cannot get it to turn the generic macro into a bswap instruction. GCC 4.7 does seem to be able to do that at -O2 and higher (if I add extra casts). >>>> Fix it by calling __bswap16_gen() from __bswap32_gen(), and similarly, >>>> __bswap32_gen() from __bswap64_gen(). >>> >>> This breaks it. DIring development, I had a correct fix involving >>> casting down the arg. >> >> If it works correctly with clang, please post it. I've been experimenting a bit with various compilers over the weekend. The following is an incomplete/untested patch, but feel free to comment on it. It moves the __bswapNN definitions to sys/_endian.h. Only asm versions would stay in machine/endian.h. The patch also adds support for __builtin_bswap32/64 to cdefs.h. Only x86/endian.h has been converted here and not all places that include machine/endian.h have been checked yet. diff --git a/include/arpa/inet.h b/include/arpa/inet.h index 9713e7f..2a296ca 100644 =2D-- a/include/arpa/inet.h +++ b/include/arpa/inet.h @@ -61,10 +61,9 @@ /* External definitions for functions in inet(3). */ =20 #include +#include #include =20 =2D/* Required for byteorder(3) functions. */ =2D#include =20 #define INET_ADDRSTRLEN 16 #define INET6_ADDRSTRLEN 46 diff --git a/sys/sys/cdefs.h b/sys/sys/cdefs.h index 076842d..0ddab2d 100644 =2D-- a/sys/sys/cdefs.h +++ b/sys/sys/cdefs.h @@ -44,6 +44,16 @@ #define __END_DECLS #endif =20 +#ifndef __has_feature +#define __has_feature(x) 0 +#endif +#ifndef __has_include +#define __has_include(x) 0 +#endif +#ifndef __has_builtin +#define __has_builtin(x) 0 +#endif + /* * This code has been put in place to help reduce the addition of * compiler specific defines in FreeBSD code. It helps to aid in @@ -293,6 +303,14 @@ #define __nonnull(x) #endif =20 +#if __has_builtin(__builtin_bswap32) || __GNUC_PREREQ__(4, 3) +#define __bswap32(x) __builtin_bswap32(x) +#define __bswap64(x) __builtin_bswap64(x) +#else +#undef __bswap32 +#undef __bswap64 +#endif + /* XXX: should use `#if __STDC_VERSION__ < 199901'. */ #if !__GNUC_PREREQ__(2, 7) && !defined(__INTEL_COMPILER) #define __func__ NULL @@ -647,16 +665,6 @@ #endif #endif =20 =2D#ifndef __has_feature =2D#define __has_feature(x) 0 =2D#endif =2D#ifndef __has_include =2D#define __has_include(x) 0 =2D#endif =2D#ifndef __has_builtin =2D#define __has_builtin(x) 0 =2D#endif =2D #if defined(__mips) || defined(__powerpc64__) #define __NO_TLS 1 #endif diff --git a/sys/sys/endian.h b/sys/sys/endian.h index d50110c..6c2aa54 100644 =2D-- a/sys/sys/endian.h +++ b/sys/sys/endian.h @@ -30,8 +30,8 @@ #define _SYS_ENDIAN_H_ =20 #include +#include #include =2D#include =20 #ifndef _UINT8_T_DECLARED typedef __uint8_t uint8_t; diff --git a/sys/sys/_endian.h b/sys/sys/_endian.h new file mode 100644 index 0000000..43973d2 =2D-- /dev/null +++ b/sys/sys/_endian.h @@ -0,0 +1,101 @@ +/*- + * Copyright (c) 1987, 1991 Regents of the University of California. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 4. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURP= OSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENT= IAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STR= ICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY W= AY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * @(#)endian.h 7.8 (Berkeley) 4/3/91 + * $FreeBSD$ + */ + +#ifndef _SYS__ENDIAN_H_ +#define _SYS__ENDIAN_H_ + +#include +#include + +#include + +#ifdef __GNUCLIKE_BUILTIN_CONSTANT_P +#define __isconst(x) __builtin_constant_p(x) +#else +#define __isconst(x) 0 +#endif isconst(x) should probably be defined in cdefs.h. + +#ifndef __bswap16 +#define __bswap16_gen(x) \ + ((__uint16_t)((__uint16_t) \ + ((__uint16_t)(x) << 8) | (__uint16_t)(x) >> 8)) MI version of __bswap16. The first cast is the return type. The second helps all versions of gcc that I've tested to optimise __bswap32 and __bswap64 better (probably because the expression is promoted to int and the cast tells the compiler to discard the upper 16 bits). The third and fourth simply cast the macro argument. It makes the macro more function-like. All macros are like this now. +static __inline __uint16_t +__bswap16(__uint16_t __x) +{ +#ifdef __bswap16_asm + __bswap16_asm(__x); + return (__x); +#else + return (__bswap16_gen(__x)); +#endif +} If machine/endian.h defines __bswap16_asm, use it, otherwise use the MI version. machine/endian.h should not define __bswap16_asm if the compiler can optimise the C expression correctly. +#define __bswap16(x) \ + ((__uint16_t)(__isconst(x) ? __bswap16_gen(x) : __bswap16(x))) Allow using __bswap16 in static initialisers. +#endif /* ! __bswap16 */ + +#ifndef __bswap32 If __bswap32 is defined in cdefs.h, don't do anything here. +#define __bswap32_gen(x) \ + ((__uint32_t)((__uint32_t) \ + ((__uint32_t)__bswap16(x) << 16) | __bswap16((__uint32_t)(x) >> 16))) __bswap32 implemented using __bswap16. The second cast doesn't seem to be necessary here. I kept it just to be sure and to keep the definition similar to __bswap16_gen. +static __inline __uint32_t +__bswap32(__uint32_t __x) +{ +#ifdef __bswap32_asm + __bswap32_asm(__x); + return (__x); +#else + return (__bswap32_gen(__x)); +#endif +} Same as __bswap16. +#define __bswap32(x) \ + ((__uint32_t)(__isconst(x) ? __bswap32_gen(x) : __bswap32(x))) Same as __bswap16. +#endif /* ! __bswap32 */ + +#ifndef __bswap64 +#define __bswap64_gen(x) \ + ((__uint64_t)((__uint64_t) \ + ((__uint64_t)__bswap32(x) << 32) | __bswap32((__uint64_t)(x) >> 32))) +static __inline __uint64_t +__bswap64(__uint64_t __x) +{ +#ifdef __bswap64_asm + __bswap64_asm(__x); + return (__x); +#else + return (__bswap64_gen(__x)); +#endif +} +#define __bswap64(x) \ + ((__uint64_t)(__isconst(x) ? __bswap64_gen(x) : __bswap64(x))) +#endif /* ! __bswap64 */ Same as __bswap32. +#endif /* !_SYS__ENDIAN_H_ */ diff --git a/sys/x86/include/endian.h b/sys/x86/include/endian.h index 9059587..7827231 100644 =2D-- a/sys/x86/include/endian.h +++ b/sys/x86/include/endian.h @@ -63,65 +63,12 @@ #define BYTE_ORDER _BYTE_ORDER #endif =20 =2D#define __bswap16_gen(x) ((__uint16_t)((x) << 8 | (x) >> 8)) =2D#define __bswap32_gen(x) \ =2D (((__uint32_t)__bswap16_gen(x) << 16) | __bswap16_gen((x) >> 16)) =2D#define __bswap64_gen(x) \ =2D (((__uint64_t)__bswap32_gen(x) << 32) | __bswap32_gen((x) >> 32)) =2D =2D#ifdef __GNUCLIKE_BUILTIN_CONSTANT_P =2D#define __bswap16(x) \ =2D ((__uint16_t)(__builtin_constant_p(x) ? \ =2D __bswap16_gen((__uint16_t)(x)) : __bswap16_var(x))) =2D#define __bswap32(x) \ =2D (__builtin_constant_p(x) ? \ =2D __bswap32_gen((__uint32_t)(x)) : __bswap32_var(x)) =2D#define __bswap64(x) \ =2D (__builtin_constant_p(x) ? \ =2D __bswap64_gen((__uint64_t)(x)) : __bswap64_var(x)) =2D#else =2D/* XXX these are broken for use in static initializers. */ =2D#define __bswap16(x) __bswap16_var(x) =2D#define __bswap32(x) __bswap32_var(x) =2D#define __bswap64(x) __bswap64_var(x) =2D#endif =2D =2D/* These are defined as functions to avoid multiple evaluation of x. */ =2D =2Dstatic __inline __uint16_t =2D__bswap16_var(__uint16_t _x) =2D{ =2D =2D return (__bswap16_gen(_x)); =2D} =2D =2Dstatic __inline __uint32_t =2D__bswap32_var(__uint32_t _x) =2D{ =2D #ifdef __GNUCLIKE_ASM =2D __asm("bswap %0" : "+r" (_x)); =2D return (_x); =2D#else =2D return (__bswap32_gen(_x)); +#define __bswap32_asm(x) __asm("bswapl %0" : "+r" (x)) +#ifdef __amd64__ +#define __bswap64_asm(x) __asm("bswapq %0" : "+r" (x)) #endif Remove everything about __bswapNN and just keep _asm versions. Because clang and gcc>=3D4.3 have __builtin_bswapNN these are only used by base system gcc and older versions of gcc. =2D} =2D =2Dstatic __inline __uint64_t =2D__bswap64_var(__uint64_t _x) =2D{ =2D =2D#if defined(__amd64__) && defined(__GNUCLIKE_ASM) =2D __asm("bswap %0" : "+r" (_x)); =2D return (_x); =2D#else =2D /* =2D * It is important for the optimizations that the following is not =2D * really generic, but expands to 2 __bswap32_var()'s. =2D */ =2D return (__bswap64_gen(_x)); #endif =2D} =20 #define __htonl(x) __bswap32(x) #define __htons(x) __bswap16(x) I tried to make the implementation as simple as I could, but obviously it is still complicated, so suggestions to improve this are welcome. --nextPart2186372.ztlCrlZhdp Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (FreeBSD) iF4EABEIAAYFAk9vf4EACgkQfoCS2CCgtivp1AD7BMuEE3Sjy6yT/iimR6kSqzPH PNsVxzmRYCkPdCNGVKMA+QHHsmjBRmsqvysSQUGy3ZTOz66StaEbVZz3NROKCqhQ =wJYj -----END PGP SIGNATURE----- --nextPart2186372.ztlCrlZhdp--