Message-Id: <201605220129.u4M1Topw010808@elf.torek.net>
From: Chris Torek
To: Bruce Evans
cc: Conrad Meyer, Konstantin Belousov, src-committers@freebsd.org,
    svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r300332 - in head/sys: amd64/amd64 i386/i386
In-reply-to: Your message of "Sun, 22 May 2016 10:58:05 +1000."
    <20160522101943.U1190@besplex.bde.org>
Date: Sat, 21 May 2016 18:29:50 -0700

>> Can you explain a little bit about the badly behaved ordering of
>> unsigned integers?  I am not familiar with that.
>
>The strongest ordering properties for real numbers depend on the existence
>of negative numbers (and zero).
>E.g., x >= y if and only if x - y >= 0.
>To apply that, you need the more basic property that the ordering keeps
>negative numbers separate from strictly positive numbers and zero.
>
>Without negative numbers, we can hope for weaker properties.  One is
>that x1 <= x2 implies x1 + y <= x2 + y.  This is true for C signed and
>unsigned integers if there is no overflow, but for the unsigned case
>overflow is often considered normal and is technically not described
>as overflow.

On the other hand, since most C compilers don't bother to trap
signed integer overflow, but some can, signed integers may behave
just as badly. :-)

Overall I personally find the rules simpler for unsigned integers
(occasionally surprising, but predictable and provable behavior in
the mod-2^k ring) than for signed integers (occasionally surprising,
possible trap on overflow, possible nonsense on overflow,
unpredictable and hence unprovable in general).

The ANSI C folks in 1989 made a mess with the "value preserving"
rules where unsigned integers become signed integers if the
widened type is capable of representing all the values of the
narrower type, but become wider unsigned integers if the widened
type is not capable of representing all these values.  Even
restricting operation to two's complement, 8-bit-byte,
conventional systems, this means we have several realistic cases:

 * 16-bit int, 32-bit long, 64-bit long long ("I16L32"):
   unsigned char widens to signed int, but unsigned short
   widens to unsigned int.  (This model does not run BSD
   but is still used in some embedded systems.)

 * 32-bit int, 32-bit long, 64-bit long long ("IL32"):
   unsigned char and unsigned short widen to signed int;
   unsigned int stays unsigned.  Mixing unsigned int or
   unsigned long with signed long long gives you signed
   behavior.

 * 32-bit int, 64-bit long, 64-bit long long ("I32L64"):
   mostly behaves like IL32, but mixing unsigned long with
   signed long long gives you unsigned behavior.
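
The cases listed above can be probed on a given compiler with a small
sketch like the following.  It is only an illustration: the macro
CONVERTS_TO_SIGNED and the function names are hypothetical helpers made
up for this example, not anything from the commit under discussion.
The trick is that (expr) * 0 - 1 has the converted type of expr, so it
is -1 when that type is signed and a very large value when it is
unsigned.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper: nonzero if `expr`, after the integer promotions
 * and usual arithmetic conversions, ends up with a signed type.
 * (expr) * 0 is zero in the converted type; subtracting 1 yields -1 for
 * a signed type and the maximum value for an unsigned type.
 */
#define CONVERTS_TO_SIGNED(expr) (((expr) * 0 - 1) < 0)

/* unsigned char widens to signed int whenever int can hold UCHAR_MAX. */
int uchar_promotes_to_signed(void)
{
    unsigned char uc = 1;
    return CONVERTS_TO_SIGNED(uc);
}

/* unsigned short widens to signed int on 32-bit-int systems,
 * but to unsigned int when int is only 16 bits. */
int ushort_promotes_to_signed(void)
{
    unsigned short us = 1;
    return CONVERTS_TO_SIGNED(us);
}

/* unsigned int mixed with signed long long. */
int uint_plus_llong_is_signed(void)
{
    unsigned int ui = 1;
    long long ll = 1;
    return CONVERTS_TO_SIGNED(ui + ll);
}

/* unsigned long mixed with signed long long: signed with 32-bit long,
 * unsigned with 64-bit long. */
int ulong_plus_llong_is_signed(void)
{
    unsigned long ul = 1;
    long long ll = 1;
    return CONVERTS_TO_SIGNED(ul + ll);
}

/* Unsigned arithmetic wraps in the mod-2^k ring: 0 - 1 is UINT_MAX. */
int unsigned_wraps_mod_2k(void)
{
    unsigned int zero = 0;
    return zero - 1u == UINT_MAX;
}
```

Which answers come back depends on the type sizes of the platform, so
the sketch makes no claim about any one result; each function simply
reports what the compiler in use actually does.  Some compilers warn
that an unsigned comparison against zero is always false, which is
exactly the case the macro is detecting.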

The byte length of pointers may be any of these, and the short-hand
notation names usually have a "P" in there, e.g., ILP32 means all
are 32 bit, I32LP64 means 32-bit int but 64-bit long and pointers,
and so on.  Exotic machines with variable-length or variable-format
pointers (depending on the data type) are rarer now, although some
still make code pointers much longer than data pointers.  (Some
bypass the problem by using data pointers to descriptors instead
of raw code pointers.  That is, for void (*fp)(), fp need not point
directly to the code to run: it can point instead to a data
descriptor that may include both a raw code address and some sort
of context, for instance.)

---

Ultimately, assuming "i" and "limit" are (a) both signed, or have
the same type except that "limit" is unsigned, and (b) "limit" is
sane (is nonnegative), using:

    if (i >= 0 && i < limit)

and:

    if ((unsigned T)i < (unsigned T)limit)

do the same thing.  But the second form obviously requires knowing
what type-name T to insert, and knowing something about "limit"
(that it is nonnegative).  It used to generate significantly better
code to write just the one unsigned-cast test, but these days it's
better to just spell out the ">= 0" and let the compiler optimize
when possible.

Chris
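
The equivalence of the two range tests described above, and the way it
breaks when "limit" is negative, can be checked with a minimal sketch.
It assumes T is int; the function names are hypothetical and chosen
only for this illustration.

```c
#include <assert.h>
#include <limits.h>

/* Spelled-out form: reject a negative index explicitly. */
int in_range_spelled_out(int i, int limit)
{
    return i >= 0 && i < limit;
}

/*
 * Single-compare form: converting a negative i to unsigned int gives a
 * value above INT_MAX, so it fails the test against any nonnegative
 * limit.  If limit itself is negative it also converts to a huge value,
 * and the test no longer means what was intended.
 */
int in_range_unsigned_cast(int i, int limit)
{
    return (unsigned int)i < (unsigned int)limit;
}
```

With a nonnegative limit the two functions agree for every i, for
instance for (3, 10), (-1, 10) and (10, 10).  With i = 0 and limit = -1
the spelled-out form returns 0 while the cast form returns 1, which is
the reason condition (b) is needed.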