Date: Sat, 14 Dec 2002 21:02:40 +1100 (EST)
From: Bruce Evans
To: Ruslan Ermilov
Cc: "Andrey A. Chernov"
Subject: Re: New AWK bug with collating
In-Reply-To: <20021213150942.GE86638@sunbay.com>
Message-ID: <20021214202522.L5768-100000@gamplex.bde.org>
Sender: owner-freebsd-current@FreeBSD.ORG

On Fri, 13 Dec 2002, Ruslan Ermilov wrote:

> On Fri, Dec 13, 2002 at 04:41:06PM +0300, Andrey A. Chernov wrote:
> > On Fri, Dec 13, 2002 at 14:32:40 +0200, Ruslan Ermilov wrote:
> > > Pardon my ignorance here, but the following fragment
> > > returns -1, doesn't it?
> > >
> > > #include <stdio.h>
> > > void
> > > main(void)
> > > {
> > > 	int i;
> > >
> > > 	i = (unsigned char)1 - (unsigned char)2;
> > > 	printf("%d\n", i);
> > > }
> >
> > It depends very much on the compiler, i.e. whether it implements
> > "value preserving" or "unsigned preserving" for 'char' type
> > conversions. Or ANSI C vs. common C mode. Better be safe for both.
> >
> > Read 6.10.1.1 section here:
> > http://wwwrsphysse.anu.edu.au/doc/DUhelp/AQTLTBTE/DOCU_067.HTM

For ANSI C, the result of the subtraction depends only on the width of
unsigned char.
If unsigned char has the same width as int, then the result is UINT_MAX;
otherwise the result is -1.  This is an example of the brokenness of
"value preserving" conversions -- the value is as far as possible from
being preserved.

Then assignment to "int i" may cause overflow.  There is no overflow if
the RHS is -1.  If the RHS is UINT_MAX, then the result of the assignment
is implementation-defined.  The value is preserved even less than
before.  I think it is usually -0 on 1's complement machines.

So ache's change is basically a fix for 1's complement machines.  I don't
see much point in it, since we assume 2's complement in most places in
libc/string (except strcoll() :-).  E.g., memcmp() just subtracts the
unsigned chars and assumes that all the conversions turn out like they
do on 2's complement machines.  We actually use an assembler version of
memcmp on most arches but...

> This is handled by the -traditional flag of gcc(1):
>
> : `-traditional'
> :
> :      Attempt to support some aspects of traditional C compilers.
> :      Specifically:
> :
> [...]
> :
> :         * Integer types `unsigned short' and `unsigned char' promote to
> :           `unsigned int'.
>
> With -traditional, the code I quoted still produces -1.

It produces overflow which normally gives -1 on 2's complement machines.

> In any case, this section doesn't apply to this case because
> no conversion described in section 6.10 is ever done here,
> since both operands are of the same type, "unsigned char".

Yes it does.  The common type (for arithmetic operators like subtraction)
is never smaller than int.  Both of the unsigned char operands get
converted to int in the simplest case where unsigned char is smaller than
int.  See 6.10.1 (5) and 6.10.1.1 about "integral promotions".

Bruce