From owner-svn-src-all@FreeBSD.ORG Mon Jun 21 17:22:45 2010 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 316AA1065675; Mon, 21 Jun 2010 17:22:45 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail02.syd.optusnet.com.au (mail02.syd.optusnet.com.au [211.29.132.183]) by mx1.freebsd.org (Postfix) with ESMTP id 3044F8FC14; Mon, 21 Jun 2010 17:22:43 +0000 (UTC) Received: from c122-106-145-229.carlnfd1.nsw.optusnet.com.au (c122-106-145-229.carlnfd1.nsw.optusnet.com.au [122.106.145.229]) by mail02.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id o5LHMeGw009180 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 22 Jun 2010 03:22:41 +1000 Date: Tue, 22 Jun 2010 03:22:40 +1000 (EST) From: Bruce Evans X-X-Sender: bde@delplex.bde.org To: Marcel Moolenaar In-Reply-To: <201006200034.o5K0Y6xl041024@svn.freebsd.org> Message-ID: <20100622024652.C43995@delplex.bde.org> References: <201006200034.o5K0Y6xl041024@svn.freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org Subject: Re: svn commit: r209358 - head/cddl/contrib/opensolaris/lib/libdtrace/common X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jun 2010 17:22:45 -0000 On Sun, 20 Jun 2010, Marcel Moolenaar wrote: > Log: > Unbreak platforms with char unsigned by default. Oddly enough, GCC isn't > satisfied with a simple cast to int in the check against EOF, so the fix > is a bit involved by actually having to go through a temporary variable. Perhaps that is because gcc can see that the cast has no effect, so the comparison can never be true if `c' is an unsigned char (unless unsigned char has the same number of bits as signed int), but it cannot see that the conversion to the temporary variable has the same null effect. > Modified: head/cddl/contrib/opensolaris/lib/libdtrace/common/dt_lex.l > ============================================================================== > --- head/cddl/contrib/opensolaris/lib/libdtrace/common/dt_lex.l Sat Jun 19 22:13:40 2010 (r209357) > +++ head/cddl/contrib/opensolaris/lib/libdtrace/common/dt_lex.l Sun Jun 20 00:34:06 2010 (r209358) > @@ -67,8 +67,12 @@ > * for all subsequent invocations, which is the effect desired. > */ > #undef unput > -#define unput(c) \ > - if (c != EOF) yyunput( c, yytext_ptr ) > +#define unput(c) \ > + do { \ > + int _c = c; \ > + if (_c != EOF) \ > + yyunput(_c, yytext_ptr); \ > + } while(0) > #endif > > static int id_or_type(const char *); > This remains broken, especially on platforms with chars unsigned. No one should try to unput EOF, so c should never equal EOF, but if c is a negative character it may equal EOF and thus On platforms with chars unsigned (except exotic ones where chars have the same size as ints), if c is a char then it is >= 0 and thus cannot equal EOF (which is < 0). Since the platform is non-exotic, (int)c and "int _c = c;" equal c and are this still >= 0 and thus cannot equal EOF. Thus the comparison with EOF has no effect, and c is always unput. On platforms with chars signed, some chars may equal EOF. It is an error to unput almost any value held in a variable of type char, since its value might equal EOF and thus be rejected by unput(), but unput() should be able to handle any character in the character set. This problem is handled by ungetc() by always converting the value to unsigned char. Thus the value can never equal EOF, and the character set is effectively represented by unsigned char's, not the plain chars that stdio returns in some other interfaces (but not getc()). There seems to be no reason to break the warning about this instead of using the same approach as stdio. This depends on yyunput() not having similar bugs (it must take an arg of type int and convert to an unsigned cgar like ungetc()): #define unput(c) yyunput((unsigned char)(c), yytext_ptr) This also fixes the missing parantheses for 'c' and some style bugs. Bruce