From: Peter Pentchev
To: brian o'shea
Cc: Raymond Wiker, les@safety.net, Rob, "hackers@FreeBSD.ORG"
Date: Sat, 11 Aug 2001 22:19:24 +0300
Subject: Re: the =+ operator
Message-ID: <20010811221924.C1848@ringworld.oblivion.bg>
In-Reply-To: <20010810154630.A27553@netapp.com>

On Fri, Aug 10, 2001 at 03:46:30PM -0700, brian o'shea wrote:
> On Fri, Aug 10, 2001 at 08:15:06PM +0200, Raymond Wiker wrote:
> >
> > This is actually wrong - the += and -= operators were
> > originally written as =+ and =-. This is obviously ambiguous, given
> > the fact that whitespace is ignored.
>
> Why does this have to be ambiguous?

Your largest-possible-token sounds like a good and quite possibly correct
idea, but your example is mightily wrong :)

> Consider this:
>
>     int o = 2;
>     int *p = &o;
>     int q = 8;
>     int r;
>
>     r = q/*p /* comment */;
>
>     printf("r == %d\n", r);
>
>
> Is this equivalent to the following:
>
>     r = q / *p;
>
>
> Or is it equivalent to this:
>
>     r = q /* p /* comment */ ;
>
>
> C disambiguates between these two possible interpretations by matching
> the largest possible token.

I don't really think so.  C disambiguates between these two possible
interpretations by doing at least two passes over the code.  Comments
are weeded out by the C preprocessor, which jumps in at the first /*,
and replaces everything up to and including the first */ with whitespace.
The compiler does not have to deal with the */ ambiguity at all, since
it never sees the /*.

> Thus, it is taken to be equivalent to:
>
>     r = q;

No, it's not.  /* cannot be used in this way; it *is* taken to mean
a comment by the preprocessor, and it produces weird mistakes.
See for yourself: try to compile the following program:

int main(void)
{
    int p, *q;

    p = 2;
    q = &p;
    p = p /* q; /* comment */
}

At least here, this is what happens:

[roam@ringworld:v6 ~/c/misc/foo]$ cc -o foo10 foo10.c
foo10.c: In function `main':
foo10.c:8: syntax error before `}'
[roam@ringworld:v6 ~/c/misc/foo]$

That is, the preprocessor has taken the whole of /* q; /* comment */
to be a comment, and has turned the program into:

int main(void)
{
    int p, *q;

    p = 2;
    q = &p;
    p = p
}

..which is obviously wrong - no semicolon after the p = p statement.

> In other words, these two lines are not equivalent:
>
>     r = q/*p /* comment */;
>
>     r = q/ *p /* comment */;
>
>
> So, the =+ operator could be interpreted correctly as long as it is the
> largest possible token.  It does leave more of an opportunity for human
> misinterpretation, while my example is less likely to be seen.

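To make the largest-possible-token rule concrete for =+ itself, here is
a rough sketch in C.  It is not taken from any real compiler: the
operator tables old_ops and new_ops and the functions longest_op() and
tokenize() are names made up for this illustration, and it only knows
about identifiers, numbers and the handful of operators listed.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical operator tables, for illustration only. */
const char *const old_ops[] = { "=+", "=-", "=", "+", "-" }; /* early C spelling */
const char *const new_ops[] = { "+=", "-=", "=", "+", "-" }; /* current spelling */

/*
 * Return the length of the longest operator in ops[] that matches at
 * the start of s, or 0 if none of them matches.
 */
size_t longest_op(const char *s, const char *const *ops, size_t nops)
{
    size_t best = 0;
    size_t i;

    for (i = 0; i < nops; i++) {
        size_t len = strlen(ops[i]);

        if (len > best && strncmp(s, ops[i], len) == 0)
            best = len;
    }
    return best;
}

/*
 * Split src into tokens and write them to out, separated by single
 * spaces.  Returns the number of tokens, or -1 if out is too small or
 * an unknown character is found.
 */
int tokenize(const char *src, const char *const *ops, size_t nops,
    char *out, size_t outsz)
{
    size_t used = 0;
    int count = 0;

    if (outsz == 0)
        return -1;
    out[0] = '\0';
    while (*src != '\0') {
        size_t len;

        if (isspace((unsigned char)*src)) {
            src++;
            continue;
        }
        if (isalnum((unsigned char)*src) || *src == '_') {
            /* identifier or number: take the whole run */
            len = 1;
            while (isalnum((unsigned char)src[len]) || src[len] == '_')
                len++;
        } else {
            /* operator: always take the longest one that fits */
            len = longest_op(src, ops, nops);
            if (len == 0)
                return -1;
        }
        /* need room for an optional separator, the token and the NUL */
        if (used + (count > 0 ? 1 : 0) + len + 1 > outsz)
            return -1;
        if (count > 0)
            out[used++] = ' ';
        memcpy(out + used, src, len);
        used += len;
        out[used] = '\0';
        src += len;
        count++;
    }
    return count;
}
```

Calling tokenize("x=+1", old_ops, 5, buf, sizeof buf) leaves "x =+ 1" in
buf (three tokens), while the same input with new_ops gives "x = + 1"
(four tokens).  The input is identical; only the set of known tokens
differs, which is the point of the rule.
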
Once again, yes, the largest-possible-token is a good idea, and possibly
implemented that way in most lex or flex-based parsers, but you picked
a poor example to illustrate it :)

G'luck,
Peter

-- 
If I were you, who would be reading this sentence?