From owner-svn-src-all@FreeBSD.ORG Wed Feb 25 21:10:05 2015 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 191ECE4E; Wed, 25 Feb 2015 21:10:05 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0295D9A5; Wed, 25 Feb 2015 21:10:05 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.9/8.14.9) with ESMTP id t1PLA4Ph030716; Wed, 25 Feb 2015 21:10:04 GMT (envelope-from hselasky@FreeBSD.org) Received: (from hselasky@localhost) by svn.freebsd.org (8.14.9/8.14.9/Submit) id t1PLA4q8030712; Wed, 25 Feb 2015 21:10:04 GMT (envelope-from hselasky@FreeBSD.org) Message-Id: <201502252110.t1PLA4q8030712@svn.freebsd.org> X-Authentication-Warning: svn.freebsd.org: hselasky set sender to hselasky@FreeBSD.org using -f From: Hans Petter Selasky Date: Wed, 25 Feb 2015 21:10:04 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r279297 - head/usr.bin/unifdef X-SVN-Group: head MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Feb 2015 21:10:05 -0000 Author: hselasky Date: Wed Feb 25 21:10:03 2015 New Revision: 279297 URL: https://svnweb.freebsd.org/changeset/base/279297 Log: Update to upstream version 2.10 The most notable new feature is support for definition files. Obtained from: http://dotat.at/prog/unifdef MFC after: 1 week Modified: head/usr.bin/unifdef/unifdef.1 head/usr.bin/unifdef/unifdef.c head/usr.bin/unifdef/unifdef.h head/usr.bin/unifdef/unifdefall.sh Modified: head/usr.bin/unifdef/unifdef.1 ============================================================================== --- head/usr.bin/unifdef/unifdef.1 Wed Feb 25 20:47:25 2015 (r279296) +++ head/usr.bin/unifdef/unifdef.1 Wed Feb 25 21:10:03 2015 (r279297) @@ -31,7 +31,7 @@ .\" .\" $FreeBSD$ .\" -.Dd February 21, 2012 +.Dd January 7, 2014 .Dt UNIFDEF 1 PRM .Os " " .Sh NAME @@ -44,6 +44,7 @@ .Op Fl [i]D Ns Ar sym Ns Op = Ns Ar val .Op Fl [i]U Ns Ar sym .Ar ... +.Op Fl f Ar defile .Op Fl x Bro Ar 012 Brc .Op Fl M Ar backext .Op Fl o Ar outfile @@ -70,11 +71,17 @@ utility acts on .Ic #elif , #else , and .Ic #endif -lines. -A directive is only processed -if the symbols specified on the command line are sufficient to allow -.Nm -to get a definite value for its control expression. +lines, +using macros specified in +.Fl D +and +.Fl U +command line options or in +.Fl f +definitions files. +A directive is processed +if the macro specifications are sufficient to provide +a definite value for its control expression. If the result is false, the directive and the following lines under its control are removed. If the result is true, @@ -84,7 +91,7 @@ An or .Ic #ifndef directive is passed through unchanged -if its controlling symbol is not specified on the command line. +if its controlling macro is not specified. Any .Ic #if or @@ -110,7 +117,7 @@ and .Ic #elif lines: integer constants, -integer values of symbols defined on the command line, +integer values of macros defined on the command line, the .Fn defined operator, @@ -131,16 +138,42 @@ if either operand of .Ic || is definitely true then the result is true. .Pp -In most cases, the +When evaluating an expression, .Nm -utility does not distinguish between object-like macros -(without arguments) and function-like arguments (with arguments). -If a macro is not explicitly defined, or is defined with the +does not expand macros first. +The value of a macro must be a simple number, +not an expression. +A limited form of indirection is allowed, +where one macro's value is the name of another. +.Pp +In most cases, +.Nm +does not distinguish between object-like macros +(without arguments) and function-like macros (with arguments). +A function-like macro invocation can appear in +.Ic #if +and +.Ic #elif +control expressions. +If the macro is not explicitly defined, +or is defined with the .Fl D -flag on the command-line, its arguments are ignored. +flag on the command-line, +or with +.Ic #define +in a +.Fl f +definitions file, +its arguments are ignored. If a macro is explicitly undefined on the command line with the .Fl U -flag, it may not have any arguments since this leads to a syntax error. +flag, +or with +.Ic #undef +in a +.Fl f +definitions file, +it may not have any arguments since this leads to a syntax error. .Pp The .Nm @@ -161,7 +194,7 @@ It uses .Nm Fl s and .Nm cpp Fl dM -to get lists of all the controlling symbols +to get lists of all the controlling macros and their definitions (or lack thereof), then invokes .Nm @@ -169,19 +202,15 @@ with appropriate arguments to process th .Sh OPTIONS .Bl -tag -width indent -compact .It Fl D Ns Ar sym Ns = Ns Ar val -Specify that a symbol is defined to a given value -which is used when evaluating -.Ic #if -and -.Ic #elif -control expressions. +Specify that a macro is defined to a given value. .Pp .It Fl D Ns Ar sym -Specify that a symbol is defined to the value 1. +Specify that a macro is defined to the value 1. .Pp .It Fl U Ns Ar sym -Specify that a symbol is undefined. -If the same symbol appears in more than one argument, +Specify that a macro is undefined. +.Pp +If the same macro appears in more than one argument, the last occurrence dominates. .Pp .It Fl iD Ns Ar sym Ns Op = Ns Ar val @@ -193,9 +222,37 @@ are ignored within and .Ic #ifndef blocks -controlled by symbols +controlled by macros specified with these options. .Pp +.It Fl f Ar defile +The file +.Ar defile +contains +.Ic #define +and +.Ic #undef +preprocessor directives, +which have the same effect as the corresponding +.Fl D +and +.Fl U +command-line arguments. +You can have multiple +.Fl f +arguments and mix them with +.Fl D +and +.Fl U +arguments; +later options override earlier ones. +.Pp +Each directive must be on a single line. +Object-like macro definitions (without arguments) +are set to the given value. +Function-like macro definitions (with arguments) +are treated as if they are set to 1. +.Pp .It Fl b Replace removed lines with blank lines instead of deleting them. @@ -291,24 +348,15 @@ instead of the standard output when proc Instead of processing an input file as usual, this option causes .Nm -to produce a list of symbols that appear in expressions -that -.Nm -understands. -It is useful in conjunction with the -.Fl dM -option of -.Xr cpp 1 -for creating -.Nm -command lines. +to produce a list of macros that are used in +preprocessor directive controlling expressions. .Pp .It Fl S Like the .Fl s -option, but the nesting depth of each symbol is also printed. +option, but the nesting depth of each macro is also printed. This is useful for working out the number of possible combinations -of interdependent defined/undefined symbols. +of interdependent defined/undefined macros. .Pp .It Fl t Disables parsing for C strings, comments, @@ -406,6 +454,9 @@ in comment. .Sh SEE ALSO .Xr cpp 1 , .Xr diff 1 +.Pp +The unifdef home page is +.Pa http://dotat.at/prog/unifdef .Sh HISTORY The .Nm @@ -431,7 +482,7 @@ cannot be handled in every situation. .Pp Trigraphs are not recognized. .Pp -There is no support for symbols with different definitions at +There is no support for macros with different definitions at different points in the source file. .Pp The text-mode and ignore functionality does not correspond to modern Modified: head/usr.bin/unifdef/unifdef.c ============================================================================== --- head/usr.bin/unifdef/unifdef.c Wed Feb 25 20:47:25 2015 (r279296) +++ head/usr.bin/unifdef/unifdef.c Wed Feb 25 21:10:03 2015 (r279297) @@ -1,5 +1,5 @@ /* - * Copyright (c) 2002 - 2013 Tony Finch + * Copyright (c) 2002 - 2014 Tony Finch * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -46,7 +46,7 @@ #include "unifdef.h" static const char copyright[] = - "@(#) $Version: unifdef-2.7 $\n" + "@(#) $Version: unifdef-2.10 $\n" "@(#) $FreeBSD$\n" "@(#) $Author: Tony Finch (dot@dotat.at) $\n" "@(#) $URL: http://dotat.at/prog/unifdef $\n" @@ -138,7 +138,7 @@ static char const * const linestate_name */ #define MAXDEPTH 64 /* maximum #if nesting */ #define MAXLINE 4096 /* maximum length of line */ -#define MAXSYMS 4096 /* maximum number of symbols */ +#define MAXSYMS 16384 /* maximum number of symbols */ /* * Sometimes when editing a keyword the replacement text is longer, so @@ -180,6 +180,15 @@ static char *tempname; /* av static char tline[MAXLINE+EDITSLOP];/* input buffer plus space */ static char *keyword; /* used for editing #elif's */ +/* + * When processing a file, the output's newline style will match the + * input's, and unifdef correctly handles CRLF or LF endings whatever + * the platform's native style. The stdio streams are opened in binary + * mode to accommodate platforms whose native newline style is CRLF. + * When the output isn't a processed input file (when it is error / + * debug / diagnostic messages) then unifdef uses native line endings. + */ + static const char *newline; /* input file format */ static const char newline_unix[] = "\n"; static const char newline_crlf[] = "\r\n"; @@ -200,33 +209,41 @@ static bool firstsym; /* di static int exitmode; /* exit status mode */ static int exitstat; /* program exit status */ -static void addsym(bool, bool, char *); +static void addsym1(bool, bool, char *); +static void addsym2(bool, const char *, const char *); static char *astrcat(const char *, const char *); static void cleantemp(void); -static void closeout(void); +static void closeio(void); static void debug(const char *, ...); +static void debugsym(const char *, int); +static bool defundef(void); +static void defundefile(const char *); static void done(void); static void error(const char *); -static int findsym(const char *); +static int findsym(const char **); static void flushline(bool); static void hashline(void); static void help(void); static Linetype ifeval(const char **); static void ignoreoff(void); static void ignoreon(void); +static void indirectsym(void); static void keywordedit(const char *); +static const char *matchsym(const char *, const char *); static void nest(void); static Linetype parseline(void); static void process(void); static void processinout(const char *, const char *); static const char *skipargs(const char *); static const char *skipcomment(const char *); +static const char *skiphash(void); +static const char *skipline(const char *); static const char *skipsym(const char *); static void state(Ifstate); -static int strlcmp(const char *, const char *, size_t); static void unnest(void); static void usage(void); static void version(void); +static const char *xstrdup(const char *, const char *); #define endsym(c) (!isalnum((unsigned char)c) && c != '_') @@ -238,7 +255,7 @@ main(int argc, char *argv[]) { int opt; - while ((opt = getopt(argc, argv, "i:D:U:I:M:o:x:bBcdehKklmnsStV")) != -1) + while ((opt = getopt(argc, argv, "i:D:U:f:I:M:o:x:bBcdehKklmnsStV")) != -1) switch (opt) { case 'i': /* treat stuff controlled by these symbols as text */ /* @@ -248,17 +265,17 @@ main(int argc, char *argv[]) */ opt = *optarg++; if (opt == 'D') - addsym(true, true, optarg); + addsym1(true, true, optarg); else if (opt == 'U') - addsym(true, false, optarg); + addsym1(true, false, optarg); else usage(); break; case 'D': /* define a symbol */ - addsym(false, true, optarg); + addsym1(false, true, optarg); break; case 'U': /* undef a symbol */ - addsym(false, false, optarg); + addsym1(false, false, optarg); break; case 'I': /* no-op for compatibility with cpp */ break; @@ -278,6 +295,9 @@ main(int argc, char *argv[]) case 'e': /* fewer errors from dodgy lines */ iocccok = true; break; + case 'f': /* definitions file */ + defundefile(optarg); + break; case 'h': help(); break; @@ -334,6 +354,7 @@ main(int argc, char *argv[]) argc = 1; if (argc == 1 && !inplace && ofilename == NULL) ofilename = "-"; + indirectsym(); atexit(cleantemp); if (ofilename != NULL) @@ -392,10 +413,10 @@ processinout(const char *ifn, const char if (backext != NULL) { char *backname = astrcat(ofn, backext); if (rename(ofn, backname) < 0) - err(2, "can't rename \"%s\" to \"%s%s\"", ofn, ofn, backext); + err(2, "can't rename \"%s\" to \"%s\"", ofn, backname); free(backname); } - if (rename(tempname, ofn) < 0) + if (replace(tempname, ofn) < 0) err(2, "can't rename \"%s\" to \"%s\"", tempname, ofn); free(tempname); tempname = NULL; @@ -434,7 +455,7 @@ synopsis(FILE *fp) { fprintf(fp, "usage: unifdef [-bBcdehKkmnsStV] [-x{012}] [-Mext] [-opath] \\\n" - " [-[i]Dsym[=val]] [-[i]Usym] ... [file] ...\n"); + " [-[i]Dsym[=val]] [-[i]Usym] [-fpath] ... [file] ...\n"); } static void @@ -455,6 +476,7 @@ help(void) " -iDsym=val \\ ignore C strings and comments\n" " -iDsym ) in sections controlled by these\n" " -iUsym / preprocessor symbols\n" + " -fpath file containing #define and #undef directives\n" " -b blank lines instead of deleting them\n" " -B compress blank lines around deleted section\n" " -c complement (invert) keep vs. delete\n" @@ -650,12 +672,12 @@ done(void) { if (incomment) error("EOF in comment"); - closeout(); + closeio(); } /* * Write a line to the output or not, according to command line options. - * If writing fails, closeout() will print the error and exit. + * If writing fails, closeio() will print the error and exit. */ static void flushline(bool keep) @@ -671,19 +693,19 @@ flushline(bool keep) if (lnnum && delcount > 0) hashline(); if (fputs(tline, output) == EOF) - closeout(); + closeio(); delcount = 0; blankmax = blankcount = blankline ? blankcount + 1 : 0; } } else { if (lnblank && fputs(newline, output) == EOF) - closeout(); + closeio(); exitstat = 1; delcount += 1; blankcount = 0; } if (debugging && fflush(output) == EOF) - closeout(); + closeio(); } /* @@ -700,20 +722,21 @@ hashline(void) e = fprintf(output, "#line %d \"%s\"%s", linenum, linefile, newline); if (e < 0) - closeout(); + closeio(); } /* * Flush the output and handle errors. */ static void -closeout(void) +closeio(void) { /* Tidy up after findsym(). */ if (symdepth && !zerosyms) printf("\n"); - if (ferror(output) || fclose(output) == EOF) - err(2, "%s: can't write to output", filename); + if (output != NULL && (ferror(output) || fclose(output) == EOF)) + err(2, "%s: can't write to output", filename); + fclose(input); } /* @@ -727,6 +750,8 @@ process(void) is preceded by a large number of blank lines. */ blankmax = blankcount = 1000; zerosyms = true; + newline = NULL; + linenum = 0; while (lineval != LT_EOF) { lineval = parseline(); trans_table[ifstate[depth]][lineval](); @@ -746,105 +771,86 @@ parseline(void) { const char *cp; int cursym; - int kwlen; Linetype retval; Comment_state wascomment; - linenum++; - if (fgets(tline, MAXLINE, input) == NULL) { - if (ferror(input)) - err(2, "can't read %s", filename); - else - return (LT_EOF); - } + wascomment = incomment; + cp = skiphash(); + if (cp == NULL) + return (LT_EOF); if (newline == NULL) { if (strrchr(tline, '\n') == strrchr(tline, '\r') + 1) newline = newline_crlf; else newline = newline_unix; } - retval = LT_PLAIN; - wascomment = incomment; - cp = skipcomment(tline); - if (linestate == LS_START) { - if (*cp == '#') { - linestate = LS_HASH; - firstsym = true; - cp = skipcomment(cp + 1); - } else if (*cp != '\0') - linestate = LS_DIRTY; - } - if (!incomment && linestate == LS_HASH) { - keyword = tline + (cp - tline); - cp = skipsym(cp); - kwlen = cp - keyword; + if (*cp == '\0') { + retval = LT_PLAIN; + goto done; + } + keyword = tline + (cp - tline); + if ((cp = matchsym("ifdef", keyword)) != NULL || + (cp = matchsym("ifndef", keyword)) != NULL) { + cp = skipcomment(cp); + if ((cursym = findsym(&cp)) < 0) + retval = LT_IF; + else { + retval = (keyword[2] == 'n') + ? LT_FALSE : LT_TRUE; + if (value[cursym] == NULL) + retval = (retval == LT_TRUE) + ? LT_FALSE : LT_TRUE; + if (ignore[cursym]) + retval = (retval == LT_TRUE) + ? LT_TRUEI : LT_FALSEI; + } + } else if ((cp = matchsym("if", keyword)) != NULL) + retval = ifeval(&cp); + else if ((cp = matchsym("elif", keyword)) != NULL) + retval = linetype_if2elif(ifeval(&cp)); + else if ((cp = matchsym("else", keyword)) != NULL) + retval = LT_ELSE; + else if ((cp = matchsym("endif", keyword)) != NULL) + retval = LT_ENDIF; + else { + cp = skipsym(keyword); /* no way can we deal with a continuation inside a keyword */ if (strncmp(cp, "\\\r\n", 3) == 0 || strncmp(cp, "\\\n", 2) == 0) Eioccc(); - if (strlcmp("ifdef", keyword, kwlen) == 0 || - strlcmp("ifndef", keyword, kwlen) == 0) { - cp = skipcomment(cp); - if ((cursym = findsym(cp)) < 0) - retval = LT_IF; - else { - retval = (keyword[2] == 'n') - ? LT_FALSE : LT_TRUE; - if (value[cursym] == NULL) - retval = (retval == LT_TRUE) - ? LT_FALSE : LT_TRUE; - if (ignore[cursym]) - retval = (retval == LT_TRUE) - ? LT_TRUEI : LT_FALSEI; - } - cp = skipsym(cp); - } else if (strlcmp("if", keyword, kwlen) == 0) - retval = ifeval(&cp); - else if (strlcmp("elif", keyword, kwlen) == 0) - retval = linetype_if2elif(ifeval(&cp)); - else if (strlcmp("else", keyword, kwlen) == 0) - retval = LT_ELSE; - else if (strlcmp("endif", keyword, kwlen) == 0) - retval = LT_ENDIF; - else { - linestate = LS_DIRTY; - retval = LT_PLAIN; - } - cp = skipcomment(cp); - if (*cp != '\0') { + cp = skipline(cp); + retval = LT_PLAIN; + goto done; + } + cp = skipcomment(cp); + if (*cp != '\0') { + cp = skipline(cp); + if (retval == LT_TRUE || retval == LT_FALSE || + retval == LT_TRUEI || retval == LT_FALSEI) + retval = LT_IF; + if (retval == LT_ELTRUE || retval == LT_ELFALSE) + retval = LT_ELIF; + } + /* the following can happen if the last line of the file lacks a + newline or if there is too much whitespace in a directive */ + if (linestate == LS_HASH) { + long len = cp - tline; + if (fgets(tline + len, MAXLINE - len, input) == NULL) { + if (ferror(input)) + err(2, "can't read %s", filename); + /* append the missing newline at eof */ + strcpy(tline + len, newline); + cp += strlen(newline); + linestate = LS_START; + } else { linestate = LS_DIRTY; - if (retval == LT_TRUE || retval == LT_FALSE || - retval == LT_TRUEI || retval == LT_FALSEI) - retval = LT_IF; - if (retval == LT_ELTRUE || retval == LT_ELFALSE) - retval = LT_ELIF; - } - if (retval != LT_PLAIN && (wascomment || incomment)) { - retval = linetype_2dodgy(retval); - if (incomment) - linestate = LS_DIRTY; - } - /* skipcomment normally changes the state, except - if the last line of the file lacks a newline, or - if there is too much whitespace in a directive */ - if (linestate == LS_HASH) { - size_t len = cp - tline; - if (fgets(tline + len, MAXLINE - len, input) == NULL) { - if (ferror(input)) - err(2, "can't read %s", filename); - /* append the missing newline at eof */ - strcpy(tline + len, newline); - cp += strlen(newline); - linestate = LS_START; - } else { - linestate = LS_DIRTY; - } } } - if (linestate == LS_DIRTY) { - while (*cp != '\0') - cp = skipcomment(cp + 1); + if (retval != LT_PLAIN && (wascomment || linestate != LS_START)) { + retval = linetype_2dodgy(retval); + linestate = LS_DIRTY; } +done: debug("parser line %d state %s comment %s line", linenum, comment_name[incomment], linestate_name[linestate]); return (retval); @@ -854,34 +860,34 @@ parseline(void) * These are the binary operators that are supported by the expression * evaluator. */ -static Linetype op_strict(int *p, int v, Linetype at, Linetype bt) { +static Linetype op_strict(long *p, long v, Linetype at, Linetype bt) { if(at == LT_IF || bt == LT_IF) return (LT_IF); return (*p = v, v ? LT_TRUE : LT_FALSE); } -static Linetype op_lt(int *p, Linetype at, int a, Linetype bt, int b) { +static Linetype op_lt(long *p, Linetype at, long a, Linetype bt, long b) { return op_strict(p, a < b, at, bt); } -static Linetype op_gt(int *p, Linetype at, int a, Linetype bt, int b) { +static Linetype op_gt(long *p, Linetype at, long a, Linetype bt, long b) { return op_strict(p, a > b, at, bt); } -static Linetype op_le(int *p, Linetype at, int a, Linetype bt, int b) { +static Linetype op_le(long *p, Linetype at, long a, Linetype bt, long b) { return op_strict(p, a <= b, at, bt); } -static Linetype op_ge(int *p, Linetype at, int a, Linetype bt, int b) { +static Linetype op_ge(long *p, Linetype at, long a, Linetype bt, long b) { return op_strict(p, a >= b, at, bt); } -static Linetype op_eq(int *p, Linetype at, int a, Linetype bt, int b) { +static Linetype op_eq(long *p, Linetype at, long a, Linetype bt, long b) { return op_strict(p, a == b, at, bt); } -static Linetype op_ne(int *p, Linetype at, int a, Linetype bt, int b) { +static Linetype op_ne(long *p, Linetype at, long a, Linetype bt, long b) { return op_strict(p, a != b, at, bt); } -static Linetype op_or(int *p, Linetype at, int a, Linetype bt, int b) { +static Linetype op_or(long *p, Linetype at, long a, Linetype bt, long b) { if (!strictlogic && (at == LT_TRUE || bt == LT_TRUE)) return (*p = 1, LT_TRUE); return op_strict(p, a || b, at, bt); } -static Linetype op_and(int *p, Linetype at, int a, Linetype bt, int b) { +static Linetype op_and(long *p, Linetype at, long a, Linetype bt, long b) { if (!strictlogic && (at == LT_FALSE || bt == LT_FALSE)) return (*p = 0, LT_FALSE); return op_strict(p, a && b, at, bt); @@ -899,7 +905,7 @@ static Linetype op_and(int *p, Linetype */ struct ops; -typedef Linetype eval_fn(const struct ops *, int *, const char **); +typedef Linetype eval_fn(const struct ops *, long *, const char **); static eval_fn eval_table, eval_unary; @@ -912,7 +918,7 @@ static eval_fn eval_table, eval_unary; */ struct op { const char *str; - Linetype (*fn)(int *, Linetype, int, Linetype, int); + Linetype (*fn)(long *, Linetype, long, Linetype, long); }; struct ops { eval_fn *inner; @@ -930,7 +936,7 @@ static const struct ops eval_ops[] = { }; /* Current operator precedence level */ -static int prec(const struct ops *ops) +static long prec(const struct ops *ops) { return (ops - eval_ops); } @@ -941,7 +947,7 @@ static int prec(const struct ops *ops) * We reset the constexpr flag in the last two cases. */ static Linetype -eval_unary(const struct ops *ops, int *valp, const char **cpp) +eval_unary(const struct ops *ops, long *valp, const char **cpp) { const char *cp; char *ep; @@ -975,32 +981,33 @@ eval_unary(const struct ops *ops, int *v if (ep == cp) return (LT_ERROR); lt = *valp ? LT_TRUE : LT_FALSE; - cp = skipsym(cp); - } else if (strncmp(cp, "defined", 7) == 0 && endsym(cp[7])) { + cp = ep; + } else if (matchsym("defined", cp) != NULL) { cp = skipcomment(cp+7); - debug("eval%d defined", prec(ops)); if (*cp == '(') { cp = skipcomment(cp+1); defparen = true; } else { defparen = false; } - sym = findsym(cp); + sym = findsym(&cp); + cp = skipcomment(cp); + if (defparen && *cp++ != ')') { + debug("eval%d defined missing ')'", prec(ops)); + return (LT_ERROR); + } if (sym < 0) { + debug("eval%d defined unknown", prec(ops)); lt = LT_IF; } else { + debug("eval%d defined %s", prec(ops), symname[sym]); *valp = (value[sym] != NULL); lt = *valp ? LT_TRUE : LT_FALSE; } - cp = skipsym(cp); - cp = skipcomment(cp); - if (defparen && *cp++ != ')') - return (LT_ERROR); constexpr = false; } else if (!endsym(*cp)) { debug("eval%d symbol", prec(ops)); - sym = findsym(cp); - cp = skipsym(cp); + sym = findsym(&cp); if (sym < 0) { lt = LT_IF; cp = skipargs(cp); @@ -1029,11 +1036,11 @@ eval_unary(const struct ops *ops, int *v * Table-driven evaluation of binary operators. */ static Linetype -eval_table(const struct ops *ops, int *valp, const char **cpp) +eval_table(const struct ops *ops, long *valp, const char **cpp) { const struct op *op; const char *cp; - int val; + long val; Linetype lt, rt; debug("eval%d", prec(ops)); @@ -1071,7 +1078,7 @@ static Linetype ifeval(const char **cpp) { Linetype ret; - int val = 0; + long val = 0; debug("eval %s", *cpp); constexpr = killconsts ? false : true; @@ -1081,6 +1088,49 @@ ifeval(const char **cpp) } /* + * Read a line and examine its initial part to determine if it is a + * preprocessor directive. Returns NULL on EOF, or a pointer to a + * preprocessor directive name, or a pointer to the zero byte at the + * end of the line. + */ +static const char * +skiphash(void) +{ + const char *cp; + + linenum++; + if (fgets(tline, MAXLINE, input) == NULL) { + if (ferror(input)) + err(2, "can't read %s", filename); + else + return (NULL); + } + cp = skipcomment(tline); + if (linestate == LS_START && *cp == '#') { + linestate = LS_HASH; + return (skipcomment(cp + 1)); + } else if (*cp == '\0') { + return (cp); + } else { + return (skipline(cp)); + } +} + +/* + * Mark a line dirty and consume the rest of it, keeping track of the + * lexical state. + */ +static const char * +skipline(const char *cp) +{ + if (*cp != '\0') + linestate = LS_DIRTY; + while (*cp != '\0') + cp = skipcomment(cp + 1); + return (cp); +} + +/* * Skip over comments, strings, and character literals and stop at the * next character position that is not whitespace. Between calls we keep * the comment state in the global variable incomment, and we also adjust @@ -1233,33 +1283,69 @@ skipsym(const char *cp) } /* + * Skip whitespace and take a copy of any following identifier. + */ +static const char * +getsym(const char **cpp) +{ + const char *cp = *cpp, *sym; + + cp = skipcomment(cp); + cp = skipsym(sym = cp); + if (cp == sym) + return NULL; + *cpp = cp; + return (xstrdup(sym, cp)); +} + +/* + * Check that s (a symbol) matches the start of t, and that the + * following character in t is not a symbol character. Returns a + * pointer to the following character in t if there is a match, + * otherwise NULL. + */ +static const char * +matchsym(const char *s, const char *t) +{ + while (*s != '\0' && *t != '\0') + if (*s != *t) + return (NULL); + else + ++s, ++t; + if (*s == '\0' && endsym(*t)) + return(t); + else + return(NULL); +} + +/* * Look for the symbol in the symbol table. If it is found, we return * the symbol table index, else we return -1. */ static int -findsym(const char *str) +findsym(const char **strp) { - const char *cp; + const char *str; int symind; - cp = skipsym(str); - if (cp == str) - return (-1); + str = *strp; + *strp = skipsym(str); if (symlist) { + if (*strp == str) + return (-1); if (symdepth && firstsym) printf("%s%3d", zerosyms ? "" : "\n", depth); firstsym = zerosyms = false; printf("%s%.*s%s", - symdepth ? " " : "", - (int)(cp-str), str, - symdepth ? "" : "\n"); + symdepth ? " " : "", + (int)(*strp-str), str, + symdepth ? "" : "\n"); /* we don't care about the value of the symbol */ return (0); } for (symind = 0; symind < nsyms; ++symind) { - if (strlcmp(symname[symind], str, cp-str) == 0) { - debug("findsym %s %s", symname[symind], - value[symind] ? value[symind] : ""); + if (matchsym(symname[symind], str) != NULL) { + debugsym("findsym", symind); return (symind); } } @@ -1267,53 +1353,155 @@ findsym(const char *str) } /* + * Resolve indirect symbol values to their final definitions. + */ +static void +indirectsym(void) +{ + const char *cp; + int changed, sym, ind; + + do { + changed = 0; + for (sym = 0; sym < nsyms; ++sym) { + if (value[sym] == NULL) + continue; + cp = value[sym]; + ind = findsym(&cp); + if (ind == -1 || ind == sym || + *cp != '\0' || + value[ind] == NULL || + value[ind] == value[sym]) + continue; + debugsym("indir...", sym); + value[sym] = value[ind]; + debugsym("...ectsym", sym); + changed++; + } + } while (changed); +} + +/* + * Add a symbol to the symbol table, specified with the format sym=val + */ +static void +addsym1(bool ignorethis, bool definethis, char *symval) +{ + const char *sym, *val; + + sym = symval; + val = skipsym(sym); + if (definethis && *val == '=') { + symval[val - sym] = '\0'; + val = val + 1; + } else if (*val == '\0') { + val = definethis ? "1" : NULL; + } else { + usage(); + } + addsym2(ignorethis, sym, val); +} + +/* * Add a symbol to the symbol table. */ static void -addsym(bool ignorethis, bool definethis, char *sym) +addsym2(bool ignorethis, const char *sym, const char *val) { + const char *cp = sym; int symind; - char *val; - symind = findsym(sym); + symind = findsym(&cp); if (symind < 0) { if (nsyms >= MAXSYMS) errx(2, "too many symbols"); symind = nsyms++; } - symname[symind] = sym; ignore[symind] = ignorethis; - val = sym + (skipsym(sym) - sym); - if (definethis) { - if (*val == '=') { - value[symind] = val+1; - *val = '\0'; - } else if (*val == '\0') - value[symind] = "1"; - else - usage(); *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***