From owner-freebsd-arch Fri Nov 2 18:21:56 2001
To: arch@freebsd.org
Subject: POSIX character class support for 1Tawk
From: Dag-Erling Smorgrav
Date: 03 Nov 2001 03:21:50 +0100
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="

--=-=-=

See attached patch (which I've also submitted to bwk).

DES
-- 
Dag-Erling Smorgrav - des@ofug.org

--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment; filename=awk.diff

Index: b.c
===================================================================
RCS file: /home/ncvs/src/contrib/one-true-awk/b.c,v
retrieving revision 1.1.1.1
diff -u -r1.1.1.1 b.c
--- b.c	27 Oct 2001 08:07:37 -0000	1.1.1.1
+++ b.c	3 Nov 2001 01:14:16 -0000
@@ -683,12 +683,44 @@
 	}
 }
 
+/*
+ * Character class definitions conformant to the POSIX locale as
+ * defined in IEEE P1003.1 draft 7 of June 2001, assuming the source
+ * and operating character sets are both ASCII (ISO646) or supersets
+ * thereof.
+ *
+ * Note that to avoid overflowing the temporary buffer used in
+ * relex(), the expanded character class (prior to range expansion)
+ * must be less than twice the size of their full name.
+ */
+struct charclass {
+	const uschar *cc_name;
+	int cc_namelen;
+	const uschar *cc_expand;
+} charclasses[] = {
+	{ "alnum", 5, "0-9A-Za-z" },
+	{ "alpha", 5, "A-Za-z" },
+	{ "blank", 5, " \t" },
+	{ "cntrl", 5, "\000-\037\177" },
+	{ "digit", 5, "0-9" },
+	{ "graph", 5, "\041-\176" },
+	{ "lower", 5, "a-z" },
+	{ "print", 5, " \041-\176" },
+	{ "punct", 5, "\041-\057\072-\100\133-\140\173-\176" },
+	{ "space", 5, " \f\n\r\t\v" },
+	{ "upper", 5, "A-Z" },
+	{ "xdigit", 6, "0-9A-Fa-f" },
+	{ NULL, 0, NULL },
+};
+
 int relex(void)	/* lexical analyzer for reparse */
 {
+	struct charclass *cc;
 	int c, n;
 	int cflag;
 	static uschar *buf = 0;
 	static int bufsz = 100;
+	const uschar *p;
 	uschar *bp;
 
 	switch (c = *prestr++) {
@@ -730,6 +762,17 @@
 				*bp++ = c;
 			/* } else if (c == '\n') { */
 			/* FATAL("newline in character class %.20s...", lastre); */
+			} else if (c == '[' && *prestr == ':') {
+				for (cc = charclasses; cc->cc_name; cc++)
+					if (strncmp(prestr + 1, cc->cc_name, cc->cc_namelen) == 0)
+						break;
+				if (cc->cc_name != NULL && prestr[1 + cc->cc_namelen] == ':' &&
+				    prestr[2 + cc->cc_namelen] == ']') {
+					prestr += cc->cc_namelen + 3;
+					for (p = cc->cc_expand; *p; p++)
+						*bp++ = *p;
+				} else
+					*bp++ = c;
 			} else if (c == '\0') {
 				FATAL("nonterminated character class %.20s", lastre);
 			} else if (bp == buf) {	/* 1st char is special */

--=-=-=--
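
The lookup-and-expand step from the second hunk can be tried outside awk. What follows is a minimal sketch, not part of the patch. The function name `expand_class()` and its output-buffer arguments are hypothetical, plain `char` stands in for awk's `uschar`, and the `cntrl` entry in this sketch starts at `\001` rather than `\000`, because the sketch copies NUL-terminated strings and a leading NUL byte would end the copy before anything is written.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative copy of the patch's table, using plain char. */
struct charclass {
	const char *cc_name;
	size_t cc_namelen;
	const char *cc_expand;
};

static const struct charclass charclasses[] = {
	{ "alnum", 5, "0-9A-Za-z" },
	{ "alpha", 5, "A-Za-z" },
	{ "blank", 5, " \t" },
	{ "cntrl", 5, "\001-\037\177" },	/* assumption: \001, not \000 */
	{ "digit", 5, "0-9" },
	{ "graph", 5, "\041-\176" },
	{ "lower", 5, "a-z" },
	{ "print", 5, " \041-\176" },
	{ "punct", 5, "\041-\057\072-\100\133-\140\173-\176" },
	{ "space", 5, " \f\n\r\t\v" },
	{ "upper", 5, "A-Z" },
	{ "xdigit", 6, "0-9A-Fa-f" },
	{ NULL, 0, NULL },
};

/*
 * s points just past a '[' inside a bracket expression.  If s starts
 * with ":name:]" for a known class, copy the expansion (including the
 * terminating NUL) into out and return the number of input bytes
 * consumed.  Return 0 if there is no match or out is too small.
 */
static size_t
expand_class(const char *s, char *out, size_t outsz)
{
	const struct charclass *cc;
	size_t explen;

	if (s[0] != ':')
		return 0;
	for (cc = charclasses; cc->cc_name != NULL; cc++)
		if (strncmp(s + 1, cc->cc_name, cc->cc_namelen) == 0)
			break;
	/* The name must be followed immediately by ":]". */
	if (cc->cc_name == NULL || s[1 + cc->cc_namelen] != ':' ||
	    s[2 + cc->cc_namelen] != ']')
		return 0;
	explen = strlen(cc->cc_expand);
	if (explen + 1 > outsz)
		return 0;
	memcpy(out, cc->cc_expand, explen + 1);
	return cc->cc_namelen + 3;	/* ':' + name + ":]" */
}
```

Given input positioned just after a `[`, a call such as `expand_class(":digit:]", out, sizeof out)` would write `0-9` to `out` and report 8 bytes consumed, which corresponds to the `prestr += cc->cc_namelen + 3` step in the patch. The explicit size check replaces the patch's reliance on the "less than twice the size of the full name" rule stated in its comment.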