Date: Thu, 18 Apr 2002 23:18:46 -0700 (PDT)
From: Wesley Irish <wirish@coyotehillconsulting.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: ports/37241: character ranges in regular expressions in nawk match one beyond the given range
Message-ID: <200204190618.g3J6Ik430208@freefall.freebsd.org>
>Number:         37241
>Category:       ports
>Synopsis:       character ranges in regular expressions in nawk match one beyond the given range
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-ports
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Apr 18 23:20:01 PDT 2002
>Closed-Date:
>Last-Modified:
>Originator:     Wesley Irish
>Release:        4.4-RELEASE
>Organization:
Coyote Hill Consulting LLC
>Environment:
FreeBSD wilee.coyotehillconsulting.com 4.4-RELEASE FreeBSD 4.4-RELEASE #0: Tue Sep 18 11:57:08 PDT 2001 murray@builder.FreeBSD.org:/usr/src/sys/compile/GENERIC i386
>Description:
Character ranges in regular expression patterns are handled incorrectly:
they extend one character beyond the intended end of the range. I have
tested this for [0-9], [a-z], and [A-Z] (to mention a few). I have also
tested multiple ranges, such as [a-zA-Z], and the problem applies to both
ranges within the same pattern. The pattern [a-zA-Z] matches letters and
also the characters "[" and "{".

I am running nawk: PORTVERSION= 20001115
>How-To-Repeat:
Run the command:

    nawk '{printf("\"%s\" ~ \"%s\" == %s\n", $1, $2, $1 ~ $2)}'

This one-liner checks whether the first field of each input line is
matched by the pattern in the second field. Enter the following data to
test for the problem:

    0 [0-9]
    9 [0-9]
    : [0-9]
    ; [0-9]
    / [0-9]

You will see that "0" and "9" match, as they should. But you will also see
that ":" matches, which it shouldn't. The ";" does not match, because the
problem extends only one character beyond the range. The "/" does not
match, indicating that there does not appear to be a problem at the
beginning of the range.
>Fix:
No fix is known. The workaround is presently VERY awkward: every character
range in the source must either be written to end one character earlier
than intended, such as [A-Y], or be explicitly enumerated.

I apologize in advance if the problem is due to my not running the very
latest version of FreeBSD and ports, or if this is an already known and
fixed bug. I tried to check a bug report log, but I must not know how or
where to look. If that is the case, a pointer would be greatly
appreciated. Thank you.
>Release-Note:
>Audit-Trail:
>Unformatted:

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-ports" in the body of the message
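
The How-To-Repeat steps can also be run non-interactively. The following is a minimal sketch and not part of the original report: it feeds the same five test lines to awk, with a third column added to hold the expected result (1 for a match, 0 for no match, as stated in the description), and flags any disagreement. The AWK variable and the check_ranges function name are assumptions introduced for illustration; set AWK=nawk to exercise the affected port.

```shell
#!/bin/sh
# Non-interactive version of the How-To-Repeat test.
# AWK is a hypothetical override; it defaults to whatever "awk" is on PATH.
AWK="${AWK:-awk}"

# check_ranges: print one line per test case and return non-zero if any
# case disagrees with the expected result in the third column.
check_ranges() {
    printf '%s\n' \
        '0 [0-9] 1' \
        '9 [0-9] 1' \
        ': [0-9] 0' \
        '; [0-9] 0' \
        '/ [0-9] 0' |
    LC_ALL=C "$AWK" '{
        # $1 is the test character, $2 the pattern, $3 the expected result.
        got = ($1 ~ $2) ? 1 : 0
        verdict = (got == $3) ? "ok" : "FAIL"
        if (verdict == "FAIL") bad = 1
        printf("\"%s\" ~ \"%s\" == %d, expected %d (%s)\n", $1, $2, got, $3, verdict)
    }
    END { exit bad }'
}

check_ranges
```

With an awk that handles ranges correctly, all five lines end in (ok) and the function returns 0. The behaviour described in this report would show up as a FAIL on the ":" line.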