Date:      Thu, 18 Apr 2002 23:18:46 -0700 (PDT)
From:      Wesley Irish <wirish@coyotehillconsulting.com>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   ports/37241: character ranges in regular expressions in nawk match one beyond the given range
Message-ID:  <200204190618.g3J6Ik430208@freefall.freebsd.org>

>Number:         37241
>Category:       ports
>Synopsis:       character ranges in regular expressions in nawk match one beyond the given range
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-ports
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Apr 18 23:20:01 PDT 2002
>Closed-Date:
>Last-Modified:
>Originator:     Wesley Irish
>Release:        4.4-RELEASE
>Organization:
Coyote Hill Consulting LLC
>Environment:
FreeBSD wilee.coyotehillconsulting.com 4.4-RELEASE FreeBSD 4.4-RELEASE #0: Tue Sep 18 11:57:08 PDT 2001     murray@builder.FreeBSD.org:/usr/src/sys/compile/GENERIC  i386
>Description:
Character ranges in regular expression patterns are incorrect: each range extends one character beyond its intended end. I have tested this with [0-9], [a-z], and [A-Z], among others. I have also tested multiple ranges in one pattern, such as [a-zA-Z], and the problem applies to both ranges. The pattern [a-zA-Z] matches letters and also the characters "[" and "{".

I am running nawk: PORTVERSION=    20001115
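
The extra characters reported above are the ones that immediately follow the end of each range in ASCII: ":" follows "9", "[" follows "Z", and "{" follows "z". A minimal sketch for checking this from a POSIX shell (next_char is a hypothetical helper name, not part of nawk, and it assumes an ASCII locale):

```shell
#!/bin/sh
# next_char is a hypothetical helper: it prints the character whose
# code is one greater than that of its single-character argument.
next_char() {
    # A leading quote makes printf %d yield the character's numeric code.
    code=$(printf '%d' "'$1")
    # Convert code + 1 to a three-digit octal escape and print it.
    printf "\\$(printf '%03o' $((code + 1)))\n"
}

next_char 9    # the character just past the end of [0-9]
next_char Z    # the character just past the end of [A-Z]
next_char z    # the character just past the end of [a-z]
```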
>How-To-Repeat:
Run the command:

nawk '{printf("\"%s\" ~ \"%s\" == %s\n", $1, $2, $1 ~ $2)}'

This one-liner reports whether the first field of each input line matches the pattern given in the second field (1 for a match, 0 otherwise). Enter the following lines to test for the problem:

0 [0-9]
9 [0-9]
: [0-9]
; [0-9]
/ [0-9]

You will see that "0" and "9" match, as they should.
But you will also see that ":" matches, which it shouldn't.
The ";" does not match, since the problem extends only one character beyond the range. The "/" does not match either, indicating that there does not appear to be a problem at the beginning of the range.
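
The same test can be run without typing the data by hand. The following is only a sketch: it pipes the five test lines into the one-liner from this report. AWK is an assumed override variable (not something nawk defines) so the interpreter under test can be chosen, for example AWK=nawk, and check_range is a hypothetical function name.

```shell
#!/bin/sh
# AWK is an assumed override variable; set AWK=nawk to test the port.
AWK=${AWK:-awk}

# check_range (hypothetical name) feeds the five test lines to the
# one-liner from this report and prints one result line per input line.
check_range() {
    printf '%s\n' '0 [0-9]' '9 [0-9]' ': [0-9]' '; [0-9]' '/ [0-9]' |
        "$AWK" '{ printf("\"%s\" ~ \"%s\" == %s\n", $1, $2, $1 ~ $2) }'
}

check_range
```

An interpreter without the problem should report 0 for the ":" line; the nawk described here reports 1.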

>Fix:
Presently VERY awkward: every character range in the source must be modified to end one character earlier than intended, such as [A-Y], or be explicitly enumerated.
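
As an illustration of the explicit enumeration mentioned above, a digit test can list every character instead of writing [0-9]. This is a sketch of the workaround only, not a fix for nawk itself; is_digit is a hypothetical helper name and AWK is an assumed override variable.

```shell
#!/bin/sh
# AWK is an assumed override variable; set AWK=nawk to test the port.
AWK=${AWK:-awk}

# is_digit (hypothetical name) succeeds only if its argument is exactly
# one digit. The bracket expression lists each digit, so no range is used.
is_digit() {
    printf '%s\n' "$1" |
        "$AWK" '{ exit ($0 ~ /^[0123456789]$/) ? 0 : 1 }'
}

is_digit 9   && echo '9 is a digit'
is_digit ':' || echo ': is not a digit'
```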

I apologize in advance if the problem is due to my not running the very latest version of FreeBSD and ports, or if this is an already known and fixed bug. I tried to check a bug report log, but I must not know how or where to check. If this is the case, a pointer would be greatly appreciated. Thank you.
>Release-Note:
>Audit-Trail:
>Unformatted:
