Date: Sat, 12 Jul 2003 12:31:49 +0200 (CEST)
From: Jens Schweikhardt <schweikh@schweikhardt.net>
To: FreeBSD-gnats-submit@FreeBSD.org
Subject: standards/54410: one-true-awk not POSIX compliant (no extended REs)
Message-ID: <200307121031.h6CAVnIi093652@hal9000.schweikhardt.net>
Resent-Message-ID: <200307121040.h6CAe9kr085263@freefall.freebsd.org>
>Number:         54410
>Category:       standards
>Synopsis:       one-true-awk not POSIX compliant (no extended REs)
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-standards
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Jul 12 03:40:09 PDT 2003
>Closed-Date:
>Last-Modified:
>Originator:     Jens Schweikhardt
>Release:        FreeBSD 5.1-CURRENT i386
>Organization:
Digital Details
>Environment:
System: FreeBSD hal9000.schweikhardt.net 5.1-CURRENT FreeBSD 5.1-CURRENT #0: Wed Jul 9 21:22:46 CEST 2003 toor@hal9000.schweikhardt.net:/usr/obj/share/src/HEAD/sys/HAL9000 i386

any
>Description:

Our /usr/bin/awk understands only basic REs, not the extended REs
required by IEEE Std 1003.1-2001.

References:

<quote std="IEEE Std 1003.1-2001" section=awk>
...
Regular Expressions

The awk utility shall make use of the extended regular expression
notation (see the Base Definitions volume of IEEE Std 1003.1-2001,
Section 9.4, Extended Regular Expressions)
</quote>

<quote std="IEEE Std 1003.1-2001" section=ere>
EREs Matching Multiple Characters
...
5. When an ERE matching a single character or an ERE enclosed in
   parentheses is followed by an interval expression of the format
   "{m}", "{m,}", or "{m,n}", together with that interval expression
   it shall match what repeated consecutive occurrences of the ERE
   would match. The values of m and n are decimal integers in the
   range 0 <= m <= n <= {RE_DUP_MAX}, where m specifies the exact or
   minimum number of occurrences and n specifies the maximum number of
   occurrences. The expression "{m}" matches exactly m occurrences of
   the preceding ERE, "{m,}" matches at least m occurrences, and
   "{m,n}" matches any number of occurrences between m and n,
   inclusive.
</quote>
>How-To-Repeat:

echo e | /usr/bin/awk '/e{1}/'
# should print e, but prints nothing
>Fix:

It's probably a POLA violation to change the default RE style from BRE
to ERE, but we should add a POSIX mode that uses ERE (e.g. gawk needs
--posix to be compliant).
>Release-Note:
>Audit-Trail:
>Unformatted:
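
The How-To-Repeat case can be wrapped in a small script to check an arbitrary awk binary. This is only a sketch: the function names check_awk_interval and ere_reference are made up for illustration, and using grep -E as the reference assumes the local grep implements POSIX EREs, including interval expressions.

```shell
#!/bin/sh
# Hypothetical helper (not part of any FreeBSD tool): report whether the
# given awk binary honours the ERE interval expression {1}.
# Usage: check_awk_interval /usr/bin/awk
check_awk_interval() {
    awk_bin=${1:-awk}
    # Same test as in How-To-Repeat; a compliant awk prints "e".
    out=$(echo e | "$awk_bin" '/e{1}/' 2>/dev/null) || out=""
    if [ "$out" = "e" ]; then
        echo "interval expressions supported"
    else
        echo "interval expressions NOT supported"
    fi
}

# Reference behaviour: grep -E matches with EREs, so this prints "e".
ere_reference() {
    echo e | grep -E 'e{1}'
}

# Check the awk named in $AWK, falling back to the first awk in PATH.
check_awk_interval "${AWK:-awk}"
```

Running it with AWK=/usr/bin/awk tests the system awk; an awk that treats the braces literally takes the "NOT supported" branch.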