Date: Wed, 11 Apr 2012 22:59:55 GMT
From: Jim Pryor <dubiousjim@gmail.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: bin/166861: bsdgrep -E and sed handle invalid {} constructs strangely
Message-ID: <201204112259.q3BMxt4O052971@red.freebsd.org>
Resent-Message-ID: <201204112300.q3BN0Skn077710@freefall.freebsd.org>

>Number:         166861
>Category:       bin
>Synopsis:       bsdgrep -E and sed handle invalid {} constructs strangely
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Wed Apr 11 23:00:28 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator:     Jim Pryor
>Release:        9.0-PRERELEASE
>Organization:
>Environment:
FreeBSD vaio.jimpryor.net 9.0-PRERELEASE FreeBSD 9.0-PRERELEASE #0: Tue Nov 29 02:45:33 EST 2011 root@vaio.jimpryor.net:/usr/obj/usr/src/sys/MINE amd64
>Description:
grep version line: /* $FreeBSD: src/usr.bin/grep/grep.c,v 1.11.2.3 2011/10/20 16:08:11 gabor Exp $
sed version line: $FreeBSD: src/usr.bin/sed/main.c,v 1.45.2.1 2011/09/23 00:51:37 kensmith Exp $

(1) FreeBSD grep without -E rejects an unmatched '\{' as an invalid pattern, but treats an unmatched '\}' as a literal '}'. So far, so good: GNU grep and BusyBox grep handle these the same way, and POSIX-2008 doesn't specify what to do here. BusyBox's egrep sticks to the same pattern. But FreeBSD's egrep diverges: it treats unmatched { and unmatched } both as literals. These are perverse patterns and no one should be relying on this behavior; however, FreeBSD's change of behavior here seems unmotivated. Admittedly, GNU egrep does the same as FreeBSD.

(2) FreeBSD grep without -E follows the other greps in rejecting 'a\{1,2,3\}b' as an invalid pattern. The other egreps likewise reject 'a{1,2,3}b'. FreeBSD egrep, however, accepts 'a{1,2,3}b' and will match it against the text "a{1,2,3}b", though the match is zero-length. Again, this is a perverse pattern whose interpretation no one should be relying on, but FreeBSD's handling of it seems strange.

(3) The pattern among other sed implementations is:

    without -r: reject unmatched \{ as error, accept unmatched \} as literal
                reject \{\}, \{2,1\}, and \{1,2,3\}
    with -r:    reject unmatched { as error, accept unmatched } as literal
                reject {}, {2,1}, and {1,2,3}

However, FreeBSD sed without -r diverges from the pattern in rejecting unmatched \} as error.

(4) Also, FreeBSD sed with -r diverges from the pattern in accepting {} as those two literal characters.
>How-To-Repeat:
See above.
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
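
The cases in (1) through (4) can be checked on any system with a small probe script. The sketch below is not from the original report: the function names classify_grep, classify_sed, and report are hypothetical, and the concrete patterns and sample texts (for example 'a\{1' for an "unmatched \{") are illustrative guesses at what the report describes. It assumes that sed -E is accepted as the spelling of the report's -r, that patterns contain no '/', and that both tools exit nonzero on an invalid pattern. It only reports what the local tools do; it does not assert which behavior is correct.

```shell
#!/bin/sh
# Probe how the local grep and sed treat odd brace constructs.
# MODE is "bre" (no -E) or "ere" (-E); anything else is treated as "bre".

# classify_grep MODE PATTERN TEXT
# Prints match, no-match, or error from grep's exit status
# (0 = a line was selected, 1 = none selected, >1 = error).
classify_grep() {
    cg_mode=$1
    cg_pattern=$2
    cg_text=$3
    if [ "$cg_mode" = ere ]; then
        printf '%s\n' "$cg_text" | grep -E -e "$cg_pattern" >/dev/null 2>&1
        cg_status=$?
    else
        printf '%s\n' "$cg_text" | grep -e "$cg_pattern" >/dev/null 2>&1
        cg_status=$?
    fi
    case $cg_status in
        0) echo match ;;
        1) echo no-match ;;
        *) echo error ;;
    esac
}

# classify_sed MODE PATTERN TEXT
# Prints match, no-match, or error. Assumes PATTERN contains no '/'.
# A nonzero exit from sed is taken to mean the pattern was rejected.
classify_sed() {
    cs_mode=$1
    cs_pattern=$2
    cs_text=$3
    if [ "$cs_mode" = ere ]; then
        cs_out=$(printf '%s\n' "$cs_text" | sed -E -n "/$cs_pattern/p" 2>/dev/null)
        cs_status=$?
    else
        cs_out=$(printf '%s\n' "$cs_text" | sed -n "/$cs_pattern/p" 2>/dev/null)
        cs_status=$?
    fi
    if [ "$cs_status" -ne 0 ]; then
        echo error
    elif [ -n "$cs_out" ]; then
        echo match
    else
        echo no-match
    fi
}

# report MODE PATTERN TEXT: one line per case, showing both tools.
report() {
    printf '%-4s %-14s grep=%-9s sed=%s\n' "$1" "$2" \
        "$(classify_grep "$1" "$2" "$3")" "$(classify_sed "$1" "$2" "$3")"
}

# Illustrative cases only; the sample texts are made up for this sketch.
report bre 'a\{1'        'a{1'
report bre 'a\}'         'a}'
report bre 'a\{\}'       'a{}'
report bre 'a\{2,1\}'    'aa'
report bre 'a\{1,2,3\}b' 'a{1,2,3}b'
report ere 'a{1'         'a{1'
report ere 'a}'          'a}'
report ere 'a{}'         'a{}'
report ere 'a{2,1}'      'aa'
report ere 'a{1,2,3}b'   'a{1,2,3}b'
```

Running the same script under FreeBSD, GNU, and BusyBox userlands and comparing the output line by line should show the divergences described above, if they are still present. Note that "match" here does not distinguish a zero-length match from a full one.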