Date:      Wed, 11 Apr 2012 22:59:55 GMT
From:      Jim Pryor <dubiousjim@gmail.com>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   bin/166861: bsdgrep -E and sed handle invalid {} constructs strangely
Message-ID:  <201204112259.q3BMxt4O052971@red.freebsd.org>
Resent-Message-ID: <201204112300.q3BN0Skn077710@freefall.freebsd.org>


>Number:         166861
>Category:       bin
>Synopsis:       bsdgrep -E and sed handle invalid {} constructs strangely
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Wed Apr 11 23:00:28 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator:     Jim Pryor
>Release:        9.0-PRERELEASE
>Organization:
>Environment:
FreeBSD vaio.jimpryor.net 9.0-PRERELEASE FreeBSD 9.0-PRERELEASE #0: Tue Nov 29 02:45:33 EST 2011 root@vaio.jimpryor.net:/usr/obj/usr/src/sys/MINE amd64
>Description:
grep version line:
/* $FreeBSD: src/usr.bin/grep/grep.c,v 1.11.2.3 2011/10/20 16:08:11 gabor Exp $

sed version line:
$FreeBSD: src/usr.bin/sed/main.c,v 1.45.2.1 2011/09/23 00:51:37 kensmith Exp $

(1) FreeBSD grep without -E rejects an unmatched '\{' as an invalid pattern, but treats an unmatched '\}' as a literal '}'. So far, so good. This is also how GNU grep and BusyBox grep handle these; POSIX.1-2008 doesn't specify what to do here.

BusyBox's egrep sticks to the same pattern: it rejects an unmatched { and treats an unmatched } as a literal. But FreeBSD's egrep diverges: it treats an unmatched { and an unmatched } both as literals. These are perverse patterns and no one should be relying on this behavior; however, FreeBSD's change of behavior between grep and egrep seems unmotivated. Admittedly, GNU egrep does the same as FreeBSD.

(2) FreeBSD grep without -E follows the other greps in rejecting 'a\{1,2,3\}b' as an invalid pattern. The other egreps likewise reject 'a{1,2,3}b'. FreeBSD's egrep, however, accepts 'a{1,2,3}b' and will even match it against the text "a{1,2,3}b", though the match is zero-length. Again, this is a perverse pattern whose interpretation no one should be relying on, but FreeBSD's handling of it seems strange.
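
The grep cases in (1) and (2) can be probed with a small POSIX sh sketch along the following lines. The helper name probe_grep is made up for illustration and is not part of any grep distribution. It only classifies grep's exit status (0 = match, 1 = no match, anything else = error). No expected results are listed, because the outcome depends on which grep is installed.

```sh
#!/bin/sh
# probe_grep (hypothetical helper): report how grep treats a pattern.
# Usage: probe_grep FLAG PATTERN TEXT    (FLAG is "" or "-E")
# Prints "match", "no match", or "error" according to grep's exit status.
probe_grep() {
    flag=$1 pattern=$2 text=$3
    # ${flag:+"$flag"} drops the argument entirely when FLAG is empty.
    printf '%s\n' "$text" | grep ${flag:+"$flag"} -e "$pattern" >/dev/null 2>&1
    case $? in
        0) echo "match" ;;
        1) echo "no match" ;;
        *) echo "error" ;;
    esac
}

# Case (1): unmatched braces, without and with -E.
printf '%-8s %-14s %s\n' "grep"    'a\{' "$(probe_grep ""   'a\{' 'a{')"
printf '%-8s %-14s %s\n' "grep"    'a\}' "$(probe_grep ""   'a\}' 'a}')"
printf '%-8s %-14s %s\n' "grep -E" 'a{'  "$(probe_grep "-E" 'a{'  'a{')"
printf '%-8s %-14s %s\n' "grep -E" 'a}'  "$(probe_grep "-E" 'a}'  'a}')"

# Case (2): a three-number interval, matched against the literal text.
printf '%-8s %-14s %s\n' "grep"    'a\{1,2,3\}b' \
    "$(probe_grep ""   'a\{1,2,3\}b' 'a{1,2,3}b')"
printf '%-8s %-14s %s\n' "grep -E" 'a{1,2,3}b' \
    "$(probe_grep "-E" 'a{1,2,3}b'   'a{1,2,3}b')"
```

Running the same script against GNU grep, BusyBox grep, and FreeBSD grep side by side makes the divergences described above easy to compare.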

(3) The pattern among other sed implementations is:
     without -r: reject unmatched \{ as error, accept unmatched \} as literal
                 reject \{\}, \{2,1\}, and \{1,2,3\}
        with -r: reject unmatched { as error, accept unmatched } as literal
                 reject {}, {2,1}, and {1,2,3}

However, FreeBSD sed without -r diverges from this pattern by rejecting an unmatched \} as an error.

(4) Also, FreeBSD sed with -r diverges from the pattern by accepting {} and treating it as those two literal characters.
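
The sed cases in (3) and (4) can be checked in a similar way. This is a minimal sketch, assuming a POSIX sh and a sed that accepts -r. The helper name probe_sed is hypothetical. It substitutes X for the first match and reports "error" when sed exits non-zero. The patterns used here contain no '/', so they can be dropped into an s/// command unchanged.

```sh
#!/bin/sh
# probe_sed (hypothetical helper): report how sed treats a pattern.
# Usage: probe_sed FLAG PATTERN TEXT    (FLAG is "" or "-r")
# Prints "error" if sed fails, "no match" if the text comes back unchanged,
# and "match -> RESULT" otherwise.
probe_sed() {
    flag=$1 pattern=$2 text=$3
    # The status of the command substitution is the status of sed,
    # the last command in the pipeline.
    out=$(printf '%s\n' "$text" | sed ${flag:+"$flag"} -e "s/$pattern/X/" 2>/dev/null)
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "error"
    elif [ "$out" = "$text" ]; then
        echo "no match"
    else
        echo "match -> $out"
    fi
}

# Without -r: unmatched \{ and \}, then the invalid intervals.
printf '%-7s %-14s %s\n' "sed"    'a\{'        "$(probe_sed ""   'a\{'        'a{')"
printf '%-7s %-14s %s\n' "sed"    'a\}'        "$(probe_sed ""   'a\}'        'a}')"
printf '%-7s %-14s %s\n' "sed"    'a\{\}'      "$(probe_sed ""   'a\{\}'      'a{}')"
printf '%-7s %-14s %s\n' "sed"    'a\{2,1\}'   "$(probe_sed ""   'a\{2,1\}'   'a{2,1}')"
printf '%-7s %-14s %s\n' "sed"    'a\{1,2,3\}' "$(probe_sed ""   'a\{1,2,3\}' 'a{1,2,3}')"

# With -r: the same constructs in extended syntax.
printf '%-7s %-14s %s\n' "sed -r" 'a{'         "$(probe_sed "-r" 'a{'         'a{')"
printf '%-7s %-14s %s\n' "sed -r" 'a}'         "$(probe_sed "-r" 'a}'         'a}')"
printf '%-7s %-14s %s\n' "sed -r" 'a{}'        "$(probe_sed "-r" 'a{}'        'a{}')"
printf '%-7s %-14s %s\n' "sed -r" 'a{2,1}'     "$(probe_sed "-r" 'a{2,1}'     'a{2,1}')"
printf '%-7s %-14s %s\n' "sed -r" 'a{1,2,3}'   "$(probe_sed "-r" 'a{1,2,3}'   'a{1,2,3}')"
```

As with the grep sketch, no expected output is given, since the results are exactly what differs between implementations.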

>How-To-Repeat:
See above.
>Fix:


>Release-Note:
>Audit-Trail:
>Unformatted:


