From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 166861] bsdgrep(1)/sed(1): bsdgrep -E and sed handle invalid {} constructs strangely
Date: Fri, 07 Apr 2017 04:05:01 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=166861

Kyle Evans changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bsdports@kyle-evans.net

--- Comment #2 from Kyle Evans ---
(In reply to dubiousjim from comment #1)

Summary of work needed at the bottom, feel free to skip ahead and only look
back for intermediate results/notes.

Some relevant notes:

As of GNU grep 2.27, GNU SED 4.3 on Debian, and BSD grep @ r316566-ish:

(1) and (2) behavior between the two seems to match

(3)
FreeBSD:
$ echo "a{1,2,3}b" | sed -r "s/{/_/"
a_1,2,3}b
$ echo "a{1,2,3}b" | sed -r "s/}/_/"
a{1,2,3_b

Debian:
$ echo "a{1,2,3}b" | sed -r "s/{/_/"
# Error, invalid preceding expression
# Whoops
$ echo "a{1,2,3}b" | sed -r "s/a{/_/"
# Error, unmatched \{
$ echo "a{1,2,3}b" | sed -r "s/}/_/"
a{1,2,3_b

We do have a test case for this at lib/libc/regex/grot/tests:205 where { is
explicitly meant to be a literal match in both BREs and EREs. We have no case
expressing } being a literal match.

FreeBSD:
$ echo "a{1,2,3}b" | sed "s/\}/_/"
# Error, parentheses not balanced

Debian:
$ echo "a{1,2,3}b" | sed "s/\}/_/"
a{1,2,3_b
# Ah, also prefer GNU behavior

This one, it's worth noting, has no test either. It does have the obvious test
for the other side, \{ alone, but no \}.

(4)
FreeBSD:
$ echo "a{1,2,3}b" | sed -r "s/{}/_/"
a{1,2,3}b

Debian:
$ echo "a{1,2,3}b" | sed -r "s/{}/_/"
# Error, invalid preceding expression
# Whoops
$ echo "a{1,2,3}b" | sed -r "s/a{}/_/"
# Error, invalid content
# Reasonable

This one is ... technically correct behavior. Technically, according to
re_format(7), the following "}" is *not* a digit, and therefore this is not a
bounds statement. I think this is really not correct, though. Letting {} take a
literal interpretation leaves us too much room for error getting in if a digit
was expected by the pattern-creator, and I would prefer the GNU approach on
this matter.
We'll probably want to update re_format(7) to be more explicit in this matter,
as well as add a corresponding test case.

(5)
FreeBSD:
$ echo "a{1,2,3}b" | sed -r "s/)/_/"
a{1,2,3}b
$ echo "a{1,2,3}b" | sed "s/\)/_/"
# Error, parentheses not balanced

This is clearly covered in tests:54 (silenced, though) and with slight anger
expressed in the context around it. I lean towards taking the GNU/sane approach
on this one and making this work as one probably expects nowadays.

=====
Summary of work needed

(3)
Problem: { in ERE uses literal interpretation
Needed: { throw error
Needed: Fix test case at tests:205 to separate out BRE and ERE cases and adjust
ERE case to meet expectations

Problem: \} in BRE throws an error
Needed: \} match literal

(4)
Problem: {} in ERE uses literal interpretation
Needed: {} throw error
Needed: Consider re_format(7) update to explicitly note {} as illegal
Needed: Test case

(5)
Problem: ) in ERE should throw error
Needed: ) throw error
Needed: Adjust test cases (tests:54)

I think that sums it up -- I'll take a look at these things in the next week or
so.
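
For anyone wanting to re-run the probes from (3) through (5) against whatever
sed is installed locally, something like the following might do as a rough
sketch. The probe helper and its tab-separated output are made up for
illustration and are not part of the bug report; it also assumes the local sed
accepts -E as a spelling of the -r used above, and that no pattern contains a
"/". The results will differ between implementations, which is the point.

```sh
#!/bin/sh
# Sketch only: "probe" and its output format are hypothetical helpers,
# not something from the bug report or from sed itself.

input='a{1,2,3}b'

# probe MODE PATTERN
#   MODE is ERE (run sed with -E) or BRE (no flag).
#   Prints MODE, PATTERN and either the substituted line or ERROR,
#   separated by tabs.
probe() {
    probe_mode=$1
    probe_pat=$2
    case $probe_mode in
        ERE) probe_flag=-E ;;
        *)   probe_flag= ;;
    esac
    # $probe_flag is deliberately unquoted so an empty value adds no argument.
    # The pipeline's exit status is sed's, so a failure here means sed
    # rejected the pattern.
    if probe_out=$(printf '%s\n' "$input" |
        sed $probe_flag "s/$probe_pat/_/" 2>/dev/null); then
        printf '%s\t%s\t%s\n' "$probe_mode" "$probe_pat" "$probe_out"
    else
        printf '%s\t%s\tERROR\n' "$probe_mode" "$probe_pat"
    fi
}

# ERE cases from (3), (4) and (5)
for pat in '{' 'a{' '}' '{}' 'a{}' ')'; do
    probe ERE "$pat"
done

# BRE cases from (3) and (5)
for pat in '\}' '\)'; do
    probe BRE "$pat"
done
```

Diffing the output from a FreeBSD box against a Debian one should show the same
disagreements listed above, and could serve as a quick check once the regex
changes land.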