From owner-freebsd-bugs Wed Sep 6 16:02:57 1995
Return-Path: bugs-owner
Received: (from majordom@localhost) by freefall.freebsd.org (8.6.11/8.6.6) id QAA26282 for bugs-outgoing; Wed, 6 Sep 1995 16:02:57 -0700
Received: from wcarchive.cdrom.com (wcarchive.cdrom.com [192.216.191.11]) by freefall.freebsd.org (8.6.11/8.6.6) with ESMTP id QAA26276 for ; Wed, 6 Sep 1995 16:02:56 -0700
Received: from narq.avian.org ([199.103.168.126]) by wcarchive.cdrom.com (8.6.11/8.6.9) with ESMTP id QAA15708 for ; Wed, 6 Sep 1995 16:04:56 -0700
Received: (from hobbit@localhost) by narq.avian.org (8.6.12/_H*) id RAA08173; Wed, 6 Sep 1995 17:55:24 -0400
Date: Wed, 6 Sep 1995 17:55:24 -0400
From: *Hobbit*
Message-Id: <199509062155.RAA08173@narq.avian.org>
To: freebsd-bugs@wcarchive.cdrom.com
Subject: sed bug?
Sender: bugs-owner@FreeBSD.org
Precedence: bulk

I have stumbled across an inconsistency in FreeBSD 'sed'. On every other platform I have access to, the regex `/[/|]/' to mean "slash or pipe" is valid, but the FreeBSD version errors out on this. Trying to quote the embedded slash, a la `/[\/|]/', avoids the error, but that breaks on other systems that match on `\` as well. One or the other is clearly wrong.

Believe it or not, I have scripts that use regexes of this sort [such as things to parse sendmail logs and look for potentially evil activity!!], so having to modify the regex for one platform is annoying.

Here's a script of various tests:

% cat sed.test
line with \backslash -- should print
line with |pipe -- should be deleted
line with /slash -- should be deleted
%
% ./sed.freebsd '/[/|]/d' sed.test
sed: 1: "/[/|]/d": RE error: brackets ([ ]) not balanced
%
% ./sed.freebsd '/[\/|]/d' sed.test
line with \backslash -- should print
%
% ./sed.bsdi '/[/|]/d' sed.test
line with \backslash -- should print
%
% ./sed.bsdi '/[\/|]/d' sed.test

You can see that 'sed.bsdi' is exhibiting what I believe is correct behavior, while 'sed.freebsd' is not.

I am not exactly sure what the Official Defined Way to do this is, and I understand that some regex standards may have changed a while back. But I've always assumed that anything inside [], except ^ and - as first character, is taken literally. There is nothing to the contrary in the manpages for regex or sed.

The offending code appears to be in usr.bin/sed/compile.c, in routine compile_delimited. The BSDI version has an "inbra" variable that is presumably used to check and correct for exactly this situation. This addition apparently vanished out of the FreeBSD code base.

If I'm right about this, can it be put on the fixlist for the next release? Working code can be found in the bsdi sources; I can send along a diff if need be.

_H*
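
For illustration only, here is a minimal sketch of the kind of bracket tracking described above. It is not the BSDI or FreeBSD source: the function name find_closing_delim and its interface are hypothetical, and only the variable name inbra echoes the one mentioned in the report. It assumes that a backslash escapes the next character outside brackets, and that everything inside [] is literal until the closing ].

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, NOT taken from usr.bin/sed/compile.c.
 * "s" points just past the opening delimiter of a sed address or
 * pattern.  Returns a pointer to the closing delimiter, or NULL if
 * the expression is unterminated.
 *
 * Simplification: [:class:], [=equiv=] and [.coll.] elements are not
 * recognized, so a ']' inside one of them would end the bracket early.
 */
const char *
find_closing_delim(const char *s, char delim)
{
	int inbra = 0;			/* nonzero while inside [...] */
	const char *first = NULL;	/* first member of the bracket */

	assert(s != NULL);
	for (; *s != '\0'; s++) {
		if (inbra) {
			/* A ']' in first position is a literal member. */
			if (*s == ']' && s != first)
				inbra = 0;
			/* Anything else, delimiter included, is literal. */
		} else if (*s == '\\' && s[1] != '\0') {
			s++;		/* skip the escaped character */
		} else if (*s == '[') {
			inbra = 1;
			first = s + 1;
			if (*first == '^')	/* negation, not a member */
				first++;
		} else if (*s == delim) {
			return s;	/* unescaped, outside brackets */
		}
	}
	return NULL;
}
```

With this approach the text `[/|]/d` scanned with delimiter `/` ends at the slash after the closing bracket rather than at the one inside it, which is the behavior sed.bsdi shows in the tests above.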