From: "Zoltan Frombach"
Date: Sun, 14 Nov 2004 15:39:00 -0800
Subject: Either I do something wrong or there is a regexp bug in sed !!
List-Id: Discussions about the use of FreeBSD-current

I'm trying to use sed under FreeBSD 5.3-RELEASE in a new 'netqmail' port I am currently working on. I want to replace a run of digits (in plain English: a decimal number) at the beginning of a line in a text file. Here is how the original file looks before I do anything (the file is part of the netqmail-1.05 package, but that is unimportant):

--- file conf-split begins
23
This is the queue subdirectory split.
--- file conf-split ends

Okay, so I try to replace 23 (or whatever number is there!) at the beginning of the first line of this file with, let's say, 199 using sed. I would expect this to work:

    sed -e "s/^[0-9]+/199/" conf-split > conf-split.new

But it doesn't change anything in conf-split.new!! My regexp ^[0-9]+ doesn't match anything! After spending about an hour investigating this, I realized that the + after my bracket expression (I'm talking about the [0-9]+ part) does not match! If I drop the + and use * instead, I can make my regexp match. So this works, but IMHO it's ugly:

    sed -e "s/^[0-9][0-9]*/199/" conf-split > conf-split.new

It gives this output, which is what I wanted all along:

--- file conf-split.new begins
199
This is the queue subdirectory split.
--- file conf-split.new ends

According to the sed man page, the regexp syntax used by sed is documented in the re_format man page. And according to the re_format man page:

"A piece is an atom possibly followed by a single= `*', `+', `?', or bound. An atom followed by `*' matches a sequence of 0 or more matches of the atom. An atom followed by `+' matches a sequence of 1 or more matches of the atom. ..."

And the definition of an "atom" is (quoted from the same man page):

"An atom is a regular expression enclosed in `()' (matching a match for the regular expression), an empty set of `()' (matching the null string)=, a bracket expression (see below) ..."

So either my bracket expression ( [0-9] ) in my first sed command was not recognized as an atom, or it was recognized as an atom but the + that followed it was not interpreted properly... Can anyone please tell me why?

I believe this is a bug in sed or in the regexp library that sed uses. If it is a regexp library issue, then there is a chance that it affects other programs that use the library as well! At the very least it can break anything that relies on sed regexps, especially ports...
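The behaviour described above can be reproduced with a short script. This is only a sketch under stated assumptions: the two-line stand-in for conf-split and the temporary directory are illustrative, and the last command assumes a sed that accepts -E (extended regular expressions). re_format(7) documents two syntaxes; sed uses the obsolete ("basic") one unless told otherwise, and in basic REs `+` is an ordinary character, which would explain why the first command finds nothing to replace.

```shell
#!/bin/sh
# Illustrative stand-in for conf-split (assumed layout: number on line 1).
workdir=$(mktemp -d)
printf '23\nThis is the queue subdirectory split.\n' > "$workdir/conf-split"

# 1. Basic RE (sed's default): '+' is a literal plus sign, so "23" is untouched.
bre_plus=$(sed -e 's/^[0-9]+/199/' "$workdir/conf-split" | head -n 1)

# 2. Basic RE workaround from the post: one digit, then zero or more digits.
bre_star=$(sed -e 's/^[0-9][0-9]*/199/' "$workdir/conf-split" | head -n 1)

# 3. Basic RE with a bound: \{1,\} means "one or more" in basic syntax.
bre_bound=$(sed -e 's/^[0-9]\{1,\}/199/' "$workdir/conf-split" | head -n 1)

# 4. Extended RE: with -E (assumed to be supported), '+' means "one or more".
ere_plus=$(sed -E -e 's/^[0-9]+/199/' "$workdir/conf-split" | head -n 1)

printf 'basic    +      : %s\n' "$bre_plus"
printf 'basic    *      : %s\n' "$bre_star"
printf 'basic    \\{1,\\} : %s\n' "$bre_bound"
printf 'extended +      : %s\n' "$ere_plus"

rm -rf "$workdir"
```

Only the first line of each result is printed, since that is the line the substitution targets. If the -E form behaves as assumed, it keeps the original `^[0-9]+` expression unchanged, while the `\{1,\}` form stays within the basic syntax.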
My uname -a is:

    FreeBSD www.xxxxxxxx.com 5.3-RELEASE FreeBSD 5.3-RELEASE #0: Fri Nov 12 01:07:41 PST 2004 xxx@www.xxxxxxxx.com:/usr/obj/usr/src/sys/XXXXXXXX i386

Zoltan