Date: Mon, 11 Oct 2004 21:09:00 +0200
From: Jens Schweikhardt
To: "Kamal R. Prasad"
Cc: freebsd-standards@freebsd.org
Subject: Re: standards/54410 (awk command)
Message-ID: <20041011190900.GA1278@schweikhardt.net>
In-Reply-To: <416A6CA3.1000507@kprasad.org>

Kamal,

On Mon, Oct 11, 2004 at 04:51:07PM +0530, Kamal R. Prasad wrote:
# --------------------------------------------------
#
# *How-To-Repeat*
#
# echo e | /usr/bin/awk '/e{1}/'   # should print e, but prints nothing
#
# *Fix*
#
# It's probably a POLA violation to change the default RE style from
# BRE to ERE, but we should add a POSIX mode that uses BRE (e.g.
# gawk needs --posix to be compliant).
# -------------------------------------------------------------
#
# I can fix this - but that would change the traditional behaviour. Your
# idea of adding a --posix flag may not be appropriate because POSIX
# requires the specified behaviour as default behaviour.
# i.e. a posix compliant awk script would break because it expects the
# code to be fully portable across unix'es.

Let me know how it goes.

I just looked at our awk(1) man page, and it explicitly says that

    Regular expressions are as in egrep; see grep(1).

And in fact, patterns like /(a|b)/ do work as expected. It appears that
only the interval quantifiers {n}, {n,}, {,m} and {n,m} are not
implemented. This is where I noticed the POSIX deviation.

I want to ask a wider audience how to fix this, thus the cc to
standards@. Some of the options:

1. Do nothing and document the missing {} quantifiers in awk(1)'s BUGS
   section.
2. Use some environment variable (POSIXLY_CORRECT?) to decide whether {}
   is handled like a proper ERE interval, and remain bug-compatible with
   the old behavior if it is not set.
3. Add {} unconditionally at the risk of breaking awk scripts, and point
   users to awk(1), where it says this should always have been like
   this. Place a prominent note in UPGRADING. Hah, as if anyone reads
   that :-)
4. Your opinion here.

Regards,

	Jens
-- 
Jens Schweikhardt http://www.schweikhardt.net/
SIGSIG -- signature too long (core dumped)
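
To make option 2 concrete, here is a minimal sketch of the switch, in Python rather than awk's C source. It is purely illustrative: the function name awk_like_match is hypothetical, Python's re module is not a POSIX ERE engine (it only agrees with ERE on simple intervals like {n}), and modelling the old behavior as "braces are ordinary characters" is an assumption, not something verified against the awk sources.

```python
import os
import re


def awk_like_match(pattern, text, posix=None):
    """Return True if `pattern` matches somewhere in `text`.

    posix=True  : braces form interval expressions, as POSIX EREs require.
    posix=False : braces are taken literally (assumed model of the old
                  behavior; not checked against the real implementation).
    posix=None  : decide from the POSIXLY_CORRECT environment variable.
    """
    if posix is None:
        posix = "POSIXLY_CORRECT" in os.environ
    if not posix:
        # Escape the braces so the regex engine treats them as plain
        # characters. This simple sketch ignores already-escaped braces.
        pattern = pattern.replace("{", r"\{").replace("}", r"\}")
    return re.search(pattern, text) is not None


if __name__ == "__main__":
    print(awk_like_match("e{1}", "e", posix=True))      # interval reading
    print(awk_like_match("e{1}", "e", posix=False))     # literal-brace reading
    print(awk_like_match("e{1}", "e{1}", posix=False))  # literal-brace reading
```

Under these assumptions the interval reading accepts the input "e" from the How-To-Repeat, while the literal reading only accepts input that actually contains the text "e{1}". That is the class of script option 3 could break.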