From: joost@jodocus.org
Date: Tue, 27 Sep 2011 08:25:08 +0200
To: "grarpamp"
Cc: freebsd-questions@freebsd.org
Subject: Re: Regex Wizards

> Under the ERE implementation in RELENG_8, I'm having
> trouble figuring out how to group and backreference this.
>
> Given a line, where:
> If AAA is present, CCC will be too, and B may appear in between.
> If AAA is not present, neither CCC nor B will be present.
> DDDD is always present.
> Junk may be present.
> Match good lines and output in chunks.
>
> echo junkAAAABCCCDDDDjunk | \
>
> This works as expected:
> sed -E -n 's,^.*(AAAB?CCC)(DDDD).*$,1 \1 2 \2,p'
> 1 AAABCCC 2 DDDD
>
> But making the leading bits optional per spec does not work:
> sed -E -n 's,^.*(AAAB?CCC)?(DDDD).*$,1 \1 2 \2,p'
> 1 2 DDDD
>
> Nor does adding the usual grouping parens:
> sed -E -n 's,^.*((AAAB?CCC)?)(DDDD).*$,1 \1 2 \2,p'
> 1 2
>
> How do I group off the leading bits?
> Or is this a limitation of EREs?
> Or a bug?
>
> Thanks.

Regular expressions are greedy by default. The leading .* is matching
"junkAAAABCCC" in your second and third examples, so the optional group
has nothing left to match and comes back empty.
Try `sed -E -n 's,^(.*)(AAAB?CCC)?(DDDD).*$,1 \1 2 \2 3 \3,p'`
and you'll see what I mean.

In Perl I'd tell you to use .*? instead of .*, but I have no idea
what the POSIX equivalent is, if it exists.

Hope this helps.

Joost Bekkers
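
One possible way around the greedy .* without a non-greedy quantifier (POSIX ERE does not define one) is to try the stricter pattern first and fall back to the DDDD-only pattern when it fails. This is only a sketch under the assumptions of the original question (the AAA...CCC chunk, when present, sits directly in front of DDDD); the function name split_chunks is made up for illustration.

```shell
#!/bin/sh
# split_chunks (hypothetical helper): read lines on stdin and print
# "1 <chunk> 2 DDDD" for every line that contains DDDD.
split_chunks() {
    # 1st expression: the AAA...CCC chunk is mandatory here, so the
    #                 greedy .* has to leave it for group 1.
    # t:              if that substitution succeeded, skip the rest.
    # 3rd expression: fallback for lines without the chunk; the
    #                 first field is left empty.
    sed -E -n \
        -e 's,^.*(AAAB?CCC)(DDDD).*$,1 \1 2 \2,p' \
        -e 't' \
        -e 's,^.*(DDDD).*$,1  2 \1,p'
}

echo junkAAAABCCCDDDDjunk | split_chunks   # prints: 1 AAABCCC 2 DDDD
echo junkDDDDjunk | split_chunks           # prints: 1  2 DDDD
```

Lines without DDDD print nothing, because neither substitution matches and -n suppresses the default output.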