Date: Tue, 27 Sep 2011 15:36:24 +0930
From: Wayne Sierke <ws@au.dyndns.ws>
To: grarpamp <grarpamp@gmail.com>
Cc: freebsd-questions@freebsd.org
Subject: Re: Regex Wizards
Message-ID: <1317103584.2326.48.camel@predator-ii.buffyverse>
In-Reply-To: <CAD2Ti29Uvz6tBp60SYnD-5bJ8Jf=ThbVG5UUU21NWmmqOrO5SA@mail.gmail.com>
References: <CAD2Ti29Uvz6tBp60SYnD-5bJ8Jf=ThbVG5UUU21NWmmqOrO5SA@mail.gmail.com>

On Mon, 2011-09-26 at 22:02 -0400, grarpamp wrote:
> Under the ERE implementation in RELENG_8, I'm having
> trouble figuring out how to group and backreference this.
>
> Given a line, where:
> If AAA is present, CCC will be too, and B may appear in between.
> If AAA is not present, neither CCC nor B will be present.
> DDDD is always present.
> Junk may be present.
> Match good lines and output in chunks.
>
> echo junkAAAABCCCDDDDjunk | \
>
> This works as expected:
> sed -E -n 's,^.*(AAAB?CCC)(DDDD).*$,1 \1 2 \2,p'
> 1 AAABCCC 2 DDDD
>
> But making the leading bits optional per spec does not work:
> sed -E -n 's,^.*(AAAB?CCC)?(DDDD).*$,1 \1 2 \2,p'
> 1 2 DDDD
>
> Nor does adding the usual grouping parens:
> sed -E -n 's,^.*((AAAB?CCC)?)(DDDD).*$,1 \1 2 \2,p'
> 1 2
>
> How do I group off the leading bits?
> Or is this a limitation of ERE's?
> Or a bug?
> Thanks.

I believe that the problem is the greediness of the leading '.*'.
Once the first group is optional, the '.*' consumes the text that
the group would otherwise have matched, so the group matches nothing.

This seems to work:

sed -E -n -e '/AAAB?CCC/!s,.*(DDDD).*,1 \1,p' -e 's,.*(AAAB?CCC)(DDDD).*,1 \1 2 \2,p'

%echo junkAABCCCDDDDjunk | sed ...
1 DDDD
%echo junkAAAABCCCDDDDjunk | sed ...
1 AAABCCC 2 DDDD
%echo junkAAAACCCDDDDjunk | sed ...
1 AAACCC 2 DDDD
%echo junkAAAABCCDDDDjunk | sed ...
1 DDDD

Wayne
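
To rerun the four checks above in one go, the two-expression command can be wrapped in a small shell function. This is only a sketch: the function name chunks is made up for illustration and is not from the thread, and it assumes a sed that accepts -E, as the FreeBSD one used here does.

```shell
#!/bin/sh
# chunks: hypothetical wrapper name, not part of the original message.
# Expression 1: on lines WITHOUT an AAAB?CCC run, print only the DDDD chunk.
# Expression 2: on lines WITH an AAAB?CCC run directly before DDDD, print both.
# After expression 1 rewrites a line to "1 DDDD", expression 2 can no longer
# match it, so each input line is printed at most once.
chunks() {
    sed -E -n \
        -e '/AAAB?CCC/!s,.*(DDDD).*,1 \1,p' \
        -e 's,.*(AAAB?CCC)(DDDD).*,1 \1 2 \2,p'
}

# The same four sample inputs used in the message above.
for line in junkAABCCCDDDDjunk junkAAAABCCCDDDDjunk \
            junkAAAACCCDDDDjunk junkAAAABCCDDDDjunk; do
    printf '%s\n' "$line" | chunks
done
```

Splitting the work across two expressions sidesteps the optional group entirely, so the result does not depend on how a given regex engine divides text between the leading '.*' and a group that is allowed to match nothing.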