Date: Thu, 2 Aug 2012 08:04:47 -0600 (MDT)
From: Warren Block <wblock@wonkity.com>
To: RW <rwmaillists@googlemail.com>
Cc: freebsd-questions@freebsd.org
Subject: Re: buggy awk regex handling?
Message-ID: <alpine.BSF.2.00.1208020759350.80875@wonkity.com>
In-Reply-To: <20120802141738.62ef1e45@gumby.homeunix.com>
References: <743721353.9443.1343906452119.JavaMail.sas1@172.29.249.242> <20120802141738.62ef1e45@gumby.homeunix.com>

On Thu, 2 Aug 2012, RW wrote:

> On Thu, 02 Aug 2012 13:20:52 +0200
> kaltheat wrote:
>
>> I tried to replace three letters with three letters by awk using the
>> sub-routine. I assumed that my regular expression does mean the
>> following:
>>
>> match if three letters of any letter of alphabet occurs anywhere in
>> input
>>
>> $ echo AbC | awk '{sub(/[[:alpha:]]{3}/,"cBa"); print;}'
>> AbC
>>
>> As you can see the result was unexpected.
>> When I try doing it for at least one letter, it works:
>>
>> $ echo AbC | awk '{sub(/[[:alpha:]]+/,"cBa"); print;}'
>> cBa
>> ...
>> What am I doing wrong?
>> Or is awk buggy?
>
> Traditional awk implementations don't support {n}, but I think POSIX
> implementations should.

Using gawk instead of awk agrees with that.  Printing the result of the
sub (the number of substitutions performed) makes it a little more
clear:

% echo AbC | awk '{print sub(/[[:alpha:]]{3}/,"cBa"); print;}'
0
AbC

% echo AbC | gawk '{print sub(/[[:alpha:]]{3}/,"cBa"); print;}'
1
cBa

sed can handle it:

% echo AbC | sed -E 's/[[:alpha:]]{3}/cBa/'
cBa
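
If switching to gawk or sed is not an option, one possible workaround is to
avoid the {3} interval and write the bracket expression out three times.
This is only a sketch, assuming an awk that lacks {n} support but does
understand [[:alpha:]] (as the working [[:alpha:]]+ example above
suggests).  The function name replace3 is made up for illustration:

```shell
# Hypothetical helper: replace the first run of three letters with "cBa"
# without using the {n} interval syntax.
# Assumption: the awk in use supports POSIX character classes.
replace3() {
    # Repeating the bracket expression three times is equivalent to {3}.
    # sub() returns the number of substitutions made (0 or 1).
    awk '{ n = sub(/[[:alpha:]][[:alpha:]][[:alpha:]]/, "cBa"); print n; print }'
}

echo AbC | replace3
```

As with the gawk run above, this should print 1 followed by cBa.  The
repeated form gets unwieldy for large counts, but it behaves the same way
in awk and gawk.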