Date: Tue, 30 Dec 2008 14:07:12 -0800
From: Gary Kline <kline@thought.org>
To: Giorgos Keramidas <keramida@ceid.upatras.gr>
Cc: Roland Smith <rsmith@xs4all.nl>, FreeBSD Mailing List <freebsd-questions@freebsd.org>
Subject: Re: well, blew it... sed or perl q again.
Message-ID: <20081230220711.GA47453@thought.org>
In-Reply-To: <87abad4bk6.fsf@kobe.laptop>
References: <20081230193111.GA32641@thought.org> <20081230201623.GB65659@slackbox.xs4all.nl> <20081230205131.GA34211@thought.org> <87abad4bk6.fsf@kobe.laptop>
On Tue, Dec 30, 2008 at 11:07:05PM +0200, Giorgos Keramidas wrote:
> On Tue, 30 Dec 2008 12:51:31 -0800, Gary Kline <kline@thought.org> wrote:
> > All right, then is this the right syntax.  In other words, do
> > I need the double quotes to match the "http:" string?
> >
> >   perl -pi.bak -e 'print unless "/m/http:/" || eof; close ARGV if eof' *
>
> Close, but not exactly right...
>
> You have to keep in mind that the argument to -e is a Perl expression,
> i.e. something you might type as part of a script that looks like this:
>
>     #!/usr/bin/perl
>
>     while (<STDIN>) {
>         YOUR-EXPRESSION-HERE;
>         print $_;
>     }
>
> One of the ways to print only the lines that do *not* match the
> "http://" pattern is:
>
>     print unless (m/http:\/\//);
>
> Note how the '/' characters that are part of the m/.../ expression need
> extra backslashes to quote them.  You can avoid this by using another
> character for the m/.../ expression delimiter, like:

	i've used '%' rather than bangs because i wasn't sure if the
	bang might make the shell have a fit; great to know it won't:_)
	[i try to avoid escapes when i can...]

>
>     print unless (m!http://!);
>
> But you are still not done.  The while loop above already contains a
> print statement _outside_ of your expression.  So if you add this to a
> perl -p -e '...' invocation you are asking Perl to run this code:
>
>     #!/usr/bin/perl
>
>     while (<STDIN>) {
>         print unless (m!http://!);
>         print $_;
>     }
>
> Each line of input will be printed _anyway_, but you will be duplicating
> all the non-http lines.  Use -n instead of -p to fix that:
>
>     perl -n -e 'print unless (m!http://!)'
>

	ahhhm, that's what happened last night.  i wound up with dup
	lines (2) pointing me to different links.  had no clue.
	fortunately i had the .bak!

> A tiny detail that may be useful is that "http://" is not required to be
> lowercase in URIs.  It may be worth adding the 'i' modifier after the
> second '!' of the URI matching expression:
>
>     perl -n -e 'print unless (m!http://!i)'
>
> Once you have that sort-of-working, it may be worth investigating more
> elaborate URI matching regexps, because this will match far too much
> (including, for instance, all the non-URI lines of this email that
> contain the regexp example itself).

	i shall try a

	    grep -ri http * >/usr/tmp/g.out

	and see what turns up.

	thanks much,

	gary

>

-- 
 Gary Kline  kline@thought.org  http://www.thought.org  Public Service Unix
        http://jottings.thought.org   http://transfinite.thought.org
    The 2.17a release of Jottings: http://jottings.thought.org/index.php