Date:      Tue, 30 Dec 2008 14:07:12 -0800
From:      Gary Kline <kline@thought.org>
To:        Giorgos Keramidas <keramida@ceid.upatras.gr>
Cc:        Roland Smith <rsmith@xs4all.nl>, FreeBSD Mailing List <freebsd-questions@freebsd.org>
Subject:   Re: well, blew it... sed or perl q again.
Message-ID:  <20081230220711.GA47453@thought.org>
In-Reply-To: <87abad4bk6.fsf@kobe.laptop>
References:  <20081230193111.GA32641@thought.org> <20081230201623.GB65659@slackbox.xs4all.nl> <20081230205131.GA34211@thought.org> <87abad4bk6.fsf@kobe.laptop>

On Tue, Dec 30, 2008 at 11:07:05PM +0200, Giorgos Keramidas wrote:
> On Tue, 30 Dec 2008 12:51:31 -0800, Gary Kline <kline@thought.org> wrote:
> > 	All right, then is this the right syntax.  In other words, do
> > 	I need the double quotes to match the "http:" string?
> >
> >   perl -pi.bak -e 'print unless "/m/http:/" || eof; close ARGV if eof' *
> 
> Close, but not exactly right...
> 
> You have to keep in mind that the argument to -e is a Perl expression,
> i.e. something you might type as part of a script that looks like this:
> 
>     #!/usr/bin/perl
> 
>     while (<STDIN>) {
>         YOUR-EXPRESSION-HERE;
>         print $_;
>     }
> 
> One of the ways to print only the lines that do *not* match the
> "http://" pattern is:
> 
>     print unless (m/http:\/\//);
> 
> Note how the '/' characters that are part of the m/.../ expression need
> extra backslashes to quote them.  You can avoid this by using another
> character for the m/.../ expression delimiter, like:


	i've used '%' rather than bangs because i wasn't sure if the
	bang might make the shell have a fit; great to know it
	won't :_)   [i try to avoid escapes when i can...]


> 
>     print unless (m!http://!);
> 
> But you are still not done.  The while loop above already contains a
> print statement _outside_ of your expression.  So if you add this to a
> perl -p -e '...' invocation you are asking Perl to run this code:
> 
>     #!/usr/bin/perl
> 
>     while (<STDIN>) {
>         print unless (m!http://!);
>         print $_;
>     }
> 
> Each line of input will be printed _anyway_, but you will be duplicating
> all the non-http lines.  Use -n instead of -p to fix that:
> 
>     perl -n -e 'print unless (m!http://!)'
> 


	ahhhm, that's what happened last night.  i wound up with dup
	lines (2) pointing me to different links.  had no clue.
	fortunately i had the .bak!


> A tiny detail that may be useful is that "http://" is not required to be
> lowercase in URIs.  It may be worth adding the 'i' modifier after the
> second '!' of the URI matching expression:
> 
>     perl -n -e 'print unless (m!http://!i)'
> 
> Once you have that sort-of-working, it may be worth investigating more
> elaborate URI matching regexps, because this will match far too much
> (including, for instance, all the non-URI lines of this email that
> contain the regexp example itself).

	i shall try a grep -ri http * >/usr/tmp/g.out
	and see what turns up.  thanks much,

	gary

> 

-- 
 Gary Kline  kline@thought.org  http://www.thought.org  Public Service Unix
        http://jottings.thought.org   http://transfinite.thought.org
    The 2.17a release of Jottings: http://jottings.thought.org/index.php



