From: Gary Kline
To: Giorgos Keramidas
Cc: Roland Smith, FreeBSD Mailing List
Date: Tue, 30 Dec 2008 14:07:12 -0800
Subject: Re: well, blew it... sed or perl q again.

On Tue, Dec 30, 2008 at 11:07:05PM +0200, Giorgos Keramidas wrote:
> On Tue, 30 Dec 2008 12:51:31 -0800, Gary Kline wrote:
> > All right, then is this the right syntax?  In other words, do
> > I need the double quotes to match the "http:" string?
> >
> >   perl -pi.bak -e 'print unless "/m/http:/" || eof; close ARGV if eof' *
>
> Close, but not exactly right...
>
> You have to keep in mind that the argument to -e is a Perl expression,
> i.e. something you might type as part of a script that looks like this:
>
>     #!/usr/bin/perl
>
>     while (<>) {
>         YOUR-EXPRESSION-HERE;
>         print $_;
>     }
>
> One of the ways to print only the lines that do *not* match the
> "http://" pattern is:
>
>     print unless (m/http:\/\//);
>
> Note how the '/' characters that are part of the m/.../ expression need
> extra backslashes to quote them.  You can avoid this by using another
> character for the m/.../ expression delimiter, like:

	i've used '%' rather than bangs because i wasn't sure if the
	bang might make the shell have a fit; great to know it
	won't:_)   [i try to avoid escapes when i can...]

>
>     print unless (m!http://!);
>
> But you are still not done.  The while loop above already contains a
> print statement _outside_ of your expression.  So if you add this to a
> perl -p -e '...' invocation you are asking Perl to run this code:
>
>     #!/usr/bin/perl
>
>     while (<>) {
>         print unless (m!http://!);
>         print $_;
>     }
>
> Each line of input will be printed _anyway_, but you will be duplicating
> all the non-http lines.  Use -n instead of -p to fix that:
>
>     perl -n -e 'print unless (m!http://!)'
>

	ahhhm, that's what happened last night.  i wound up with dup
	lines (2) pointing me to different links.  had no clue.
	fortunately i had the .bak!

> A tiny detail that may be useful is that "http://" is not required to be
> lowercase in URIs.  It may be worth adding the 'i' modifier after the
> second '!' of the URI matching expression:
>
>     perl -n -e 'print unless (m!http://!i)'
>
> Once you have that sort-of-working, it may be worth investigating more
> elaborate URI matching regexps, because this will match far too much
> (including, for instance, all the non-URI lines of this email that
> contain the regexp example itself).

	i shall try a

	    grep -ri http * >/usr/tmp/g.out

	and see what turns up.

	thanks much,

	gary

>

-- 
 Gary Kline  kline@thought.org  http://www.thought.org  Public Service Unix
        http://jottings.thought.org   http://transfinite.thought.org
    The 2.17a release of Jottings: http://jottings.thought.org/index.php