Date:      Tue, 30 Dec 2008 15:07:20 -0600
From:      David Kelly <dkelly@hiwaay.net>
To:        Gary Kline <kline@thought.org>
Cc:        Roland Smith <rsmith@xs4all.nl>, FreeBSD Mailing List <freebsd-questions@freebsd.org>
Subject:   Re: well, blew it... sed or perl q again.
Message-ID:  <20081230210720.GA65861@Grumpy.DynDNS.org>
In-Reply-To: <20081230205131.GA34211@thought.org>
References:  <20081230193111.GA32641@thought.org> <20081230201623.GB65659@slackbox.xs4all.nl> <20081230205131.GA34211@thought.org>

On Tue, Dec 30, 2008 at 12:51:31PM -0800, Gary Kline wrote:
> On Tue, Dec 30, 2008 at 09:16:23PM +0100, Roland Smith wrote:
> > On Tue, Dec 30, 2008 at 11:31:14AM -0800, Gary Kline wrote:
> > > 	The problem is that there are many, _many_ embedded
> > > 	<A HREF="http://whatever"> Site</A> in my hundreds, or
> > > 	thousands, of files.  I only want to delete the
> > > 	"http://<junkfoo.com>" lines, _not_ the other Href links.
> > > 
> > > 	Which would be best to use, given that a backup is critical?
> > > 	sed or perl?
> > 
> > IMHO, perl with the -i option to do in-place editing with backups. You
> > could also use the -p option to loop over files. See perlrun(1).
> > 
> > Roland
> 
> 
> 	All right, then is this the right syntax?  In other words, do
> 	I need the double quotes to match the "http:" string?
> 
>   perl -pi.bak -e 'print unless "/m/http:/" || eof; close ARGV if eof' *

In years past I used fetch(1) to download the day's page from a comic
strip site, awk to extract the URL of the day's comic strip, and fetch
again to put a copy of the comic strip in my archive. This application
sounds similar.
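
A rough sketch of that kind of fetch-plus-awk pipeline, for illustration only. The function name extract_comic_url, the example.com addresses, and the src="...gif" pattern are all assumptions, not the script originally used; a real page would need its own pattern.

```shell
#!/bin/sh
# Hypothetical helper: read an HTML page on stdin and print the first
# image URL that looks like src="http://....gif". The pattern is an
# assumption made for this example.
extract_comic_url() {
    awk '
        # match() sets RSTART and RLENGTH when the regex is found
        match($0, /src="http:\/\/[^"]*\.gif"/) {
            # skip the 5 characters of src=" and drop the closing quote
            print substr($0, RSTART + 5, RLENGTH - 6)
            exit
        }
    '
}

# Possible use with fetch(1) on FreeBSD (placeholder URL and paths):
#   url=$(fetch -q -o - http://comics.example.com/today.html | extract_comic_url)
#   [ -n "$url" ] && fetch -o "archive/$(date +%Y%m%d).gif" "$url"
```

The awk step only reads its input, so nothing on disk changes until the second fetch writes the archive copy.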

-- 
David Kelly N4HHE, dkelly@HiWAAY.net
========================================================================
Whom computers would destroy, they must first drive mad.


