From: Gary Kline
To: Roland Smith
Cc: FreeBSD Mailing List
Date: Tue, 30 Dec 2008 14:18:41 -0800
Subject: Re: well, blew it... sed or perl q again.
Message-ID: <20081230221841.GA46220@thought.org>
In-Reply-To: <20081230211642.GA67769@slackbox.xs4all.nl>

On Tue, Dec 30, 2008 at 10:16:42PM +0100, Roland Smith wrote:
> On Tue, Dec 30, 2008 at 12:51:31PM -0800, Gary Kline wrote:
> > On Tue, Dec 30, 2008 at 09:16:23PM +0100, Roland Smith wrote:
> > > On Tue, Dec 30, 2008 at 11:31:14AM -0800, Gary Kline wrote:
> > > > The problem is that there are many, _many_ embedded
> > > > "" lines, _not_ the other Href links.
> > > >
> > > > Which would be best to use, given that a backup is critical?
> > > > sed or perl?
> > >
> > > IMHO, perl with the -i option to do in-place editing with backups. You
> > > could also use the -p option to loop over files. See perlrun(1).
> > >
> > > Roland
> >
> >
> > All right, then is this the right syntax. In other words, do
> > I need the double quotes to match the "http:" string?
> >
> > perl -pi.bak -e 'print unless "/m/http:/" || eof; close ARGV if eof' *
>
> You don't need the quotes (if the command doesn't contain anything that
> your shell would eat/misuse/replace). See perlop(1).

        i have, thanks; getting more clues... .

>
> This will disregard the entire line with a URI in it. Is this really
> what you want?

        exactly; anything that has http://WHATEVER i do not want to copy.
        the slight gotcha is if the LIne // tag is on the following line.
        but in most cases the whole anchor, close-anchor of these junk
        lines is on one line. ...i know a closing tag does nothing; it's
        just sloppy markup.

>
> Copy some of the files you want to scrub to a separate directory, and
> run tests to see if your script works:
>
>   mkdir mytest; cp foo mytest/; cd mytest; perl -pi.bak ../scrub.pl foo
>   diff -u foo foo.bak

        thanks much to you and giorgos. i thought about doing this by
        hand, but only for about 0.01s!

        gary

>
> Roland
> --
> R.F.Smith                                   http://www.xs4all.nl/~rsmith/
> [plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
> pgp: 1A2B 477F 9970 BA3C 2914 B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)

--
 Gary Kline  kline@thought.org  http://www.thought.org  Public Service Unix
        http://jottings.thought.org   http://transfinite.thought.org
    The 2.17a release of Jottings: http://jottings.thought.org/index.php