From owner-freebsd-questions@FreeBSD.ORG Thu Jan  1 00:57:23 2009
Date: Wed, 31 Dec 2008 16:57:09 -0800
From: Gary Kline
To: Karl Vogel
Cc: freebsd-questions@freebsd.org
Message-ID: <20090101005709.GB875@thought.org>
References: <20081230193111.GA32641@thought.org> <20081231202014.C8012BE14@kev.msw.wpafb.af.mil>
In-Reply-To: <20081231202014.C8012BE14@kev.msw.wpafb.af.mil>
Subject: Re: well, blew it... sed or perl q again.
List-Id: User questions

On Wed, Dec 31, 2008 at 03:20:14PM -0500, Karl Vogel wrote:
> >> On Tue, 30 Dec 2008 11:31:14 -0800,
> >> Gary Kline said:
>
> G> The problem is that there are many, _many_ embedded "
> G> HREF="http://whatever> Site in my hundreds, or thousands, or
> G> files.  I only want to delete the "http://" lines, _not_
> G> the other Href links.
>
> Use perl.  You'll want the "i" option to do case-insensitive matching,
> plus "m" for matching that could span multiple lines; the first
> quoted line above shows one of several places where a URL can cross
> a line-break.
>
> You might want to leave the originals completely alone.  I never trust
> programs to modify files in place:
>
>   you% mkdir /tmp/work
>   you% find . -type f -print | xargs grep -li http://junkfoo.com > FILES
>   you% pax -rwdv -pe /tmp/work < FILES
         ^^^

        pax is like cpio, isn't it?

        anyway, yes, i'll ponder this.  i [mis]-spent hours undoing
        something bizarre that my scrub.c binary did to directories,
        turning foo and bar (and scores more) into foo and foo.bak,
        bar and bar.bak.  the .bak ones were the saved directories.
        the foo, bar were bizarre.  i couldn't write/cp/mv over them.
        had to carefully rm -f foo; mv foo.bak foo.... [et cetera]......
        then i scp'd my files to two other computers.  (*mumble*)

>
> Your perl script can just read FILES and overwrite the stuff in the new
> directory.  You'll want to slurp the entire file into memory so you catch
> any URL that spans multiple lines.  Try the script below, it works for
> input like this:
>
>   This
>
>   Site should go away too.
>
>   And so should
>   "http://junkfoo.com/"
>
>   Site this
>
>   And finally Site this
>
> --
> Karl Vogel                      I don't speak for the USAF or my company
>
> The average person falls asleep in seven minutes.
>                                   --item for a lull in conversation
>
> ---------------------------------------------------------------------------
> #!/usr/bin/perl -w
>
> use strict;
>
> my $URL = 'href=(.*?)"http://junkfoo.com/*"';
> my $contents;
> my $fh;
> my $infile;
> my $outfile;
>
> while (<>) {
>     chomp;
>     $infile = $_;
>
>     s{^./}{/tmp/};
>     $outfile = $_;
>
>     open ($fh, "< $infile") or die "$infile";
>     $contents = do { local $/; <$fh> };
>     close ($fh);
>
>     $contents =~ s{               # substitute ...
>                     $URL          # ... actual link
>                     (.*?)         # ... min # of chars including newline
>                                   # ... until we end
>                   }
>                   { }gixms;       # ... with a single space
>
>     open ($fh, "> $outfile") or die "$outfile";
>     print $fh $contents;
>     close ($fh);
> }
>
> exit(0);

--
 Gary Kline  kline@thought.org  http://www.thought.org  Public Service Unix
        http://jottings.thought.org   http://transfinite.thought.org
    The 2.17a release of Jottings: http://jottings.thought.org/index.php
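
On the pax question: pax -rw is pax's copy mode, which does roughly what cpio -p does, reading the list of names on stdin and copying them under the target directory. Where pax is not installed, the copy-before-edit step quoted above can be approximated in plain sh. This is only a sketch under assumptions: copy_matches and its two arguments are made-up names (nothing from Karl's post), file names are assumed to contain no newlines, and the scratch directory is assumed to live outside the tree being searched.

```shell
#!/bin/sh
# Sketch: copy every file under the current directory that mentions a
# given string into a scratch tree, keeping relative paths, so that the
# originals are never touched.  copy_matches is a hypothetical helper.
copy_matches() {
    pattern=$1      # fixed string to look for, e.g. http://junkfoo.com
    workdir=$2      # scratch directory, e.g. /tmp/work

    mkdir -p "$workdir" || return 1

    # grep -l prints only the names of matching files, -i ignores case,
    # -F treats the pattern as a literal string rather than a regex.
    find . -type f -exec grep -liF -- "$pattern" {} + |
    while IFS= read -r file; do
        # Recreate the file's directory under the scratch tree, then
        # copy it with permissions and timestamps preserved.
        mkdir -p "$workdir/$(dirname "$file")" &&
        cp -p "$file" "$workdir/$file"
    done
}
```

Run from the top of the tree, something like "copy_matches http://junkfoo.com /tmp/work" would leave the matching files under /tmp/work, where an editing script can then overwrite the copies instead of the originals.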