Date: Sun, 20 Jul 2008 10:10:11 +0200
From: Jonathan McKeown <jonathan+freebsd-questions@hst.org.za>
To: freebsd-questions@freebsd.org
Subject: Re: How to divide up?
Message-ID: <200807201010.11568.jonathan%2Bfreebsd-questions@hst.org.za>
In-Reply-To: <20080720063746.GB21826@thought.org>
References: <20080720002345.GA9173@thought.org> <87mykde2ho.fsf@kobe.laptop> <20080720063746.GB21826@thought.org>

On Sunday 20 July 2008 08:37, Gary Kline wrote:
> On Sun, Jul 20, 2008 at 05:03:15AM +0300, Giorgos Keramidas wrote:
> > On Sun, 20 Jul 2008 03:44:07 +0300, Giorgos Keramidas <keramida@ceid.upatras.gr> wrote:
> > > Now, if you want to merely "hack something quick and dirty", a short
> > > Perl script can probably do regexp substitution similar to
> > >
> > >   #
> > >   # WARNING: THIS HAS NOT BEEN TESTED :P
> > >   #
> > >   my $foo = <STDIN>;
> > >   $foo = s:(<[^>]+>[^<]*</[^>]+>):$1\n:ge;
> > >   print "$foo";
> > >
> > > but you shouldn't trust the output of such a quick hack too much.
> >
> > As I wrote in reply to the personal email, this was untested and a bit
> > wrong in places, but now I've tried something like:
> >
> >   $ echo '<hello>world</hello><hello>next world</hello>' | \
> >     perl -e '$foo = <STDIN>; $foo =~ s:(<[^>]+>[^<]*</[^>]+>):$1\n:g; print "$foo";'
> >
> > and it does seem to sort of work.  The output is:
> >
> >   <hello>world</hello>
> >   <hello>next world</hello>
> >
> > Maybe that's good enough?  They say `the perfect is the enemy of good
> > enough', so if this works for your data set, it's probably ok to use it
> > :-)
> >
> > Have fun,
> > Giorgos
>
> Fun?!  welll, but yes, anything that can save me from
> hand-editing ~~70 files will be a riot;)

I haven't tried it, but I suspect if the simple approach fails, HTML::Tidy
may well have an option which would help. It can be installed from CPAN or
ports, where it is textproc/p5-HTML-Tidy.

Jonathan