From: Jonathan McKeown
To: freebsd-questions@freebsd.org
Date: Sun, 20 Jul 2008 10:10:11 +0200
Subject: Re: How to divide up?

On Sunday 20 July 2008 08:37, Gary Kline wrote:
> On Sun, Jul 20, 2008 at 05:03:15AM +0300, Giorgos Keramidas wrote:
> > On Sun, 20 Jul 2008 03:44:07 +0300, Giorgos Keramidas wrote:
> > > Now, if you want to merely "hack something quick and dirty", a short
> > > Perl script can probably do regexp substitution similar to
> > >
> > >   #
> > >   # WARNING: THIS HAS NOT BEEN TESTED :P
> > >   #
> > >   my $foo = ;
> > >   $foo = s:(<[^>]+>[^<]*]+>):$1\n:ge;
> > >   print "$foo";
> > >
> > > but you shouldn't trust the output of such a quick hack too much.
> >
> > As I wrote in reply to the personal email, this was untested and a bit
> > wrong in places, but now I've tried something like:
> >
> >   $ echo 'worldnext world' | \
> >     perl -e '$foo = ; $foo =~ s:(<[^>]+>[^<]*]+>):$1\n:g; print "$foo";'
> >
> > and it does seem to sort of work.  The output is:
> >
> >   world
> >   next world
> >
> > Maybe that's good enough?  They say `the perfect is the enemy of good
> > enough', so if this works for your data set, it's probably ok to use it
> > :-)
> >
> > Have fun,
> > Giorgos
>
> Fun?! welll, but yes, anything that can save me from
> hand-editing ~~70 files will be a riot;)

I haven't tried it, but I suspect if the simple approach fails, HTML::Tidy
may well have an option which would help. It can be installed from CPAN or
ports, where it is textproc/p5-HTML-Tidy.

Jonathan