Date:      Sat, 12 May 2007 14:34:52 -0700
From:      Gary Kline <kline@tao.thought.org>
To:        Chuck Swiger <cswiger@mac.com>
Cc:        Gary Kline <kline@tao.thought.org>, FreeBSD Mailing List <freebsd-questions@FreeBSD.ORG>
Subject:   Re: what's the easiest way to de-html-ize files?
Message-ID:  <20070512213452.GA92514@thought.org>
In-Reply-To: <4604BD8D-A0D6-4895-AF93-92758632A992@mac.com>
References:  <20070512195437.GA92218@thought.org> <4604BD8D-A0D6-4895-AF93-92758632A992@mac.com>

On Mon, May 14, 2007 at 12:09:07PM -0700, Chuck Swiger wrote:
> On May 12, 2007, at 12:54 PM, Gary Kline wrote:
> >This is for those of us who appreciate ASCII or straight
> >	ISO_8859-15 rather than marked up files.  I have slapped together
> >	a crude C program that does scotch (or *cleanse*) text of
> >	<B></B> and so on.   Still... is there some standalone converter
> >	that gets rid of markup more elegantly?   Something where I
> >	can say
> >
> >	% cmd file_1.html ... file_N.html and output file_1.text ...
> >	file_N.text?
> 
> Perhaps:
> 
>   lynx -dump file1.html ... > file.text
> 
> ...?


	Hm, maybe I need Bill Campbell's -force_html switch.


	Yes, seems that way.  Using just -dump got most of them, but
	adding -force_html caught them all.  Need to script something to
	reformat, but the worst of it's done!
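
	For the one-output-per-input form asked about above (file_1.html
	to file_1.text, and so on), a minimal sketch of a wrapper follows.
	It assumes lynx is on the PATH; the helper name text_name and the
	.html/.htm suffix handling are illustrative choices, not anything
	from the thread.

```shell
#!/bin/sh
# Convert each HTML file named on the command line to plain text,
# writing file_N.text next to file_N.html.

# Hypothetical helper: strip a trailing .html or .htm, then append .text.
text_name() {
    base=${1%.html}
    base=${base%.htm}
    printf '%s.text\n' "$base"
}

for f in "$@"; do
    out=$(text_name "$f")
    # -dump writes the rendered page to stdout; -force_html makes lynx
    # treat the input as HTML even when it would not guess that itself.
    lynx -dump -force_html "$f" > "$out" || echo "failed: $f" >&2
done
```

	Saved as, say, dehtml.sh (a made-up name), it would be run as
	"sh dehtml.sh file_1.html ... file_N.html".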

	thanks, guys,

	gary


> 
> -- 
> -Chuck
> 

-- 
  Gary Kline  kline@thought.org   www.thought.org  Public Service Unix



