Date: Sat, 12 May 2007 14:34:52 -0700
From: Gary Kline <kline@tao.thought.org>
To: Chuck Swiger <cswiger@mac.com>
Cc: Gary Kline <kline@tao.thought.org>, FreeBSD Mailing List <freebsd-questions@FreeBSD.ORG>
Subject: Re: what's the easiest way to de-html-ize files?
Message-ID: <20070512213452.GA92514@thought.org>
In-Reply-To: <4604BD8D-A0D6-4895-AF93-92758632A992@mac.com>
References: <20070512195437.GA92218@thought.org> <4604BD8D-A0D6-4895-AF93-92758632A992@mac.com>

On Mon, May 14, 2007 at 12:09:07PM -0700, Chuck Swiger wrote:
> On May 12, 2007, at 12:54 PM, Gary Kline wrote:
> > This is for those of us who appreciate ASCII or straight
> > ISO_8859-15 rather than marked up files.  I have slapped together
> > a crude C program that does scotch (or *cleanse*) text of
> > <B></B> and so on.  Still... is there some standalone converter
> > that gets rid of markup more elegantly?  Something where I
> > can say
> >
> > % cmd file_1.html ... file_N.html and output file_1.text ...
> > file_N.text?
>
> Perhaps:
>
> lynx -dump file1.html ... > file.text
>
> ...?

Hm, maybe I need Bill Campbell's -force_html switch.  Yes, seems
that way.  Using just -dump got most of them, but adding
-force_html caught them all.  I still need to script something to
reformat the output, but the worst of it's done!

thanks, guys,

gary

>
> --
> -Chuck
>

--
Gary Kline  kline@thought.org  www.thought.org  Public Service Unix
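
For the "% cmd file_1.html ... file_N.html" form asked about above, a small wrapper around the lynx invocation from this thread might look like the sketch below. The function name dehtml and the .html to .text naming are assumptions made for illustration; only the -dump and -force_html switches come from the messages themselves.

```sh
#!/bin/sh
# Sketch only: "dehtml" is a hypothetical name, not an existing tool.
# Converts each HTML file given as an argument to plain text using lynx,
# writing file_1.html to file_1.text, and so on.
dehtml() {
    for f in "$@"; do
        # Strip a trailing .html (if present), then append .text.
        out="${f%.html}.text"
        # -dump writes the rendered page to stdout; -force_html makes lynx
        # treat the input as HTML even if it does not recognise it as such.
        lynx -dump -force_html "$f" > "$out" || echo "failed: $f" >&2
    done
}

# Usage: dehtml file_1.html ... file_N.html
```

By default lynx -dump appends a numbered list of the links it found; adding -nolist to the command should suppress that if it is not wanted.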
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20070512213452.GA92514>