From: Gary Kline
To: Chuck Swiger
Cc: Gary Kline, FreeBSD Mailing List
Date: Sat, 12 May 2007 14:34:52 -0700
Subject: Re: what's the easiest way to de-html-ize files?
Message-ID: <20070512213452.GA92514@thought.org>
In-Reply-To: <4604BD8D-A0D6-4895-AF93-92758632A992@mac.com>
References: <20070512195437.GA92218@thought.org> <4604BD8D-A0D6-4895-AF93-92758632A992@mac.com>
User-Agent: Mutt/1.4.2.2i
List-Id: User questions
X-List-Received-Date: Mon, 14 May 2007 20:43:07 -0000

On Mon, May 14, 2007 at 12:09:07PM -0700, Chuck Swiger wrote:
> On May 12, 2007, at 12:54 PM, Gary Kline wrote:
> > This is for those of us who appreciate ASCII or straight
> > ISO_8859-15 rather than marked up files. I have slapped together
> > a crude C program that does scotch (or *cleanse*) text of
> > and so on. Still... is there some standalone converter
> > that gets rid of markup more elegantly? Something where I
> > can say
> >
> > % cmd file_1.html ... file_N.html and output file_1.text ...
> > file_N.text?
>
> Perhaps:
>
> lynx -dump file1.html ... > file.text
>
> ...?

Hm, maybe I need Bill Campbell's -force_html switch. Yes, seems
that way. Using just -dump got most of them, but adding
-force_html caught all. Need to script something to reformat,
but the worst of it's done!

thanks, guys,

gary

>
> --
> -Chuck
>

--
Gary Kline kline@thought.org www.thought.org Public Service Unix
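
The batch form asked about in this thread (file_1.html ... file_N.html in, file_1.text ... file_N.text out) could probably be handled with a small sh wrapper around the lynx invocation mentioned above. What follows is only a sketch: the function names text_name and dehtml are made up for illustration, the handling of .htm and of names without an extension is an assumption, and it presumes lynx is installed and on the PATH.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative and not from the thread.

# Map an input file name to its output name:
# strip a trailing .html or .htm (if any), then append .text.
text_name() {
    case "$1" in
        *.html) printf '%s.text\n' "${1%.html}" ;;
        *.htm)  printf '%s.text\n' "${1%.htm}" ;;
        *)      printf '%s.text\n' "$1" ;;
    esac
}

# Convert every file given as an argument, one output file per input.
# -dump writes the rendered text to stdout; -force_html makes lynx
# treat the input as HTML even when the name or content looks otherwise.
dehtml() {
    for f in "$@"; do
        lynx -dump -force_html "$f" > "$(text_name "$f")"
    done
}
```

After sourcing the file, something like `dehtml file_1.html file_2.html` should leave file_1.text and file_2.text in the same directory. Any further reformatting of the dumped text would still need its own step.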