From: Chuck Swiger
Date: Mon, 14 May 2007 12:09:07 -0700
To: Gary Kline
Cc: FreeBSD Mailing List
Subject: Re: what's the easiest way to de-html-ize files?

On May 12, 2007, at 12:54 PM, Gary Kline wrote:
> This is for those of us who appreciate ASCII or straight
> ISO_8859-15 rather than marked up files. I have slapped together
> a crude C program that does scotch (or *cleanse*) text of
> and so on. Still... is there some standalone converter
> that gets rids of markup more elegantly? Something where i
> can say
>
> % cmd file_1.html ... file_N.html and output file_1.text ...
> file_N.text?

Perhaps:

    lynx -dump file1.html ... > file.text

...?

-- 
-Chuck
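
For the many-files case asked about in the quoted question, here is a minimal sketch that wraps the lynx -dump suggestion in a loop. It assumes lynx is installed and that the input names end in .html. The function name dehtml is made up for illustration and is not an existing command.

```sh
#!/bin/sh
# dehtml: hypothetical helper, not a standard command.
# For each HTML file given as an argument, render it to plain text
# with `lynx -dump` and write the result next to it, so that
# file_N.html becomes file_N.text.
dehtml() {
    for f in "$@"; do
        # Strip a trailing ".html" (if present) and append ".text".
        out="${f%.html}.text"
        # Stop at the first file that lynx fails to convert.
        lynx -dump "$f" > "$out" || return 1
    done
}
```

After sourcing this, something like "dehtml file_1.html file_2.html" should leave file_1.text and file_2.text in the same directory. The lynx manual also documents a -nolist option that drops the list of links from the dump, which may be closer to what is wanted for plain reading.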