From: cpghost
Date: Sun, 4 Nov 2007 02:39:14 +0100
To: freebsd-questions@freebsd.org
Cc: Gary Kline
Subject: Re: pdf edit again.

On Sat, 3 Nov 2007 16:38:55 -0800
Gary Kline wrote:

> A couple weeks ago I skimmed thru the postings on editing PDF
> files. Wasn't entirely clear what the answer it because I
> never thought I would need to edit a GUI file. I just found a book
> from 1883 in pdf format. I would like a text/ASCII/ISO_8859-1
> version. Tried pfdtotext, but it doesn't work. Nutshell: is
> there something I can use to edit/look-at this book and get
> rid of whateveriit is that's causing pdftotext to fail. (sorry for
> the grammar.... )

Old books in PDF are normally scanned bitmaps.
There are no characters in them, just pixels (embedded page images),
which is why pdftotext finds nothing to extract. If you want to
convert that to ASCII, you'd need to extract the page images (use
something like pdfimages from the xpdf port), turn them into a bitmap
format your OCR software accepts, and run the OCR software on each
page. It's a slow, unreliable, error-prone and painful process though.

Good luck!

-cpghost.

-- 
Cordula's Web. http://www.cordula.ws/
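
The extract-then-OCR steps described above could be scripted roughly as follows. This is only a sketch under assumptions: the message does not name an OCR program, so tesseract is used here purely as an example, and it is assumed that the installed tesseract can read the PBM/PGM/PPM files that pdfimages writes. The function names and the "page" image prefix are made up for illustration.

```python
import glob
import shutil
import subprocess
import sys


def pdfimages_cmd(pdf_path, image_root):
    """Command line that extracts the embedded images of a PDF.

    pdfimages writes files named <image_root>-NNN.pbm/.pgm/.ppm.
    """
    return ["pdfimages", pdf_path, image_root]


def ocr_cmd(image_path, output_base):
    """Command line that runs OCR on one image (assumed tool: tesseract).

    tesseract writes the recognized text to <output_base>.txt.
    """
    return ["tesseract", image_path, output_base]


def ocr_scanned_pdf(pdf_path, image_root="page"):
    """Extract all page images from pdf_path and OCR each of them.

    Returns the list of text files that should have been produced.
    """
    for tool in ("pdfimages", "tesseract"):
        if shutil.which(tool) is None:
            raise RuntimeError(tool + " not found in PATH")

    subprocess.run(pdfimages_cmd(pdf_path, image_root), check=True)

    text_files = []
    pattern = glob.escape(image_root) + "-*.p[bgp]m"
    for image in sorted(glob.glob(pattern)):
        base = image.rsplit(".", 1)[0]
        subprocess.run(ocr_cmd(image, base), check=True)
        text_files.append(base + ".txt")
    return text_files


if __name__ == "__main__" and len(sys.argv) == 2:
    for path in ocr_scanned_pdf(sys.argv[1]):
        print(path)
```

Saved under a name of your choice and run with the PDF as its only argument, it leaves one text file per extracted image in the current directory. Expect to proofread the result by hand; OCR on a book from 1883 will make plenty of mistakes.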