Date: Tue, 2 Dec 2008 09:42:45 +0700 (ICT)
From: Olivier Nicole
To: freebsd-questions@freebsd.org
Subject: Re: any way to turn a pdf file into something OCR-able?
In-reply-to: <18740.36349.523718.591189@jerusalem.litteratus.org> (message from Robert Huff on Mon, 1 Dec 2008 20:23:09 -0500)

> > 1) Some PDFs are just wrappers around JPEG images. In this case
> > there is no text for pdftotext to convert => epic fail.
>
> In this case "convert" from the ImageMagick port will get you a
> series of .jpg/.gif/.. Read the manual carefully before
> attempting; also note this can be a slow process.

pdfimages (from the graphics/xpdf port) can also do that, maybe at a
lower cost.

Bests,

Olivier
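
For anyone wanting to script the two approaches mentioned above, here is a rough sketch. It is not from the original message. The helper names, the file names and the 300 dpi density are only assumptions for illustration. It uses "pdfimages -j" to pull the embedded images out as they are, and falls back to "convert" (which re-renders every page, so it is slower) only when pdfimages is not installed.

```python
import shutil
import subprocess


def pdfimages_cmd(pdf_path, image_root):
    """Build a pdfimages command; -j writes embedded JPEG data as .jpg files."""
    return ["pdfimages", "-j", pdf_path, image_root]


def convert_cmd(pdf_path, out_pattern, density=300):
    """Build an ImageMagick command that renders each page at `density` dpi.

    The 300 dpi default is an assumed starting point for OCR, not a rule.
    """
    return ["convert", "-density", str(density), pdf_path, out_pattern]


def extract_images(pdf_path, image_root):
    """Prefer pdfimages, fall back to convert; return the command run or None."""
    if shutil.which("pdfimages"):
        cmd = pdfimages_cmd(pdf_path, image_root)
    elif shutil.which("convert"):
        # %03d makes ImageMagick number the output files, one per page.
        cmd = convert_cmd(pdf_path, image_root + "-%03d.png")
    else:
        return None  # neither tool is installed
    subprocess.run(cmd, check=True)
    return cmd
```

Calling extract_images("scan.pdf", "page") (both names hypothetical) would leave the image files in the current directory, ready to be fed to an OCR program.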