Date: Tue, 2 Dec 2008 09:42:45 +0700 (ICT)
From: Olivier Nicole <on@cs.ait.ac.th>
To: freebsd-questions@freebsd.org
Subject: Re: any way to turn a pdf file into something OCR-able?
Message-ID: <200812020242.mB22gjHS074260@banyan.cs.ait.ac.th>
In-Reply-To: <18740.36349.523718.591189@jerusalem.litteratus.org> (message from Robert Huff on Mon, 1 Dec 2008 20:23:09 -0500)
References: <20081201231440.GA30682@thought.org> <20081202010730.GA15970@slackbox.xs4all.nl> <18740.36349.523718.591189@jerusalem.litteratus.org>

> > 1) Some PDFs are just wrappers around JPEG images. In this case
> > there is no text for pdftotext to convert => epic fail.
>
> In this case "convert" from the ImageMagick port will get you a
> series of .jpg/.gif/.<whatever>. Read the manual carefully before
> attempting; also note this can be a slow process.

pdfimages (from the graphics/xpdf port) can also do that, possibly at a
lower cost.

Best,

Olivier
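
For anyone who wants to script the pdfimages route, here is a minimal sketch in Python. It assumes pdfimages is on the PATH and uses its -j option, which writes embedded JPEG data out as .jpg files instead of converting it. The function names, the file name scan.pdf and the output prefix are made up for illustration and do not come from the thread.

```python
import shutil
import subprocess
from pathlib import Path


def pdfimages_command(pdf_path, out_prefix, keep_jpeg=True):
    """Build the pdfimages argument list without running anything."""
    cmd = ["pdfimages"]
    if keep_jpeg:
        # -j: save JPEG-encoded images as .jpg rather than converting them
        cmd.append("-j")
    cmd += [str(pdf_path), str(out_prefix)]
    return cmd


def extract_images(pdf_path, out_prefix):
    """Run pdfimages and return the files it wrote (hypothetical helper)."""
    if shutil.which("pdfimages") is None:
        raise RuntimeError("pdfimages not found; install the graphics/xpdf port")
    prefix = Path(out_prefix)
    prefix.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(pdfimages_command(pdf_path, prefix), check=True)
    # pdfimages names its output <prefix>-<number>.<ext>
    return sorted(prefix.parent.glob(prefix.name + "-*"))


if __name__ == "__main__":
    # Example invocation with placeholder names
    for image in extract_images("scan.pdf", "out/page"):
        print(image)
```

The extracted images can then be handed to whatever OCR program you prefer. Images that are not stored as JPEG inside the PDF come out as .ppm or .pbm files, so a conversion step may still be needed for those.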