Date:      Tue, 2 Dec 2008 09:42:45 +0700 (ICT)
From:      Olivier Nicole <on@cs.ait.ac.th>
To:        freebsd-questions@freebsd.org
Subject:   Re: any way to turn a pdf file into something OCR-able?
Message-ID:  <200812020242.mB22gjHS074260@banyan.cs.ait.ac.th>
In-Reply-To: <18740.36349.523718.591189@jerusalem.litteratus.org> (message from Robert Huff on Mon, 1 Dec 2008 20:23:09 -0500)
References:  <20081201231440.GA30682@thought.org> <20081202010730.GA15970@slackbox.xs4all.nl> <18740.36349.523718.591189@jerusalem.litteratus.org>

> >  1) Some PDFs are just wrappers around JPEG images. In this case
> >  there is no text for pdftotext to convert => epic fail.
> 
> 	In this case "convert" from the ImageMagick port will get you a
> series of .jpg/.gif/.<whatever>.  Read the manual carefully before
> attempting; also note this can be a slow process.

pdfimages (from the graphics/xpdf port) can also do that, probably at
a lower cost: it copies the images embedded in the PDF out to files
instead of rendering each page.
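
For what it is worth, here is a rough sketch of how pdfimages could be
driven from a script, assuming the pdfimages binary from graphics/xpdf
is on the PATH. The function names, the "page" prefix and the output
directory are my own placeholders, not anything from the thread. The
-j option asks pdfimages to write embedded JPEG data as .jpg files;
other images come out as PBM/PPM.

```python
import shutil
import subprocess
from pathlib import Path


def pdfimages_command(pdf_path, image_root, keep_jpeg=True):
    """Build the argument list: pdfimages [-j] PDF-file image-root."""
    cmd = ["pdfimages"]
    if keep_jpeg:
        # -j: save images stored as JPEG in the PDF as .jpg files
        cmd.append("-j")
    cmd += [str(pdf_path), str(image_root)]
    return cmd


def extract_images(pdf_path, out_dir, prefix="page"):
    """Run pdfimages and return the extracted files (hypothetical helper)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    if shutil.which("pdfimages") is None:
        raise FileNotFoundError("pdfimages not found; install graphics/xpdf")
    # pdfimages names its output <image-root>-NNN.<ext>
    subprocess.run(pdfimages_command(pdf_path, out_dir / prefix), check=True)
    return sorted(out_dir.glob(prefix + "-*"))
```

Something like extract_images("scan.pdf", "scan-images") would then
leave one file per embedded image in scan-images/, ready to be fed to
an OCR program.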

Best,

Olivier


