Date: Sat, 13 Oct 2012 23:15:36 +0200
From: Polytropon <freebsd@edvax.de>
To: Gary Kline <kline@thought.org>
Cc: FreeBSD Mailing List <freebsd-questions@freebsd.org>
Subject: Re: editing pdf files
Message-ID: <20121013231536.c703bc21.freebsd@edvax.de>
In-Reply-To: <20121013204701.GE14155@ethic.thought.org>
References: <5074A6B9.8040209@dreamchaser.org> <5078641D.4050905@passap.ru> <20121012234628.GA11112@ethic.thought.org> <20121013131907.c666bfc2.freebsd@edvax.de> <20121013204701.GE14155@ethic.thought.org>

On Sat, 13 Oct 2012 13:47:01 -0700, Gary Kline wrote:
> On Sat, Oct 13, 2012 at 01:19:07PM +0200, Polytropon wrote:
> > On Fri, 12 Oct 2012 16:46:28 -0700, Gary Kline wrote:
> >
> > The disassembling can be done with
> >
> > 	% pdfimages source.pdf .
> >
> > Then the files can be edited with whatever tool you like, e. g. Gimp.
> > They often come out in PBM format.
> >
>
> 	A qstn I should have asked last time.  this book is a history or
> 	bio of richland county, ohio:: in type, it's like 650 or more
> 	pages.  SO: Is pdfimages going to spit out 650 files?  as noted
> 	in last email, only a couple of these images are of any interest

Depends on what actually _is_ in the PDF file. If every page is
represented as a picture, 650 pictures will be created. If it
contains text _and_ images, it will _only_ output the images,
with no real relation to where they have been placed in the text.
As suggested by the name "pdfimages", it takes the images from
the PDF file. :-)

The easiest way to check for possible text is to install xpdf,
which brings the binary "pdftotext" (if I remember correctly
that this tool is in _that_ package). You can then use it like
this:

	% pdftotext source.pdf

It will create "source.txt" with all actual text (but of course
without _any_ formatting except line breaks and ^L page breaks),
including page numbers. But hey, it's pure ASCII text suitable
for further processing. :-)

Run "pdftotext" without parameters for a short summary of its
parameters; "man pdftotext" is also provided.

-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...
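
A minimal sketch of the text check described above, assuming
"pdftotext" has already written "source.txt". It relies only on the
^L (form feed) page breaks mentioned in the message; the function
name count_text_pages is made up for illustration and is not part of
xpdf. It counts how many pages contain any non-blank text, so a
result of 0 would suggest a scanned, image-only PDF.

```shell
#!/bin/sh
# Hypothetical helper (not part of xpdf): count the pages in a
# pdftotext output file that contain at least one non-blank character.
# Pages are taken to be separated by form feeds (^L), as pdftotext
# writes them.
count_text_pages() {
    awk 'BEGIN { RS = "\f"; n = 0 }   # one record per page
         /[^ \t\r\n]/ { n++ }         # page has some visible text
         END { print n }' "$1"
}

# Usage, assuming source.txt was produced by: pdftotext source.pdf
#   count_text_pages source.txt
```

If the count is close to the number of pages, the book is mostly
real text and pdfimages should only extract the embedded pictures.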