Date: Sat, 13 Oct 2012 17:26:42 -0700
From: Gary Kline <kline@thought.org>
To: Polytropon <freebsd@edvax.de>
Cc: FreeBSD Mailing List <freebsd-questions@freebsd.org>
Subject: Re: editing pdf files
Message-ID: <20121014002642.GA26447@ethic.thought.org>
In-Reply-To: <20121013231536.c703bc21.freebsd@edvax.de>
References: <5074A6B9.8040209@dreamchaser.org> <5078641D.4050905@passap.ru> <20121012234628.GA11112@ethic.thought.org> <20121013131907.c666bfc2.freebsd@edvax.de> <20121013204701.GE14155@ethic.thought.org> <20121013231536.c703bc21.freebsd@edvax.de>

On Sat, Oct 13, 2012 at 11:15:36PM +0200, Polytropon wrote:
> On Sat, 13 Oct 2012 13:47:01 -0700, Gary Kline wrote:
> > On Sat, Oct 13, 2012 at 01:19:07PM +0200, Polytropon wrote:
> > > On Fri, 12 Oct 2012 16:46:28 -0700, Gary Kline wrote:
> > >
> > > The disassembling can be done with
> > >
> > >     % pdfimages source.pdf .
> > >
> > > Then the files can be edited with whatever tool you like, e.g. Gimp.
> > > They often come out in PBM format.
> > >
> >
> > A question I should have asked last time. This book is a history or
> > bio of Richland County, Ohio; in type, it's like 650 or more
> > pages. SO: Is pdfimages going to spit out 650 files? As noted
> > in my last email, only a couple of these images are of any interest.
>
> Depends on what actually _is_ in the PDF file. If every page is
> represented as a picture, 650 pictures will be created. If it
> contains text _and_ images, it will _only_ output the images,
> with no real relation to where they have been placed in the
> text. As suggested by the name "pdfimages", it takes the images
> from the PDF file. :-)
>
> The easiest way to check for possible text is to install xpdf,
> which brings the binary "pdftotext" (if I remember correctly that
> this tool is in _that_ package). You can then use it like this:
>
>     % pdftotext source.pdf
>
> It will create "source.txt" with all actual text (but of course
> without _any_ formatting except line breaks and ^L page breaks),
> including page numbers. But hey, it's pure ASCII text suitable
> for further processing. :-)
>
> Run "pdftotext" without parameters for a short summary of its
> parameters; "man pdftotext" is also provided.
>

Well, then my original instincts were right. I ran
pdftotext <file.pdf> and nothing but the page numbers were
there. Rats. Oh well, at least I can type in by hand what I
want :)

>
> --
> Polytropon
> Magdeburg, Germany
> Happy FreeBSD user since 4.0
> Andra moi ennepe, Mousa, ...

--
Gary Kline  kline@thought.org  http://www.thought.org  Public Service Unix
    Twenty-six years of service to the Unix community.

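The check described in the thread (run pdftotext, then see whether anything besides page numbers came out) could be automated roughly as follows. This is only a sketch: the function name has_real_text is made up for illustration, and it assumes the page numbers are plain digits, so Roman-numeral page numbers would be counted as text.

```shell
# Hypothetical helper, not part of xpdf: succeeds (exit 0) when a
# pdftotext output file has at least one line that is neither blank
# nor just a number, i.e. the PDF seems to carry a real text layer.
has_real_text() {
    # tr removes the ^L (form feed) page breaks that pdftotext emits;
    # grep -v selects lines that are NOT blank or number-only, and
    # -q makes grep report only via its exit status.
    tr -d '\f' < "$1" | grep -Eqv '^[[:space:]]*[0-9]*[[:space:]]*$'
}

# Possible usage (source.pdf, the page range and the "img" prefix are
# placeholders):
#   pdftotext source.pdf            # writes source.txt
#   if has_real_text source.txt; then
#       echo "text layer present"
#   else
#       echo "only page numbers: pages are probably scanned images"
#   fi
#   pdfimages -f 12 -l 14 source.pdf img   # images from pages 12-14 only
```

If only a couple of pages are of interest, the -f (first page) and -l (last page) options of pdfimages, shown in the last comment line, should keep it from writing one file per page for the whole book.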
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20121014002642.GA26447>