Date: Sat, 13 Oct 2012 17:26:42 -0700
From: Gary Kline
To: Polytropon
Cc: FreeBSD Mailing List
Subject: Re: editing pdf files
Message-ID: <20121014002642.GA26447@ethic.thought.org>

On Sat, Oct 13, 2012 at 11:15:36PM +0200, Polytropon wrote:
> On Sat, 13 Oct 2012 13:47:01 -0700, Gary Kline wrote:
> > On Sat, Oct 13, 2012 at 01:19:07PM +0200, Polytropon wrote:
> > > On Fri, 12 Oct 2012 16:46:28 -0700, Gary Kline wrote:
> > >
> > > The disassembling can be done with
> > >
> > > 	% pdfimages source.pdf .
> > >
> > > Then the files can be edited with whatever tool you like, e.g. Gimp.
> > > They often come out in PBM format.
> >
> > A question I should have asked last time. This book is a history or
> > bio of Richland County, Ohio; in type, it's like 650 or more
> > pages. So: is pdfimages going to spit out 650 files? As noted
> > in my last email, only a couple of these images are of any interest.
>
> Depends on what actually _is_ in the PDF file. If every page is
> represented as a picture, 650 pictures will be created. If it
> contains text _and_ images, _only_ the images will be output,
> with no real relation to where they have been placed in the
> text. As suggested by the name "pdfimages", it takes the images
> from the PDF file. :-)
>
> The easiest way to check for possible text is to install xpdf,
> which brings the binary "pdftotext" (if I remember correctly that
> this tool is in _that_ package). You can then use it like this:
>
> 	% pdftotext source.pdf
>
> It will create "source.txt" with all actual text (but of course
> without _any_ formatting except line breaks and ^L page breaks),
> including page numbers. But hey, it's pure ASCII text suitable
> for further processing. :-)
>
> Run "pdftotext" without parameters for a short summary of its
> parameters; "man pdftotext" is also provided.

Well, then my original instincts were right. I ran pdftotext
and nothing but the page numbers were there. Rats. Oh well, at
least I can type in by hand what I want :)

> --
> Polytropon
> Magdeburg, Germany
> Happy FreeBSD user since 4.0
> Andra moi ennepe, Mousa, ...

-- 
Gary Kline  kline@thought.org  http://www.thought.org  Public Service Unix
Twenty-six years of service to the Unix community.
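
For anyone repeating the check described in this thread, here is a rough sketch of how the "nothing but page numbers" result could be detected automatically. It assumes pdftotext has already written its output to a file such as source.txt. The helper name count_text_lines is made up for illustration and is not part of xpdf.

```shell
#!/bin/sh
# Sketch only: count_text_lines is a hypothetical helper, not an xpdf tool.
#
# Print how many lines of the file "$1" contain anything other than
# digits and whitespace. The ^L page breaks that pdftotext emits are
# form feeds, which count as whitespace here. A result of 0 suggests
# the PDF holds scanned page images and only the page numbers are text.
count_text_lines() {
    # grep -c exits non-zero when the count is 0; "|| true" hides that
    # so the helper is safe to call under "set -e".
    grep -c '[^0-9[:space:]]' "$1" || true
}

# Possible use, after running "pdftotext source.pdf":
#   n=$(count_text_lines source.txt)
#   if [ "$n" -eq 0 ]; then echo "page numbers only"; fi
```

A count of zero matches what happened here: the text layer held only the page numbers, so the pages themselves are images and pdfimages would emit one file per page.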