Date: Mon, 3 Jun 1996 23:23:05 -0700 (PDT)
From: Bryan Ogawa at Work <bogawa@netvoyage.net>
To: Sean Kelly <kelly@fsl.noaa.gov>
Cc: tcg@ime.net, questions@freebsd.org
Subject: Re: Postscript conversion
Message-ID: <Pine.BSI.3.93.960603231421.1943E-100000@digital.netvoyage.net>
In-Reply-To: <199606040410.EAA06403@gatekeeper.fsl.noaa.gov>

On Mon, 3 Jun 1996, Sean Kelly wrote:

> >>>>> "Gary" == Gary Chrysler <tcg@ime.net> writes:
>
>     Gary> So is there any other way I can convert postscript files to
>     Gary> ascii?  I would really like to read the socks manual!
>
> Converting PostScript to ASCII is three factors of magnitude (tm)
> harder than the halting problem!  And it's NP-complete, too!  :-)

:) :) :)

> Seriously, in general it can't be done.  After all, if you've got
> PostScript code that draws out each letter through a thousand or so
> moveto/lineto/arcto sequences, on paper it may look fine, but there's
> little hope of extracting just the text out of that.

On the other hand, there exists a (supposedly) much improved ps2ascii
converter out there, by the name of pstotext, which came out of DEC's
Virtual Paper project.  I can speak from experience that the previous
PS-to-plain-text utilities were spectacularly bad, and as the PostScript
FAQ explains, it's hard to get it right.

It's at:

    http://www.research.digital.com/SRC/virtualpaper/pstotext.html

Although I haven't tried pstotext out, I find the following quote from
a message about it a positive sign:

    We've tested pstotext on millions of lines of PostScript,
    including files generated by several versions of drivers from
    each of Windows, Macintosh, and dvips (TeX).  It deals
    successfully with a wide variety of encoding vectors, and it
    re-assembles words that have been broken up for pair-kerning (it
    doesn't re-assemble words that have been hyphenated, though).  It
    also works (though a little less reliably) on Acrobat PDF files.

You'll need Aladdin Ghostscript 3.33 / 3.51 or later to run it,
apparently.

The reason I mention this here (besides trying to be helpful :) ) is
that I stumbled across this in a search for a better ps2ascii converter
(as I mentioned, the ones I had seen before were spectacularly poor at it).
Since it's not mentioned in any of the PS FAQs I know of, and not easy
to find with search engines (even AltaVista, which is surprising since
it's Digital's), I thought I'd mention it and let other people know
about it, if only to find out it's still insufficient for the job. :)

bryan

>
> Knowing what produced the PostScript code can be a big help, though.
> Some programs exist that recognize the PostScript produced by various
> document packages and wade their way through the font changes and kerns,
> revealing plain old text.
>
> And yes, Ghostscript is your friend.  :-)
>
> Seriously, your best bet is to install Ghostscript.  Right, I hear you
> ... you don't wanna install X windows ... after all, Marcus J Ranum of
> DEC said:
>
>     If the designers of X Windows built cars, there would be no
>     fewer than five steering wheels hidden about the cockpit, none
>     of which followed the same principles---but you'd be able to
>     shift gears with your car stereo.  Useful feature, that.
>
> So, you'll be happy to note that Ghostscript doesn't need X windows!
> Just avoid the copy that's in the ports collection (which I'm assuming
> is configured for X by default) and build and install it yourself.  In
> fact, I've made sure that ``The Professor'' knows that it worked
> out-of-the-box on FreeBSD ... that was back in version 3.33, and I'm
> sure it's still true today in version 3.53.
>
> So, grab these files:
>
>     ftp://ftp.cs.wisc.edu/pub/ghost/aladdin/ghostscript-3.53.tar.gz
>     ftp://ftp.cs.wisc.edu/pub/ghost/aladdin/ghostscript-3.53jpeg.tar.gz
>
> And in the makefile, explicitly leave OUT the X windows stuff!  The
> README and make.doc files will certainly provide you with more hints.
>
> Once you've got it built and installed, you'll have a new script to
> play with: /usr/local/bin/ps2ascii, which uses gs to extract text out
> of PS files.
>
> GOOD LUCK!
>
> --
> Sean Kelly
> NOAA Forecast Systems Laboratory        kelly@fsl.noaa.gov
> Boulder Colorado USA                    http://www-sdd.fsl.noaa.gov/~kelly/
>

Bryan K. Ogawa
Questions or Problems with NetVoyage?   help@netvoyage.net
Check out the NetVoyage HelpWeb at..    <URL: http://www.netvoyage.net/~help/>
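
To make the difficulty discussed above concrete, here is a minimal
sketch in Python of the most naive possible extractor: it only collects
string literals that are handed directly to the PostScript `show`
operator. This is an illustration under stated assumptions, not how
ps2ascii or pstotext actually work; the function names and sample
inputs are made up for the example. It ignores fonts, encoding vectors,
nested parentheses, and every other text operator.

```python
import re

# A "(...)" string literal immediately followed by the `show` operator.
# Backslash escapes are allowed inside; unescaped nested parentheses are not.
SHOW_RE = re.compile(r"\(((?:\\.|[^()\\])*)\)\s*show\b", re.DOTALL)

# Single-character escapes defined for PostScript string literals.
ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "b": "\b", "f": "\f",
           "(": "(", ")": ")", "\\": "\\"}


def unescape(ps_string):
    """Decode backslash escapes found inside a PostScript (...) string."""
    def repl(match):
        body = match.group(1)
        if body[0] in "01234567":      # octal character code, e.g. \101
            return chr(int(body, 8) & 0xFF)
        if body == "\n":               # backslash-newline: line continuation
            return ""
        return ESCAPES.get(body, body)  # unknown escape: keep the character
    return re.sub(r"\\([0-7]{1,3}|.)", repl, ps_string, flags=re.DOTALL)


def naive_ps_to_text(ps_source):
    """Join every string passed to `show`, separated by single spaces."""
    return " ".join(unescape(m.group(1)) for m in SHOW_RE.finditer(ps_source))


# Hypothetical inputs, invented for this example.
EASY = r"""
/Times-Roman findfont 12 scalefont setfont
72 720 moveto (Hello, \(SOCKS\) manual) show
72 700 moveto (second line) show
"""

# A word split in two for pair-kerning.
KERNED = "(W) show -0.5 0 rmoveto (orld) show"

# A glyph drawn as a path: there is no string to find at all.
DRAWN = "newpath 72 720 moveto 72 730 lineto 78 730 lineto stroke"

if __name__ == "__main__":
    print(repr(naive_ps_to_text(EASY)))
    print(repr(naive_ps_to_text(KERNED)))
    print(repr(naive_ps_to_text(DRAWN)))
```

On the first sample this recovers the text. On the kerned sample it
yields "W orld", which is the kind of breakage the pstotext quote says
it repairs. On the drawn sample it yields nothing, which is Sean's
point about moveto/lineto sequences.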