Date: Wed, 3 Sep 2008 21:33:30 -0400 (EDT)
From: vogelke+software@pobox.com (Karl Vogel)
To: freebsd-questions@freebsd.org
Subject: Re: script to assist ASCII text
Message-ID: <20080904013330.B1E92B7BD@kev.msw.wpafb.af.mil>
In-Reply-To: <1219723211.4994.165.camel@localhost> (message from Gary Kline on Mon, 25 Aug 2008 21:00:10 -0700)

>> On Mon, 25 Aug 2008 21:00:10 -0700,
>> Gary Kline <kline@thought.org> said:

G> This had eluded me for years and it may not be possible, but here goes.
G> I write using vi or, less frequently vim. Is there any sh script that
G> would make sure that there were exactly one space ('\040') between words,
G> and three spaces between sentences? My definition of "a sentence" is a
G> string of words that ends in a period or question-mark, exclamation-mark,
G> or ellipse ("... . || ... ? || ... !) Also, any dash "--" could not have
G> any whitespace around it.

   I like a similar setup -- one space between words, sentences ending
   with a period followed by two spaces.  The GNU version of "fmt" handles
   this pretty well.  Here's the first part of your message, formatted to
   50-character-wide lines, with the type of spacing that drives me nuts:

     me% cat -n msg
          1  This had eluded me for years and it may not be
          2  possible, but here goes. I write using vi or,
          3  less frequently vim. Is there any sh script that
          4  would make sure that there were exactly one
          5  space ('\040') between words, and three spaces
          6  between sentences? My definition of "a sentence"
          7  is a string of words that ends in a period or
          8  question-mark, exclamation-mark, or ellipse.

   Putting one word on each line and then letting GNU fmt decide on
   sentence-handling does almost exactly what you want:

     me% gfmt -1 msg | gfmt -50 | cat -n
          1  This had eluded me for years and it may not be
          2  possible, but here goes.  I write using vi or,
          3  less frequently vim.  Is there any sh script
          4  that would make sure that there were exactly one
          5  space ('\040') between words, and three spaces
          6  between sentences?  My definition of "a sentence"
          7  is a string of words that ends in a period or
          8  question-mark, exclamation-mark, or ellipse.

   Here's a script I use as a driver for GNU fmt.  It looks for an optional
   environment variable FMTWIDTH to decide how long each line should be.
   This comes in handy if I call vi/vim from within a script:

     #!/bin/sh
     # driver for fmt.

     # Use FMTWIDTH, if set, as the line width.
     case "$FMTWIDTH" in
         "") opt= ;;
         *)  opt="-$FMTWIDTH" ;;
     esac

     # An explicit option on the command line overrides FMTWIDTH.
     case "$1" in
         -*) opt= ;;
         *)  ;;
     esac

     exec /usr/local/bin/gfmt $opt ${1+"$@"}

   Here's an alias I use for quickly reformatting a section of text in vim.
   I mark where to start using 'a', then move down to the end of the
   section and hit 'v':

     jmbk:'a,.!fmt -1|fmt<CR>'b

   A similar alias will reformat whatever paragraph I'm in, with no need
   for marks:

     }jmbk{ma}:'a,.!fmt -1|fmt<CR>'b

   The script below helps me clean up a file or message after running fmt,
   which makes strings like "U.S.A." look like the end of a sentence even
   when they're not.  This should give you some ideas.

-- 
Karl Vogel                      I don't speak for the USAF or my company

Panda Mating Fails; Veterinarian Takes Over
                                      --actual news headline, 1997

---------------------------------------------------------------------------
#!/usr/bin/perl
#
# $Id: cm,v 1.3 2008/08/17 20:25:49 vogelke Exp $
# $Source: /home/vogelke/bin/RCS/cm,v $
#
# cm: clean mail message
#
# Each pattern matches one or more spaces after the period, so the
# sentence-style double spacing that fmt adds is collapsed as well.

while (<>) {
    # Month abbreviations: drop the period.
    s/Jan\. +/Jan /g;
    s/Feb\. +/Feb /g;
    s/Aug\. +/Aug /g;
    s/Sept\. +/Sept /g;
    s/Oct\. +/Oct /g;
    s/Nov\. +/Nov /g;
    s/Dec\. +/Dec /g;

    # Titles: keep the period, leave a single space.
    s/Mr\. +/Mr. /g;
    s/Mrs\. +/Mrs. /g;
    s/Ms\. +/Ms. /g;
    s/Dr\. +/Dr. /g;
    s/Sen\. +/Senator /g;
    s/Rep\. +/Representative /g;

    # Dotted abbreviations.
    s/U\.S\.A\. +/USA /g;
    s/U\.S\. +/US /g;
    s/D\.C\. +/DC /g;
    s/U\.N\. +/UN /g;
    s/B\.S\. +/BS /g;
    s/M\.B\.A\. +/MBA /g;

    # A single initial, as in "John Q. Public".
    s/ ([A-Z]\.) +/ $1 /g;

    # TeX-style quotes.
    s/''/\"/g;
    s/``/\"/g;

    # UTF-8 curly quotes.  These come from saving Firefox pages.
    s/\342\200\231/'/g;
    s/\342\200\234/"/g;
    s/\342\200\235/"/g;
    print;
}

exit(0);
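
For the spacing rules in the original question (exactly one space between words, three between sentences, nothing around "--"), GNU fmt only gets part of the way, since it puts two spaces after a sentence. Below is a minimal sketch of a stand-alone filter in plain sh and sed. The function name respace and the variable SENTGAP are made up for this illustration and do not come from the message above. It is meant for prose only: it works line by line, drops leading indentation, treats any ., ? or ! followed by a space as a sentence end (so "U.S.A." in mid-sentence is spaced like one), and closes up every "--", including the one in front of a command-line option.

```shell
#!/bin/sh
# respace: hypothetical helper, not part of the original message.
# Reads text on stdin and writes it with one space between words,
# SENTGAP spaces (default 3) after ., ? or ! and no blanks around "--".
respace() {
    n=${SENTGAP:-3}

    # Build a string of n spaces without relying on printf extensions.
    gap=
    i=0
    while [ "$i" -lt "$n" ]; do
        gap="$gap "
        i=$((i + 1))
    done

    # 1. squeeze runs of blanks and tabs to one space
    # 2. trim the leading and trailing space left on each line
    # 3. remove the (now single) spaces around "--"
    # 4. widen the space after sentence-ending punctuation to the gap
    sed -e 's/[[:blank:]][[:blank:]]*/ /g' \
        -e 's/^ //' -e 's/ $//' \
        -e 's/ *-- */--/g' \
        -e 's/\([.?!]\) /\1'"$gap"'/g'
}
```

Saved in a script that ends with a call to respace, this can be used as a filter in the same way as the fmt driver; running it with SENTGAP=2 gives the two-space style instead of three.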