Date: Tue, 16 Feb 1999 12:15:00 +1100
From: Sue Blake <sue@welearn.com.au>
To: Dan Nelson <dnelson@emsphone.com>
Cc: Greg Lehey <grog@lemis.com>, rick hamell <hamellr@dsinw.com>, freebsd-questions@FreeBSD.ORG
Subject: Re: cleaning a text file
Message-ID: <19990216121500.33635@welearn.com.au>
In-Reply-To: <19990215185722.A21817@dan.emsphone.com>; from Dan Nelson on Mon, Feb 15, 1999 at 06:57:22PM -0600
References: <19990215201056.19929@welearn.com.au> <Pine.BSF.3.91.990215010943.20451F-100000@dsinw.com> <19990216095232.J2207@lemis.com> <19990216103740.60271@welearn.com.au> <19990215185722.A21817@dan.emsphone.com>

On Mon, Feb 15, 1999 at 06:57:22PM -0600, Dan Nelson wrote:
> In the last episode (Feb 16), Sue Blake said:
> > The problem is that I don't know which funny characters exist in the
> > file, if any. I want to find out what they are, so I can search for
> > them and eyeball them before killing them.
>
> How about something like
>
> grep "[^ -~]" file.txt
>
> That will print any lines that have characters outside the standard
> printable ascii set. Then you can look at the oddball letters and
> figure out appropriate replacement characters.
Hey, yeah, that'd be a great first check, enough to give it a clean
bill of health or deal with a few characters that are easily spotted.
Don Read sent this one too:

fold -w1 yourfile.txt | sort | uniq | grep -v "[A-Za-z0-9]"

which seems to do the trick. It's very slow, but it works.
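
A variation that may be quicker on a big file is to throw away the
ordinary characters first and dump whatever is left as octal numbers.
This is only a sketch: odd_bytes is a made-up name, and it assumes
that tab, newline and the range from space to ~ are the only
characters you consider normal.

```shell
# odd_bytes FILE: list each distinct unusual byte in FILE as an octal
# number, preceded by how many times it occurs.
# Assumption: "normal" means tab (\11), newline (\12) and space..~ (\40-\176).
odd_bytes() {
    LC_ALL=C tr -d '\11\12\40-\176' < "$1" |  # delete the normal bytes
        od -An -v -t o1 |                     # dump the rest as octal, no offsets
        tr -s ' ' '\n' |                      # one octal value per line
        grep . |                              # drop empty lines
        sort | uniq -c                        # count each distinct value
}
```

Running odd_bytes yourfile.txt prints one line per distinct odd byte.
No output at all means nothing unusual turned up.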
With either or both of these, it's just a matter of finding the
character among what's pulled out, determining its character number,
checking its context in the file and making a decision about
substitution, then running tr or doing a replace with a text editor.
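
Those steps might be sketched like this. show_context and
replace_byte are hypothetical helper names, the character is given as
its octal number, and the replacement has to be a single character.
The -a option (treat the file as text) is assumed to be available in
your grep, as it is in GNU grep.

```shell
# show_context FILE OCTAL: print the numbered lines of FILE that contain
# the byte with that octal value; cat -v makes non-printing bytes visible.
show_context() {
    LC_ALL=C grep -a -n -F "$(printf "\\$2")" "$1" | cat -v
}

# replace_byte FILE OCTAL CHAR: copy FILE to standard output with every
# occurrence of the byte replaced by CHAR. The original is not modified.
replace_byte() {
    LC_ALL=C tr "\\$2" "$3" < "$1"
}
```

For example, if octal 222 turned up where an apostrophe belongs,
show_context yourfile.txt 222 shows where it lives, and
replace_byte yourfile.txt 222 "'" > cleaned.txt writes a fixed copy.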
For the most common case, where there is nothing wrong with the file,
it's possible to confirm that the file is OK as is.
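
For that common case a plain yes or no is enough. A minimal sketch,
where is_clean is a hypothetical name and the acceptable set is
assumed to be tab, newline and space through ~:

```shell
# is_clean FILE: succeed (exit status 0) only if FILE contains nothing
# but tab, newline and printable ASCII.
is_clean() {
    # Delete every acceptable byte and count what is left over.
    leftover=$(LC_ALL=C tr -d '\11\12\40-\176' < "$1" | wc -c | tr -d ' ')
    [ "$leftover" -eq 0 ]
}
```

Used as is_clean yourfile.txt && echo "nothing odd found".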
Reidar Bratsberg mentioned a utility called pep which might be good,
but so far I haven't been able to randomly press the right buttons to
make it compile. Experiments continue.
--
Regards,
-*Sue*-
To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?19990216121500.33635>
