Date: Fri, 01 Jan 1999 19:51:43 +0000
From: Mark Ovens <marko@uk.radan.com>
To: Jerry Preeper <preeper@cts.com>
Cc: freebsd-questions@FreeBSD.ORG
Subject: Re: replace non-ascii characters
Message-ID: <368D274F.7A57D11A@uk.radan.com>
References: <3.0.5.32.19990101042759.008a1a70@crash.cts.com>

Jerry Preeper wrote:
>
> I know this isn't really a freebsd question, but I'm not sure where else to
> ask. I'm trying to write a small shell script that replaces non-ascii
> characters with the html equivalent in a file and just can't seem to figure
> how to identify the non-ascii characters.
>
> for example, I have written a small shell script that takes a file name as
> input to replace them using sed. Here is the script.
>
> #!/bin/sh
> for file in $*
> do
> sed -n "s/\\0x80/\&Ccedil\;/g" ${file}
> sed -n "s/\\0x81/\&uuml\;/g" ${file}
> ..... bunches more
> done
>
> The problem is the search part isn't finding the special character. I have
> tried cutting and pasting the special character directly into the script as
> well, but it doesn't seem to work either.
>
> Does anyone have any ideas on how to accomplish.
>

I've run into this problem before. As a suggestion, try:

#!/bin/sh
for file in "$@"
do
    cp "${file}" "/tmp/${file}"
    awk '{gsub("\x80", "\\&Ccedil;"); \
          gsub("\x81", "\\&uuml;"); \
          ...more of the same
          print}' < "/tmp/${file}" > "${file}"
    rm "/tmp/${file}"
done

``gsub()'' replaces all occurrences in a line. Note that ``&'' needs
escaping with ``\\'', since an unescaped ``&'' in the replacement string
stands for the matched text.

I'm not sure whether there is a limit to the length of line that awk can
process. Perl may well provide the best solution, though.

HTH, Happy New Year

> Thanks in advance.
>
> Jerry
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-questions" in the body of the message

--
Trust the computer industry to shorten Year 2000 to Y2K.
It was this thinking that caused the problem in the first place.

Mark Ovens, CNC Applications Engineer, Radan Computational Ltd
Sheet Metal CAD/CAM Solutions
mailto:marko@uk.radan.com http://www.radan.com
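
The awk approach in the reply above can be sketched in a somewhat more portable form. This is only an illustration under a few assumptions that are not in the thread: the helper name to_entities is hypothetical; octal escapes ("\200" and "\201") stand in for "\x80" and "\x81", because octal is the escape form POSIX specifies for awk strings; LC_ALL=C is set so that awk treats the input as single bytes; and mktemp replaces the fixed /tmp/${file} path so that file names containing a directory component still work. Only the two mappings mentioned in the thread are shown.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read stdin, write stdout,
# replacing bytes 0x80 and 0x81 with the HTML entities used in the thread.
to_entities() {
    LC_ALL=C awk '{
        gsub("\200", "\\&Ccedil;")   # byte 0x80 (octal 200)
        gsub("\201", "\\&uuml;")     # byte 0x81 (octal 201)
        print
    }'
}

# Convert every file named on the command line, going through a temp file
# so the original is only overwritten after awk has succeeded.
for file in "$@"
do
    tmp=$(mktemp "${TMPDIR:-/tmp}/entities.XXXXXX") || exit 1
    if to_entities < "$file" > "$tmp"; then
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
done
```

Saved as, say, entities.sh (a hypothetical name), it would be run as "sh entities.sh page1.txt page2.txt". As a side note on the sed attempt in the question: "sed -n" suppresses all output unless the substitution carries a "p" flag, and "\0x80" is not a hex escape in sed, which would explain why nothing appeared to match.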