Date: 20 Feb 1999 01:38:49 +0100
From: Kai.Grossjohann@CS.Uni-Dortmund.DE
To: Sue Blake <sue@welearn.com.au>
Cc: Mark Ovens <marko@uk.radan.com>, questions@FreeBSD.ORG
Subject: Re: cleaning a text file
Message-ID: <864sohkixy.fsf@slowfox.frob.org>
In-Reply-To: Sue Blake's message of "Tue, 16 Feb 1999 11:49:59 +1100"
References: <19990215201056.19929@welearn.com.au>
    <Pine.BSF.3.91.990215010943.20451F-100000@dsinw.com>
    <19990216095232.J2207@lemis.com>
    <19990216103740.60271@welearn.com.au>
    <19990216002703.A337@localhost>
    <19990216114959.08931@welearn.com.au>

Sue Blake <sue@welearn.com.au> writes:

> On Tue, Feb 16, 1999 at 12:27:03AM +0000, Mark Ovens wrote:
> >
> > First you need to identify the offending characters.
>
> Indeed. That is my sole problem.

Well, search forward for the following regex:

    [^a-z0-9A-Z_+= \t\r\n-]

It matches any character that is not in the listed set. If it stops on
a character that's ok, add that character to the list. After all, there
are only 256 possible byte values, and some of them will be bad, so you
won't have to add characters often.

Or are you saying you're looking at Japanese or Chinese text with
multibyte characters? Then you're screwed.

kai
-- 
I like _b_o_t_h kinds of music.
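
The search described above can also be run non-interactively. What
follows is only a minimal sketch in Python, not something from the
original thread: the function name find_suspects and the idea of
reporting byte offsets are assumptions made for illustration. It uses
the same character class as the regex above and treats the file as raw
bytes, so it shares the same limitation with multibyte text.

```python
import re

# Same character class as the regex in the message: any byte NOT in
# this set is treated as a suspect. Extend the class as you decide
# that more characters (for example '.' or ',') are acceptable.
SUSPECT = re.compile(rb"[^a-z0-9A-Z_+= \t\r\n-]")


def find_suspects(data: bytes) -> dict:
    """Map each suspect byte value to the list of offsets where it occurs.

    Hypothetical helper for illustration; works on raw bytes only.
    """
    found = {}
    for match in SUSPECT.finditer(data):
        # match.group() is a one-byte bytes object; [0] gives its int value.
        found.setdefault(match.group()[0], []).append(match.start())
    return found


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python find_suspects.py somefile.txt
    with open(sys.argv[1], "rb") as handle:
        report = find_suspects(handle.read())
    for value, offsets in sorted(report.items()):
        print(f"0x{value:02x} {chr(value)!r}: {len(offsets)} occurrence(s), "
              f"first at offset {offsets[0]}")
```

Ordinary punctuation will show up in the first report because the
initial class is deliberately small; add what is acceptable to the
class and rerun until only the genuinely bad characters remain.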