Date: Thu, 11 Apr 2002 15:49:08 +0300
From: Giorgos Keramidas <keramida@ceid.upatras.gr>
To: Dan Langille <dan@langille.org>
Cc: Terry Lambert <tlambert2@mindspring.com>, chat@FreeBSD.ORG
Subject: Re: what are these characters please?
Message-ID: <20020411124908.GD39629@hades.hell.gr>
In-Reply-To: <20020411113858.E48BB3F30@bast.unixathome.org>
References: <3CB571D6.2C10B9AA@mindspring.com> <20020411113858.E48BB3F30@bast.unixathome.org>
On 2002-04-11 07:38, Dan Langille wrote:
> And line 14 is:
>
> [Submitted by: Ville SkyttESC,AdESC(B <ville.skytta@iki.fi>]
>
> I think my goal here is remove all non-ISO-8859-1 characters from the
> incoming cvs-all message.  I've been searching newsgroups (comp.lang.perl
> and comp.text.xml) trying to find a simple solution.

You can probably get away with using col(1) and proper environment
settings to filter the CVS logs:

    $ env LANG=en_US LC_ALL=en_US.ISO8859-1 col -b
    Søren Schmidt
    Søren Schmidt
    $ env LANG=C LC_ALL=C col -b
    Søren Schmidt
    Sren Schmidt

The name of Søren includes "o/" and it is a valid ISO8859-1 character.
col(1) can understand and filter correctly based on this fact, but I'm
not sure if it can strip all the ANSI escape codes that you were having
trouble with.  It's just an idea.  Might work, or might not...

I don't know how you are using the CVS logs, so you'll have to set up
some Perl pipe to do the work yourself.  Perhaps something like:

    close(STDIN);
    open(STDIN, "env LANG=en_US LC_ALL=en_US.ISO8859-1 col -b |");

Giorgos Keramidas                       FreeBSD Documentation Project
keramida@{freebsd.org,ceid.upatras.gr}  http://www.FreeBSD.org/docproj/

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-chat" in the body of the message
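
Since the message leaves open whether col(1) strips the escape codes, here is a minimal sketch of a separate filter step that could run in front of it. The function name strip_designations is hypothetical and not from the thread. It assumes the unwanted sequences have the three-byte shape seen in the quoted line (ESC, one of "(" ")" "," "-" "." "/", then one final byte), and it removes only those bytes. The letter between the escapes stays behind as plain ASCII, so the name is not restored.

```shell
#!/bin/sh
# Hypothetical helper, not from the original message: remove three-byte
# charset-designation escapes such as "ESC , A" and "ESC ( B" from stdin.

# A literal ESC byte, so the sed script does not rely on \033 support.
ESC=$(printf '\033')

strip_designations() {
    # LC_ALL=C makes sed work on bytes and gives [0-~] its ASCII meaning
    # (0x30 through 0x7E, the assumed range of the final byte).
    LC_ALL=C sed "s/${ESC}[(),./-][0-~]//g"
}

# Possible use in front of the col(1) command from the message:
#   strip_designations < commit.log |
#       env LANG=en_US LC_ALL=en_US.ISO8859-1 col -b
# (commit.log is a placeholder file name.)
```

Fed the quoted line 14, this yields "Ville Skyttd": the escape bytes are gone but the "d" remains, which may or may not be acceptable for the cvs-all processing.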