From owner-freebsd-chat Thu Apr 11 5:59: 6 2002
Delivered-To: freebsd-chat@freebsd.org
Received: from bast.unixathome.org (bast.unixathome.org [216.187.105.150])
	by hub.freebsd.org (Postfix) with ESMTP id 48ED037B483
	for ; Thu, 11 Apr 2002 05:58:43 -0700 (PDT)
Received: from wocker (wocker.unixathome.org [192.168.0.99])
	by bast.unixathome.org (Postfix) with ESMTP id 4448B3F30;
	Thu, 11 Apr 2002 08:59:36 -0400 (EDT)
From: "Dan Langille"
Organization: DVL Software Limited
To: Giorgos Keramidas
Date: Thu, 11 Apr 2002 08:58:41 -0400
MIME-Version: 1.0
Subject: Re: what are these characters please?
Reply-To: dan@langille.org
Cc: chat@FreeBSD.ORG
In-reply-to: <20020411124908.GD39629@hades.hell.gr>
References: <20020411113858.E48BB3F30@bast.unixathome.org>
X-mailer: Pegasus Mail for Windows (v4.01)
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Content-description: Mail message body
Message-Id: <20020411125936.4448B3F30@bast.unixathome.org>
Sender: owner-freebsd-chat@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

On 11 Apr 2002 at 15:49, Giorgos Keramidas wrote:

> On 2002-04-11 07:38, Dan Langille wrote:
> > And line 14 is:
> >
> > [Submitted by: Ville SkyttESC,AdESC(B
> > <ville.skytta@iki.fi>]
> >
> > I think my goal here is remove all non-ISO-8859-1 characters from the
> > incoming cvs-all message. I've been searching newsgroups (comp.lang.perl
> > and comp.text.xml) trying to find a simple solution.
>
> You can probably get away with using col(1) and proper environment
> settings to filter the CVS logs:
>
>     $ env LANG=en_US LC_ALL=en_US.ISO8859-1 col -b
>     Søren Schmidt
>     Søren Schmidt
>
>     $ env LANG=C LC_ALL=C col -b
>     Søren Schmidt
>     Sren Schmidt
>
> The name of Søren includes "o/" and it is a valid ISO8859-1 character.
> col(1) can understand and filter correctly based on this fact, but I'm not
> sure if it can strip all the ANSI escape codes that you were having trouble
> with. It's just an idea. Might work, or might not...
>
> I don't know how you are using the CVS logs, so you'll have to set up
> some Perl pipe to do the work yourself. Perhaps something like:
>
>     close(STDIN);
>     open(STDIN, "env LANG=en_US LC_ALL=en_US.ISO8859-1 col -b |");

The website (http://www.FreshPorts.org/) takes incoming messages from the
cvs-all mailing list and uses procmail to dump the raw message to disk.
Then a daemon takes over and processes the file using perl. I don't know
if what you suggested will apply to this situation.

Thank you.
--
Dan Langille
The FreeBSD Diary - http://freebsddiary.org/ - practical examples

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-chat" in the body of the message
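
Since the messages are already on disk before the daemon reads them, one
rough alternative to the col(1) pipe is to strip the escape sequences from
the text as a separate filter step. The following is only a minimal sketch,
assuming the unwanted bytes are ISO-2022 charset-designation escapes of the
form ESC, one intermediate byte, one final byte, like the ESC,A and ESC(B
pair in the quoted line 14. The name strip_designators is a hypothetical
helper, not part of col(1) or of FreshPorts. It deletes the escapes only; it
does not recover the character they were meant to introduce.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not from the thread):
# delete three-byte ISO-2022 designation escapes from stdin.
# Pattern: ESC, one intermediate byte in ( ) * + , - . /, one final
# byte in the range @ through ~. LC_ALL=C makes the bracket ranges
# byte-based instead of collation-based.
strip_designators() {
    esc=$(printf '\033')
    LC_ALL=C sed "s#${esc}[(-/][@-~]##g"
}

# Sample input modelled on the quoted "line 14".
printf 'Submitted by: Ville Skytt\033,Ad\033(B\n' | strip_designators
```

A file written by procmail could be passed through the same function, for
example "strip_designators < raw-message > cleaned-message", where both
file names are placeholders. Text without any escape sequences passes
through unchanged.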