From: naddy@mips.inka.de (Christian Weisgerber)
Subject: Re: what are these characters please?
Date: Thu, 11 Apr 2002 12:11:00 +0000 (UTC)
References: <3CB4FBFB.9D2AC7E0@mindspring.com> <20020411102024.3E6283F30@bast.unixathome.org>
To: freebsd-chat@freebsd.org

Dan Langille wrote:

> Given that I'm trying to process the cvs-all messages into XML documents
> (using the perl module XML::Writer which does not do any encoding beyond
> characters such as >, <, etc), any suggestions as to how to deal with such
> characters? I've been looking through cpan but I suspect I'm using the
> wrong search criteria ("encoding"). Any clues?

Well what encoding do your XML documents use?

I guess your basic situation is that you are getting unknown
characters in an unknown encoding. You then have to manually figure
out what this is, e.g. you asked here and I'm telling you it's
character U+00E4. You can now store this in your encoding of choice.
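
To make the "figure out what it is, then store it in your encoding of choice" step concrete, here is a minimal sketch in Python. It rests on assumptions that are not in the original message: that the mystery byte is 0xE4, that the source encoding turned out to be ISO-8859-1, and that the sample string and the function names are purely illustrative.

```python
# Sketch only: the source encoding (ISO-8859-1) and the sample bytes are
# assumptions for illustration; in practice you must determine the encoding.

def decode_legacy(raw: bytes, source_encoding: str = "iso-8859-1") -> str:
    """Turn raw bytes into abstract characters once the encoding is known."""
    return raw.decode(source_encoding)


def to_utf8(text: str) -> bytes:
    """Store the characters as UTF-8, e.g. for an XML document declared as UTF-8."""
    return text.encode("utf-8")


def to_ascii_xml(text: str) -> str:
    """Keep the output pure ASCII by writing non-ASCII characters as numeric
    character references. This does NOT escape <, > or &; that is left to
    the XML writer."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


raw = b"M\xe4rz"              # hypothetical input containing the byte 0xE4
text = decode_legacy(raw)     # the second character is now U+00E4
print(repr(to_utf8(text)))    # UTF-8 byte sequence for the same characters
print(to_ascii_xml(text))     # ASCII-only form with a character reference
```

Which of the two output forms is appropriate depends on the encoding the XML documents declare, which is exactly the question asked above.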

BTW, if you're hazy how all this works (and it sure looks like it), I
recommend you read "A Tutorial on Character Code Issues"
http://www.cs.tut.fi/~jkorpela/chars.html
This generally doesn't solve problems by itself, but it helps people
to *understand* the problem.

-- 
Christian "naddy" Weisgerber                          naddy@mips.inka.de