Date: Fri, 12 Apr 2002 01:39:05 -0400
From: "Dan Langille" <dan@langille.org>
To: freebsd-chat@freebsd.org
Subject: Re: CVS log encoding (was Re: what are these characters please?)
Message-ID: <20020412054009.744343F2D@bast.unixathome.org>
In-Reply-To: <20020411185322.E34203F30@bast.unixathome.org>
References: <a947md$2fli$1@kemoauc.mips.inka.de>

On 11 Apr 2002 at 14:52, Dan Langille wrote:

> On 11 Apr 2002 at 15:11, Christian Weisgerber wrote:
>
> > > I think my goal here is remove all non-ISO-8859-1 characters from the
> > > incoming cvs-all message.
> >
> > It makes more sense to clobber everything that isn't ASCII.
> >
> > chomp($line);
> > $line ~= tr/\x09\x20-\x7E/?/c;    # tab, printable ASCII

I wound up using this:

    #
    # look for non-printable characters.
    # this shows you them: perl -le 'print map chr,0x20..0x7e'
    #
    if ($message =~ /[^\x0a\x09\x20-\x7E]/) {
        # we have messy characters in there
        $message =~ tr/\x0a\x09\x20-\x7E/?/c;
        $EncodingLosses = 'true';
    }

> > Putting a replacement character such as '?' or '#' there is probably less
> > confusing than outright deleting the offending bytes.
>
> Good point.  That will ease the manual fix-up process too.

For those interested, please view http://test.freshports.org/devel/cvsweb/
and look at the commit for 10 Apr 2002 12:58:54.  The encoding loss is
indicated by the red graphic you see under the date.  The message in
question has not been manually amended.

My thanks to those that helped understand and solve the problem.

--
Dan Langille
The FreeBSD Diary - http://freebsddiary.org/ - practical examples

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-chat" in the body of the message
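
The check-then-replace step in the message can be folded into a single operation, because Perl's tr/// returns the number of characters it translated. What follows is only a minimal sketch under that observation: the subroutine name scrub_message and the sample input are hypothetical and are not taken from the FreshPorts code. It keeps the same character set as the message (LF, tab, and printable ASCII) and the same '?' replacement.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): replace every byte
# outside LF, tab, and printable ASCII with '?', and report how many
# bytes were replaced.
sub scrub_message {
    my ($message) = @_;    # work on a copy; the caller's string is untouched

    # The /c flag complements the search list, so this translates everything
    # that is NOT \x0a, \x09, or \x20-\x7E. The single '?' in the replacement
    # list is reused for every translated character. tr/// returns the count
    # of characters it translated.
    my $replaced = ($message =~ tr/\x0a\x09\x20-\x7E/?/c);

    return ($message, $replaced);
}

# Sample input (an assumption for illustration): two Latin-1 bytes.
my ($clean, $count) = scrub_message("caf\xE9 na\xEFve\tok\n");
my $EncodingLosses = $count > 0 ? 'true' : 'false';

print $clean;
print "replaced=$count losses=$EncodingLosses\n";
```

With the sample input, the two high-bit bytes become '?' while the tab and the newline pass through. Returning the count rather than a bare flag is a design choice of this sketch; it would let a caller record how much of a commit message was lost.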