From: "Dan Langille"
Organization: DVL Software Limited
To: freebsd-chat@freebsd.org
Date: Fri, 12 Apr 2002 01:39:05 -0400
Subject: Re: CVS log encoding (was Re: what are these characters please?)
Reply-To: dan@langille.org
In-reply-to: <20020411185322.E34203F30@bast.unixathome.org>
Message-Id: <20020412054009.744343F2D@bast.unixathome.org>

On 11 Apr 2002 at 14:52, Dan Langille wrote:

> On 11 Apr 2002 at 15:11, Christian Weisgerber wrote:
>
> > > I think my goal here is to remove all non-ISO-8859-1 characters from the
> > > incoming cvs-all message.
> >
> > It makes more sense to clobber everything that isn't ASCII.
> >
> > chomp($line);
> > $line =~ tr/\x09\x20-\x7E/?/c;  # tab, printable ASCII

I wound up using this:

#
# look for characters other than newline, tab, and printable ASCII.
# this shows you the printable ones: perl -le 'print map chr,0x20..0x7e'
#
if ($message =~ /[^\x0a\x09\x20-\x7E]/) {
    # we have messy characters in there; replace each one with '?'
    $message =~ tr/\x0a\x09\x20-\x7E/?/c;
    $EncodingLosses = 'true';
}

> > Putting a replacement character such as '?' or '#' there is probably less
> > confusing than outright deleting the offending bytes.
>
> Good point.  That will ease the manual fix-up process too.
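
For anyone who wants to try the replacement outside of the commit-processing script, here is a minimal standalone sketch. The sub name sanitize_message and the sample input are made up for illustration and are not part of the actual FreshPorts code. It relies on tr returning the number of characters it replaced, so the separate regex test is not strictly needed to detect losses.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the FreshPorts code): replace every byte
# that is not newline, tab, or printable ASCII with '?', and return the
# cleaned text together with the number of bytes replaced.
sub sanitize_message {
    my ($message) = @_;

    # /c complements the search list, so everything NOT listed is mapped
    # to '?'. tr returns the count of characters it replaced.
    my $losses = ($message =~ tr/\x0a\x09\x20-\x7E/?/c);

    return ($message, $losses);
}

# Made-up sample: "\xE9" is a single ISO-8859-1 byte (e with acute accent).
my ($clean, $losses) = sanitize_message("Fix caf\xE9 port\n\tdetails\n");

print $clean;
print "encoding losses: $losses\n";
```

A non-zero count plays the same role as setting $EncodingLosses above, and it also tells you how many bytes will need manual fix-up.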

For those interested, please view http://test.freshports.org/devel/cvsweb/
and look at the commit for 10 Apr 2002 12:58:54.  The encoding loss is
indicated by the red graphic you see under the date.  The message in
question has not been manually amended.

My thanks to those that helped understand and solve the problem.
-- 
Dan Langille
The FreeBSD Diary - http://freebsddiary.org/ - practical examples