From: Rich Felker
Date: Wed, 29 Mar 2006 11:38:58 -0500
To: bugbusters@freebsd.org
Subject: bug in libc/locale/utf8.c: typo->not quite one-to-one
Message-ID: <20060329163858.GA32369@brightrain.aerifal.cx>

i'm not a freebsd user and haven't run any tests on this, but while reading the freebsd libc sources i found what appears to be a typo in utf8.c's mbrtowc implementation. for 6-byte sequences, the top 6 bits of the lead byte are compared against 111111 rather than the top 7 bits being compared against 1111110. as a result, the illegal bytes 0xfe and 0xff are treated the same as the legal lead bytes 0xfc and 0xfd.

the fix is (sorry for my lame inline handmade patch):

-	} else if ((ch & 0xfc) == 0xfc) {
+	} else if ((ch & 0xfe) == 0xfc) {

rich
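the difference between the two masks can be checked in isolation. what follows is only a rough sketch, not the actual utf8.c code: the function names are made up for illustration, and each test is applied on its own rather than inside the else-if chain of the real mbrtowc, where earlier branches have already handled the shorter sequences.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: the test from the '-' line of the patch.
 * The mask 0xfc keeps only the top 6 bits, so the low 2 bits are ignored. */
static bool six_byte_lead_original(unsigned char ch)
{
    return (ch & 0xfc) == 0xfc;
}

/* Hypothetical helper: the test from the '+' line of the patch.
 * The mask 0xfe keeps the top 7 bits, which must equal 1111110. */
static bool six_byte_lead_patched(unsigned char ch)
{
    return (ch & 0xfe) == 0xfc;
}

/* Count the byte values on which the two tests disagree. */
static int count_disagreements(void)
{
    int n = 0;
    for (int b = 0; b <= 0xff; b++) {
        unsigned char ch = (unsigned char)b;
        if (six_byte_lead_original(ch) != six_byte_lead_patched(ch))
            n++;
    }
    return n;
}

/* Self-check of the behaviour described in the message. */
static void check_six_byte_lead(void)
{
    /* both tests accept the legal lead bytes 0xfc and 0xfd */
    assert(six_byte_lead_original(0xfc) && six_byte_lead_patched(0xfc));
    assert(six_byte_lead_original(0xfd) && six_byte_lead_patched(0xfd));
    /* only the original test accepts 0xfe and 0xff */
    assert(six_byte_lead_original(0xfe) && !six_byte_lead_patched(0xfe));
    assert(six_byte_lead_original(0xff) && !six_byte_lead_patched(0xff));
    /* 0xfe and 0xff are the only two bytes that differ */
    assert(count_disagreements() == 2);
}
```

calling check_six_byte_lead() from a test program should pass silently if the reasoning above holds: the two tests agree on every byte except 0xfe and 0xff.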