From: Gabor Kovesdan
Date: Tue, 24 Jun 2008 21:06:10 +0200
To: Tim Kientzle
Cc: Perforce Change Reviews
Subject: Re: PERFORCE change 144026 for review

Tim Kientzle wrote:
> Gabor,
>
> Unrelated, but I noticed that you have an unchecked
> call to mbstowcs() here.  mbstowcs() can fail; I
> recently went through a couple months of pain reworking
> chunks of libarchive to correctly handle such failures.
> I ended up falling back on mbtowc() to convert one
> character at a time.

Thanks Tim, I've forgotten about this check here, but yes, I understand its importance.

>
> You'll see conversion failures, for example, if
> someone is using a multi-character locale such
> as UTF-8 and runs grep over a file encoded in ISO-8859-1.
> (People often use "grep -R /usr/src" for example,
> and a lot of C source files have people's names
> in ISO-8859-1.)
>
> Throwing out the entire file (or even entire line)
> because of a single character that can't be
> interpreted is probably not going to be feasible.

I've tried it with LC_ALL=hu_HU.UTF-8 on ISO8859-[12] files and it still works. But to be sure, I'll just return 0 instead of calling err. Do you think it's ok, or do we need something special?

Gabor
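
For reference, the per-character fallback Tim describes could look roughly like the sketch below. It is only an illustration: mbs_to_wcs_lossy() is a made-up name, not something from the grep or libarchive sources, and replacing an undecodable byte with L'?' is just one assumed answer to the "something special" question.

```c
#include <assert.h>
#include <stdlib.h>   /* mbtowc() */
#include <string.h>   /* strlen() */
#include <wchar.h>    /* wchar_t */

/*
 * Hypothetical helper (assumption, not taken from grep or libarchive).
 * Converts src to wide characters one at a time with mbtowc().
 * A byte that cannot be decoded in the current locale is replaced
 * by L'?' and skipped, so a single bad byte does not discard the line.
 * Writes at most dstlen - 1 characters plus a terminating L'\0' and
 * returns the number of wide characters written.
 */
size_t
mbs_to_wcs_lossy(wchar_t *dst, size_t dstlen, const char *src)
{
    size_t srclen, n = 0;
    int r;

    assert(dst != NULL && src != NULL && dstlen > 0);
    srclen = strlen(src);

    /* Reset mbtowc()'s internal conversion state. */
    (void)mbtowc(NULL, NULL, 0);

    while (srclen > 0 && n + 1 < dstlen) {
        r = mbtowc(&dst[n], src, srclen);
        if (r < 0) {
            /* Invalid or incomplete sequence: substitute, skip one byte. */
            dst[n] = L'?';
            r = 1;
            (void)mbtowc(NULL, NULL, 0);
        } else if (r == 0) {
            /* Defensive: a decoded NUL ends the string. */
            break;
        }
        n++;
        src += r;
        srclen -= (size_t)r;
    }
    dst[n] = L'\0';
    return (n);
}
```

A caller would still need setlocale(LC_CTYPE, "") at startup for the user's locale to take effect; without it the conversion runs in the "C" locale.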