From owner-p4-projects@FreeBSD.ORG Tue Jun 24 16:30:44 2008
Message-ID: <48612133.5060309@freebsd.org>
Date: Tue, 24 Jun 2008 09:30:43 -0700
From: Tim Kientzle
To: Gabor Kovesdan
Cc: Perforce Change Reviews
Subject: Re: PERFORCE change 144026 for review
References: <200806241616.m5OGGCEr096087@repoman.freebsd.org>
In-Reply-To: <200806241616.m5OGGCEr096087@repoman.freebsd.org>

Gabor,

Unrelated, but I noticed that you have an unchecked call to
mbstowcs() here. mbstowcs() can fail; I recently went through
a couple months of pain reworking chunks of libarchive to
correctly handle such failures. I ended up falling back on
mbtowc() to convert one character at a time.
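
A minimal sketch of that per-character fallback follows. It is an illustration only, not the libarchive code: the helper name mbs_to_wcs_lenient() is hypothetical, and copying an undecodable byte through as its unsigned value is an assumed policy (substituting '?' would be another reasonable choice).

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not from libarchive or the grep sources): convert
 * the multibyte string `src` to a newly allocated wide string, one
 * character at a time.  A byte that cannot be decoded in the current
 * locale is copied through as its unsigned value (an assumed policy)
 * instead of failing the whole conversion.  Returns NULL only if
 * malloc() fails; the number of wide characters produced is stored in
 * *outlen when outlen is not NULL.
 */
wchar_t *
mbs_to_wcs_lenient(const char *src, size_t *outlen)
{
	size_t len = strlen(src);
	size_t i = 0, n = 0;
	wchar_t *dst;
	int r;

	/* Each wide character consumes at least one byte, so len + 1 is enough. */
	dst = malloc((len + 1) * sizeof(wchar_t));
	if (dst == NULL)
		return (NULL);

	/* Reset any shift state left over from an earlier conversion. */
	(void)mbtowc(NULL, NULL, 0);

	while (i < len) {
		r = mbtowc(&dst[n], src + i, len - i);
		if (r <= 0) {
			/* Invalid or truncated sequence: keep the raw byte. */
			dst[n] = (wchar_t)(unsigned char)src[i];
			(void)mbtowc(NULL, NULL, 0);
			r = 1;
		}
		i += (size_t)r;
		n++;
	}
	dst[n] = L'\0';
	if (outlen != NULL)
		*outlen = n;
	return (dst);
}
```

The caller is assumed to have called setlocale(LC_CTYPE, "") beforehand and is responsible for free()ing the result. Because a bad byte costs only one output character, the rest of the line still gets converted.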

You'll see conversion failures, for example, if someone is
using a multibyte locale such as UTF-8 and runs grep over
a file encoded in ISO-8859-1. (People often use
"grep -R /usr/src" for example, and a lot of C source files
have people's names in ISO-8859-1.) Throwing out the entire
file (or even entire line) because of a single character that
can't be interpreted is probably not going to be feasible.

Tim

Gabor Kovesdan wrote:
> http://perforce.freebsd.org/chv.cgi?CH=144026
>
> Change 144026 by gabor@gabor_server on 2008/06/24 16:15:17
>
> 	- Cleanup: use grep_malloc instead of malloc
>
> Affected files ...
>
> .. //depot/projects/soc2008/gabor_textproc/grep/binary.c#10 edit
> .. //depot/projects/soc2008/gabor_textproc/grep/grep.c#42 edit
> .. //depot/projects/soc2008/gabor_textproc/grep/util.c#37 edit
>
> Differences ...
>
> ==== //depot/projects/soc2008/gabor_textproc/grep/binary.c#10 (text+ko) ====
>
> @@ -77,8 +77,7 @@
>  	if ((s = mbstowcs(NULL, f->base, 0)) == -1)
>  		return (0);
>
> -	if ((wbuf = malloc((s + 1) * sizeof(wchar_t))) == NULL)
> -		err(2, NULL);
> +	wbuf = grep_malloc((s + 1) * sizeof(wchar_t));
>
>  	mbstowcs(wbuf, f->base, s);