Date:      Tue, 24 Jun 2008 21:06:10 +0200
From:      Gabor Kovesdan <gabor@FreeBSD.org>
To:        Tim Kientzle <kientzle@freebsd.org>
Cc:        Perforce Change Reviews <perforce@freebsd.org>
Subject:   Re: PERFORCE change 144026 for review
Message-ID:  <486145A2.8090409@FreeBSD.org>
In-Reply-To: <48612133.5060309@freebsd.org>
References:  <200806241616.m5OGGCEr096087@repoman.freebsd.org> <48612133.5060309@freebsd.org>

Tim Kientzle wrote:
> Gabor,
>
> Unrelated, but I noticed that you have an unchecked
> call to mbstowcs() here.  mbstowcs() can fail; I
> recently went through a couple months of pain reworking
> chunks of libarchive to correctly handle such failures.
> I ended up falling back on mbtowc() to convert one
> character at a time.
Thanks, Tim. I forgot the check here, but yes, I understand 
its importance.
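
A minimal sketch of what such a check could look like. The helper name
convert_checked() is hypothetical and is not taken from the change under
review; the only thing it relies on is that mbstowcs() returns (size_t)-1
when it meets an invalid multibyte sequence.

```c
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical helper: convert a multibyte string to wide characters and
 * report failure instead of ignoring it.
 * Returns 0 on success, -1 if dst has no room or src cannot be converted.
 */
static int
convert_checked(const char *src, wchar_t *dst, size_t dstlen)
{
	size_t n;

	if (dstlen == 0)
		return (-1);

	/* Leave one slot free so the result can always be terminated. */
	n = mbstowcs(dst, src, dstlen - 1);
	if (n == (size_t)-1)
		return (-1);	/* invalid multibyte sequence in src */

	dst[n] = L'\0';
	return (0);
}
```

The caller then decides what a failure means (skip, substitute, or report),
rather than continuing with an unspecified buffer.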
>
> You'll see conversion failures, for example, if
> someone is using a multibyte locale such
> as UTF-8 and runs grep over a file encoded in ISO-8859-1.
> (People often use "grep -R <symbol> /usr/src" for example,
> and a lot of C source files have people's names
> in ISO-8859-1.)
>
> Throwing out the entire file (or even entire line)
> because of a single character that can't be
> interpreted is probably not going to be feasible.
I've tried it with LC_ALL=hu_HU.UTF-8 on ISO-8859-1 and ISO-8859-2 files 
and it still works. But to be safe, I'll just return 0 on a conversion 
failure instead of calling err(). Do you think that's OK, or do we need 
something special?
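
For reference, the one-character-at-a-time fallback Tim describes could
look roughly like the sketch below. It is only an illustration: the helper
name convert_lenient() and the choice of L'?' as a substitute for
undecodable bytes are assumptions, not what libarchive or grep actually do.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: convert src one character at a time with mbtowc().
 * A byte that cannot be decoded in the current locale is replaced with
 * L'?' and skipped, so one bad character does not discard the whole line.
 * Returns the number of wide characters written, not counting the
 * terminating L'\0'. Output is truncated if dst is too small.
 */
static size_t
convert_lenient(const char *src, wchar_t *dst, size_t dstlen)
{
	size_t left = strlen(src);
	size_t out = 0;
	int n;

	if (dstlen == 0)
		return (0);

	(void)mbtowc(NULL, NULL, 0);	/* reset the conversion state */
	while (left > 0 && out + 1 < dstlen) {
		n = mbtowc(&dst[out], src, left);
		if (n <= 0) {
			/* Invalid or incomplete sequence: substitute one byte. */
			(void)mbtowc(NULL, NULL, 0);
			dst[out] = L'?';
			n = 1;
		}
		src += n;
		left -= (size_t)n;
		out++;
	}
	dst[out] = L'\0';
	return (out);
}
```

Which bytes count as invalid depends on the locale in effect, so the same
ISO-8859-1 input may convert cleanly in one environment and hit the
substitution path in another.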

Gabor


