Date: Thu, 10 Jun 2004 12:17:52 +0200
From: Palle Girgensohn
To: Palle Girgensohn, Greg Lewis
Cc: freebsd-java@freebsd.org
Subject: Re: problems with java.util.zip and diacritical characters in file names
Message-ID: <84B75B389C49D6FF3ED95F29@rambutan.pingpong.net>
References: <5C024439534B293EAFE34A55@rambutan.pingpong.net> <20040609175626.GB83936@misty.eyesbeyond.com>
List-Id: Porting Java to FreeBSD

I've tried this on Linux, and it seems to behave the same way.

One problem is that Java converts the entry names to Unicode (jazzlib does
NOT do this; it seems to keep the name in a byte array instead of a
String). Another problem is that WinZip uses the cp850 character set
(! I thought this had been dead for ages...), so there really seems to be
no hope unless I hack up jazzlib and convert the file names somehow?

/Palle

--On Thursday, June 10, 2004 02:25:28 +0200 Palle Girgensohn wrote:

> Hi,
>
> Well, the problem is about character sets.
> A zip file seems to have no attribute telling which charset it uses
> for representing file names. Not very surprising.
>
> Java seems to handle this by reading file names correctly and
> converting them to Java Strings (in Unicode). But when fetching data,
> it uses the Unicode byte sequence to find and fetch the entry, and
> comes out empty-handed: getInputStream returns null. I know of no way
> to tell java.util.zip that it should use some other character set?
>
> Hexdumping the resulting zip file, it is obvious that it has used
> Unicode in the zip file when saving the file name entries. I'm not
> sure how WinZip would react, but I assume it will show them as latin1,
> i.e. ä -> Ã¤. While this is really bad for me, since there is no
> standard I'm not quite sure this is wrong?
>
> BTW, there is a plug-in pure Java implementation on SourceForge. It
> seems to result in the same file names on input and output.
>
> In (getName): z/
> Out (getName): z/
> In (getName): z/åäöÅÄÖ.txt
> Out (getName): z/åäöÅÄÖ.txt
> in is null
>
> With java.util.zip, in is null, and the file is renamed to the same
> thing but in Unicode, and is zero bytes in the zip file.
>
> With jazzlib, this seems to work: in is not null and the åäöÅÄÖ.txt
> file is not empty.
>
> I'm running this in a shell with
> $ echo $LC_ALL
> sv_SE.ISO8859-1
>
> Regards,
> Palle
>
> --On Wednesday, June 09, 2004 11.56.26 -0600 Greg Lewis wrote:
>
>> On Wed, Jun 09, 2004 at 05:37:27PM +0200, Palle Girgensohn wrote:
>>> java.util.zip cannot inflate a zip archive that contains eight-bit
>>> characters in file names, it simply crashes. I haven't been able to
>>> try it on other platforms yet, but I'd like to hear from others who
>>> might have seen this problem. Odd thing is there is no exception or
>>> anything; it just stops when the first such character comes up, and
>>> returns null.
>>>
>>> Anyone else seen this? Is it just FreeBSD?
>>
>> If you send a small test programme and zip I can quickly try it on
>> Linux to compare.
>>
>> --
>> Greg Lewis                          Email   : glewis@eyesbeyond.com
>> Eyes Beyond                         Web     : http://www.eyesbeyond.com
>> Information Technology              FreeBSD : glewis@FreeBSD.org
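
To illustrate the mismatch discussed in this thread, here is a minimal Java sketch of how the same file name becomes different byte sequences depending on the character set, and how UTF-8 bytes look when read back as latin1 (the "ä -> Ã¤" effect). It does not touch java.util.zip or jazzlib at all. The class name ZipNameCharsets and the helpers hex and reinterpret are made up for this example, and the cp850 line assumes the runtime ships the IBM850 charset, which not every JRE does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ZipNameCharsets {

    /** Formats bytes as lowercase hex pairs separated by spaces. */
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /** Encodes a name with one charset and decodes the resulting bytes with another. */
    public static String reinterpret(String name, Charset writtenAs, Charset readAs) {
        return new String(name.getBytes(writtenAs), readAs);
    }

    public static void main(String[] args) {
        // "ä", written as an escape so the source file encoding does not matter.
        String name = "\u00e4";

        // One character, different bytes on disk depending on the charset.
        System.out.println("ISO-8859-1: " + hex(name.getBytes(StandardCharsets.ISO_8859_1))); // e4
        System.out.println("UTF-8:      " + hex(name.getBytes(StandardCharsets.UTF_8)));      // c3 a4

        // cp850 is an optional charset, so only print it when the runtime has it.
        if (Charset.isSupported("IBM850")) {
            System.out.println("cp850:      " + hex(name.getBytes(Charset.forName("IBM850"))));
        }

        // A reader that assumes latin1 sees two characters where the writer stored one,
        // so a lookup by the original name no longer matches.
        String misread = reinterpret(name, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println("UTF-8 read as latin1: " + misread);
        System.out.println("same name? " + name.equals(misread)); // false
    }
}
```

Since the archive does not record which charset was used for the names, a writer and a reader that make different assumptions end up with names that no longer compare equal, which is consistent with the lookup coming back empty-handed above.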