From: Frank Bonnet
Date: Fri, 20 May 2011 13:12:03 +0200
To: Jeremy Chadwick
Cc: freebsd-apache@freebsd.org
Subject: Re: Where to define HTTP_ACCEPT_LANGUAGE=fr-fr ???

On 05/20/2011 12:37 PM, Jeremy Chadwick wrote:

stuff deleted

OK Jeremy,

thank you for your complete and good technical answer. I'm gonna check
all your recommendations, then let you know if it has worked.

Thanks again.
Frank

> here is the problem
> This looks like a character set issue of the browser vs. the filename on
> the server. Specifically: the browser is requesting to download a
> filename that's in utf-8 (Unicode), while what's on the actual server is
> a filename encoded in iso-8859-1.
>
> I'm also making the assumption the letter which shows up in your Email
> above is actually the "é" character (latin small letter e with an
> acute (raising) accent above it). I hope the below examples therefore
> render correctly for you.
>
> Let me explain the two differences:
>
> utf-8
> =======
> - Filename (visually): 11_EE_APP_FE_CV_CISSE_Kalissé.docx
> - Filename (literally): 11_EE_APP_FE_CV_CISSE_Kaliss<0xc3><0xa9>.docx
> - Filename (as URL): 11_EE_APP_FE_CV_CISSE_Kaliss%C3%A9.docx
>
> iso-8859-1
> ============
> - Filename (visually): 11_EE_APP_FE_CV_CISSE_Kalissé.docx
> - Filename (literally): 11_EE_APP_FE_CV_CISSE_Kaliss<0xe9>.docx
> - Filename (as URL): 11_EE_APP_FE_CV_CISSE_Kaliss%E9.docx
>
> URLs, per official RFC 1738, with regards to iso-8859-1, do not permit
> characters above 0x7f to make it into the URL. So, technically
> speaking, the URL of:
>
> http://somesite/11_EE_APP_FE_CV_CISSE_Kalissé.docx
>
> Should fail or not work.
> Some browsers may try and "be smart" and turn
> the accented small e character into %E9, which would then become:
>
> http://somesite/11_EE_APP_FE_CV_CISSE_Kaliss%E9.docx
>
> Which would work just fine.
>
> I'm not sure that HTTP_ACCEPT_LANGUAGE would fix this problem.
>
> If you have a CGI, PHP script, web software, etc. which is generating
> filenames and things like that, and is using utf-8 as its character set
> (meaning either via an HTTP header or via an HTML <meta> tag),
> then that's going to mess things up. You need to be using the
> iso-8859-1 character set instead. A good browser will be able to show
> you what character set the page shows up as.
>
> What's the alternative? Simple: you start using utf-8 in your
> filenames. I should note, however, that FreeBSD (including 8.2-STABLE)
> does not have very good Unicode support. It's hit-or-miss, and using
> things like LANG/LC_CTYPE results in some serious problems with utilities
> that rely on locale(7). So, I would be very careful going this route on
> FreeBSD.
>
> The short version is this: if you're going to use utf-8, you need to use
> it absolutely 100% of the time. You cannot reliably mix-match character
> sets like that.
>
> Hope this helps.
>
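
For anyone following the thread, the utf-8 vs. iso-8859-1 difference
described in the quoted explanation can be reproduced with a minimal
Python sketch. It is only an illustration: the helper name as_url is
hypothetical, and the sketch assumes the standard urllib.parse module
rather than anything Apache or the browser actually runs.

```python
from urllib.parse import quote, unquote

# Filename from the thread; the accented letter is written as an escape
# (U+00E9, latin small letter e with acute) so this file stays pure ASCII.
FILENAME = "11_EE_APP_FE_CV_CISSE_Kaliss\u00e9.docx"


def as_url(name: str, charset: str) -> str:
    """Hypothetical helper: percent-encode the bytes of `name` in `charset`."""
    return quote(name.encode(charset))


utf8_url = as_url(FILENAME, "utf-8")         # what a utf-8 client requests
latin1_url = as_url(FILENAME, "iso-8859-1")  # what matches a latin-1 filename

print(utf8_url)
print(latin1_url)

# Reading the utf-8 request as iso-8859-1 yields two characters where the
# filename on disk has one, so the two names cannot match.
mismatch = unquote(utf8_url, encoding="iso-8859-1")
print(len(FILENAME), len(mismatch))
```

The two printed URLs differ only in the escape for the accented letter
(%C3%A9 vs. %E9), which is why a request built in one character set does
not find a file named in the other.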