From owner-freebsd-stable@FreeBSD.ORG Wed Mar 12 09:51:38 2014
Message-ID: <5320297F.1080400@ze.tum.de>
Date: Wed, 12 Mar 2014 10:31:43 +0100
From: Gerhard Schmidt
To: stable@freebsd.org
Subject: UTF-8 Sorting

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

I have a problem with FreeBSD, UTF-8 and sorting.

For example, take a file with the following content:

Meier
Müller
Öger
Ofner
Schmidt

When I set my terminal to ISO-8859-1 encoding and call sort on this
file, I get the following output:

Meier
Müller
Ofner
Öger
Schmidt

This is correctly sorted.

When I change my terminal to UTF-8 encoding, convert the file to UTF-8
and call sort again, I get the following output:

Meier
Müller
Ofner
Schmidt
Öger

which is wrong.

The problem seems to be that the LC_COLLATE file in the de_DE.UTF-8
locale is linked to ../la_LN.US-ASCII/LC_COLLATE (as are all LC_COLLATE
files in any UTF-8 locale).

After some research I found a mail from Kuba Lida from December 2008
(yes, that's five years ago) describing the same problem, which got no
response.

Why isn't there a UTF-8 LC_COLLATE file for any language? Kuba Lida
believed there was a problem with multibyte collate files in FreeBSD.
Is this true, and are there plans to fix this problem?

The same test under Linux works without problems.

Regards
   Estartu

- -- 
- ---------------------------------------------------------------------------
 Gerhard Schmidt       | http://www.augusta.de/~estartu |
 Fischbachweg 3        |                                |  PGP Public Key
 86856 Hiltenfingen    | JabberID: estartu@augusta.de   |  on request
 Germany               |                                |

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQCVAwUBUyApfwzx22nOTJQRAQJIbgP+MMSPepEsyG8Kx+QRDGJlfyQKK+r98/e+
ZiNPRMNjBpT7qrElJLvYfAuix3pOyqL3mq1DQJvZmqQxfoxEdy6GUf42i1Yk5gEX
T05YtaeVRoXK/TetFt0UEcC3bXuXheu63aBpO4FU2v8CPTAyBwU6DUvV/v3AzXr6
j+mwws5n7so=
=J2tH
-----END PGP SIGNATURE-----
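
To illustrate the two orderings from the message, here is a minimal Python sketch. It is not what sort(1) or the FreeBSD locale code actually does. The function names (codepoint_order, base_letter_key, dictionary_order) are made up for this example, and the "dictionary" key is a deliberately simplified assumption (strip combining marks, then compare), not a real German collation table. It only shows that comparing by code point, which appears to be the effect of a US-ASCII collation table on UTF-8 data, reproduces the wrong output, while comparing on base letters reproduces the expected one.

```python
import unicodedata

# The five names from the example file in the message.
names = ["Meier", "Müller", "Öger", "Ofner", "Schmidt"]


def codepoint_order(words):
    """Sort by raw code point (same order as bytewise UTF-8 comparison)."""
    return sorted(words)


def base_letter_key(word):
    """Hypothetical simplified collation key (an assumption, not a real locale).

    Decompose each character (NFD), drop combining marks so that
    'Ö' compares like 'O', and case-fold. The original word is kept
    as a tie-breaker so the order stays deterministic.
    """
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (stripped.casefold(), word)


def dictionary_order(words):
    """Sort using the simplified base-letter key above."""
    return sorted(words, key=base_letter_key)


if __name__ == "__main__":
    print(codepoint_order(names))
    # ['Meier', 'Müller', 'Ofner', 'Schmidt', 'Öger']
    print(dictionary_order(names))
    # ['Meier', 'Müller', 'Ofner', 'Öger', 'Schmidt']
```

On a system that ships a real collation table for de_DE.UTF-8, the usual route would be the locale's own comparison (for example strcoll(3)) rather than a hand-written key like this one.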