From: Sean Bruno
To: Andrey Chernov
Cc: Gleb Smirnoff, i18n@freebsd.org, "freebsd-current@freebsd.org"
Subject: Re: login.conf --> UTF-8
Date: Fri, 04 Apr 2014 19:39:13 -0700
Reply-To: sbruno@freebsd.org
Message-ID: <1396665553.2415.0.camel@powernoodle.corp.yahoo.com>
In-Reply-To: <533F5DF5.9020803@freebsd.org>
List-Id: FreeBSD Internationalization Effort

On Sat, 2014-04-05 at 05:35 +0400, Andrey Chernov wrote:
> On 04.04.2014 16:46, Gleb Smirnoff wrote:
> > On Thu, Apr 03, 2014 at 01:34:33AM +0400, Andrey Chernov wrote:
> > A> On 02.04.2014 21:15, Gleb Smirnoff wrote:
> > A> > S> + :lang=en_US.UTF-8:\
> > A> > S> + :charset=UTF-8:
> > A> >
> > A> > And I'd like to do same change for the 'russian' login class
> > A> > in /etc/login.conf.
> > A>
> > A> Please everybody remember that we don't have UTF-8 collation
> > A> implemented, just fallback to bytecode comparison.
> >
> > Any objections on checking in FreeBSD-compatible[1] UTF-8 collation
> > implementation from Alex Tutubalin?
> >
> > http://blog.lexa.ru/2008/03/03/freebsd_utf8_russian_collate_vtoraja_popitka.html
> >
>
> Even his "version 2" have my objections. I already reply Alex about this
> in 2008. In short:
> 1) It is error there: almost all single chars above ASCII should be
> "chains", i.t. two bytes minimum, since there almost no intersections
> with ISO8859-1 as UTF-8 subset.
> 2) The table itself is very incomplete, f.e. not covering either whole
> KOI8-R, nor ISO8859-5, nor CP866. It is made from CP1251 with all its
> restrictions. So, switching from f.e. KOI8-R to UTF-8 will cause sorting
> regression. Russian UTF-8 collation should be able to sort all major
> Russian charsets mentioned, i.e. we need combined table.
> 3) "charmap map.ISO8859-1" declaration is missing (needed mainly for
> using pure ASCII chars mnemonic names).
>
> Even in case above mentioned errors will be removed and the code will be
> committed afterwards, we should understand that this way (implementing
> multibyte collation via single byte one) even while being possible is a
> big hack and slowing sorting down up to 10 times.
>
> Proper "Unicode collation algorithm" is already implemented by ICU and
> other projects. See
> http://unicode.org/reports/tr10/
> It will be better if someone adopt it instead of hacks.
>

If you have a different patch, I'd appreciate seeing it.

Sean
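
For anyone reading this thread in the archive, here is a minimal sketch of what "fallback to bytecode comparison" can look like for Russian text. It is not taken from any patch in this thread; the helper name `bytewise_sorted` and the sample words are made up for illustration, and it assumes that a locale with no collation table compares strings byte by byte.

```python
# Hypothetical illustration: sort strings by their raw UTF-8 bytes,
# which is roughly what a byte-by-byte comparison does when no
# collation table exists for the locale.

def bytewise_sorted(words):
    """Return the words ordered by their UTF-8 byte sequences."""
    return sorted(words, key=lambda w: w.encode("utf-8"))

# Sample words (assumption, chosen to show the problem with the letter "ё").
words = ["ёж", "яма", "ель", "жук"]
result = bytewise_sorted(words)

# "ё" is U+0451 (bytes D1 91), which is greater than "я" U+044F (bytes D1 8F),
# so the "ё" word lands at the end instead of right after the "е" word.
print(result)
```

A Russian dictionary would list these words as ель, ёж, жук, яма. Byte order puts ёж last, which is the kind of ordering a real collation table (or an implementation of the Unicode collation algorithm) is needed to fix.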