Date: Fri, 04 Apr 2014 19:39:13 -0700
From: Sean Bruno <sbruno@ignoranthack.me>
To: Andrey Chernov <ache@freebsd.org>
Cc: Gleb Smirnoff <glebius@FreeBSD.org>, i18n@freebsd.org, "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>
Subject: Re: login.conf --> UTF-8
Message-ID: <1396665553.2415.0.camel@powernoodle.corp.yahoo.com>
In-Reply-To: <533F5DF5.9020803@freebsd.org>
References: <1396457629.2280.2.camel@powernoodle.corp.yahoo.com>
 <20140402171546.GL44326@FreeBSD.org>
 <533C8269.7040305@freebsd.org>
 <20140404124634.GC44326@glebius.int.ru>
 <533F5DF5.9020803@freebsd.org>
On Sat, 2014-04-05 at 05:35 +0400, Andrey Chernov wrote:
> On 04.04.2014 16:46, Gleb Smirnoff wrote:
> > On Thu, Apr 03, 2014 at 01:34:33AM +0400, Andrey Chernov wrote:
> > A> On 02.04.2014 21:15, Gleb Smirnoff wrote:
> > A> > S> +       :lang=en_US.UTF-8:\
> > A> > S> +       :charset=UTF-8:
> > A> >
> > A> > And I'd like to do the same change for the 'russian' login class
> > A> > in /etc/login.conf.
> > A>
> > A> Please, everybody, remember that we don't have UTF-8 collation
> > A> implemented, just a fallback to byte-by-byte comparison.
> >
> > Any objections to checking in the FreeBSD-compatible[1] UTF-8 collation
> > implementation from Alex Tutubalin?
> >
> > http://blog.lexa.ru/2008/03/03/freebsd_utf8_russian_collate_vtoraja_popitka.html
> >
>
> I have objections even to his "version 2". I already replied to Alex
> about this in 2008. In short:
> 1) There is an error: almost all single chars above ASCII should be
> "chains", i.e. two bytes minimum, since there are almost no intersections
> with ISO8859-1 as a UTF-8 subset.
> 2) The table itself is very incomplete, e.g. it covers neither the whole
> of KOI8-R, nor ISO8859-5, nor CP866. It is made from CP1251 with all its
> restrictions. So switching from e.g. KOI8-R to UTF-8 will cause a sorting
> regression. Russian UTF-8 collation should be able to sort all the major
> Russian charsets mentioned, i.e. we need a combined table.
> 3) The "charmap map.ISO8859-1" declaration is missing (needed mainly for
> using the mnemonic names of pure ASCII chars).
>
> Even if the errors mentioned above are removed and the code is
> committed afterwards, we should understand that this approach
> (implementing multibyte collation via the single-byte one), while
> possible, is a big hack and slows sorting down by up to 10 times.
>
> A proper "Unicode collation algorithm" is already implemented by ICU and
> other projects. See
> http://unicode.org/reports/tr10/
> It would be better if someone adopted it instead of hacks.
>

If you have a different patch, I'd appreciate seeing it.

Sean
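To make the collation concern in the quoted text concrete, here is a minimal Python sketch of what a fallback to byte-by-byte comparison does to Russian words encoded as UTF-8. The word list, `ALPHABET`, `RANK` and `toy_key` are hypothetical names chosen for illustration; the alphabet-rank key is only a toy stand-in and is neither the Unicode Collation Algorithm nor the table from the linked blog post.

```python
# Sample words; "ё" is U+0451, which lies after "я" (U+044F) in code point order.
words = ["ёлка", "яблоко", "ель", "жук"]

# Byte-by-byte comparison of the UTF-8 encodings (the fallback behaviour).
# For UTF-8 this gives the same order as comparing code points.
by_bytes = sorted(words, key=lambda w: w.encode("utf-8"))

# Toy collation key (an assumption for illustration only): rank each letter
# by its position in the Russian alphabet, where "ё" follows "е".
ALPHABET = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"
RANK = {ch: i for i, ch in enumerate(ALPHABET)}

def toy_key(word):
    # Characters outside the alphabet sort after all Russian letters.
    return [RANK.get(ch, len(ALPHABET) + ord(ch)) for ch in word.lower()]

by_alphabet = sorted(words, key=toy_key)

print(by_bytes)      # byte order puts "ёлка" last
print(by_alphabet)   # alphabet order puts "ёлка" right after "ель"
```

Under byte comparison "ёлка" lands after "яблоко", whereas a Russian reader expects it directly after "ель". A real fix would need complete collation data of the kind described in the tr10 report rather than a per-letter rank table.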