Date:      Thu, 31 Jul 2014 14:00:41 -0700
From:      Jordan Hubbard <jkh@turbofuzz.com>
To:        John-Mark Gurney <jmg@funkthat.com>
Cc:        sjg@freebsd.org, arch@freebsd.org, Poul-Henning Kamp <phk@phk.freebsd.dk>, marcel@freebsd.org, Phil Shafer <phil@juniper.net>
Subject:   Re: XML Output: libxo - provide single API to output TXT, XML, JSON and HTML
Message-ID:  <07AA5159-16A2-45D4-890D-4FB794F99B79@turbofuzz.com>
In-Reply-To: <20140731205054.GT43962@funkthat.com>
References:  <94841.1406796243@critter.freebsd.dk> <201407311531.s6VFVbJn094888@idle.juniper.net> <20140731205054.GT43962@funkthat.com>


> On Jul 31, 2014, at 1:50 PM, John-Mark Gurney <jmg@funkthat.com> wrote:
>
>> And moving toward UTF-8 won't be simple.  I just tossed a couple
>
> If we don't start, we won't ever move forward...

Amen.  If you look at $LANG on an OS X box, it’s been set to
en_US.UTF-8 for a while, and that wasn’t without pain.  We had to deal
with various performance regressions (8x hit to grep(1) over
ISO-Latin1!) and various other interoperability problems, but it was
ultimately like ripping off a band-aid - better done quickly than
slowly, and once done, the constant trickle of I18N bugs more or less
shut off entirely.

To put it another way, if FreeBSD doesn’t do it, its downstream
vendors will have to.  Everything from file names to filesystem names
needs to be UTF-8 just so they display properly and Japanese users can
put their files on /mnt/ホーム/私の会社/.  Downstream here in
FreeNAS-land, we get I18N requests all the time, and it’s sure a PITA
when you have the various technologies all *mostly* able to speak
UTF-8 but there are routines here and there that just don’t.  Trying
to make everything work from a userland agent like Samba or Netatalk
all the way down the stack is painful, but we still have to do it
because it’s not an all-English-all-the-time world we live in.

I’d also not even worry about wide characters.  They are a historical
artifact and not the direction everyone is going in.  UTF-8 FTW.

- Jordan



