Date:      Thu, 31 Jul 2014 14:00:41 -0700
From:      Jordan Hubbard <jkh@turbofuzz.com>
To:        John-Mark Gurney <jmg@funkthat.com>
Cc:        sjg@freebsd.org, arch@freebsd.org, Poul-Henning Kamp <phk@phk.freebsd.dk>, marcel@freebsd.org, Phil Shafer <phil@juniper.net>
Subject:   Re: XML Output: libxo - provide single API to output TXT, XML, JSON and HTML
Message-ID:  <07AA5159-16A2-45D4-890D-4FB794F99B79@turbofuzz.com>
In-Reply-To: <20140731205054.GT43962@funkthat.com>
References:  <94841.1406796243@critter.freebsd.dk> <201407311531.s6VFVbJn094888@idle.juniper.net> <20140731205054.GT43962@funkthat.com>


> On Jul 31, 2014, at 1:50 PM, John-Mark Gurney <jmg@funkthat.com> wrote:
> 
>> And moving toward UTF-8 won't be simple.  I just tossed a couple
> 
> If we don't start, we won't ever move forward...

Amen.  If you look at $LANG on an OS X box, it’s been set to en_US.UTF-8 for a while, and that wasn’t without pain.  We had to deal with performance regressions (8x hit to grep(1) over ISO-Latin1!) and various other interoperability problems, but it was ultimately like ripping off a band-aid - better done quickly than slowly, and once done, the constant trickle of I18N bugs more or less shut off entirely.

To put it another way, if FreeBSD doesn’t do it, its downstream vendors will have to.  Everything from file names to filesystem names needs to be UTF-8 just so they display properly and Japanese users can put their files on /mnt/ホーム/私の会社/.  Downstream here in FreeNAS-land, we get I18N requests all the time, and it’s a real PITA when the various technologies are all *mostly* able to speak UTF-8 but there are routines here and there that just don’t.  Making everything work from a userland agent like Samba or Netatalk all the way down the stack is painful, but we still have to do it because it’s not an all-English-all-the-time world we live in.
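
As a rough illustration of the "mostly UTF-8" problem, here is a minimal sketch in Python that walks a tree and flags names whose raw bytes are not valid UTF-8.  The function names (is_valid_utf8, non_utf8_names) are made up for this example, and it assumes the filesystem hands back raw bytes without normalizing them; it is not how FreeNAS, Samba or Netatalk actually check anything.

```python
import os


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw file-name bytes decode as strict UTF-8."""
    try:
        name.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def non_utf8_names(root: bytes):
    """Yield raw paths under root whose last component is not valid UTF-8.

    Passing a bytes root makes os.walk return bytes names, so nothing is
    decoded (or silently mangled) before we get to look at it.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if not is_valid_utf8(name):
                yield os.path.join(dirpath, name)
```

Something like `for path in non_utf8_names(b"/mnt"): print(path)` would then list the stragglers, e.g. names written by a legacy Shift_JIS or ISO-Latin1 client.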

I’d also not even worry about wide characters.  They are a historical artifact and not the direction everyone is going in.  UTF-8 FTW.

- Jordan



