From: Jordan Hubbard
Date: Thu, 31 Jul 2014 14:00:41 -0700
Subject: Re: XML Output: libxo - provide single API to output TXT, XML, JSON and HTML
To: John-Mark Gurney
Cc: sjg@freebsd.org, arch@freebsd.org, Poul-Henning Kamp, marcel@freebsd.org, Phil Shafer
Message-Id: <07AA5159-16A2-45D4-890D-4FB794F99B79@turbofuzz.com>
In-Reply-To: <20140731205054.GT43962@funkthat.com>
References: <94841.1406796243@critter.freebsd.dk> <201407311531.s6VFVbJn094888@idle.juniper.net> <20140731205054.GT43962@funkthat.com>
List-Id: Discussion related to FreeBSD architecture

> On Jul 31, 2014, at 1:50 PM, John-Mark Gurney wrote:
>
>> And moving toward UTF-8 won't be simple. I just tossed a couple
>
> If we don't start, we won't ever move forward...

Amen.

If you look at $LANG on an OS X box, it’s been set to en_US.UTF-8 for a while, and that wasn’t without pain. We had to deal with various performance regressions (8x hit to grep(1) over ISO-Latin1!) and various other interoperability problems, but it was ultimately like ripping off a band-aid - better done quickly than slowly, and once done, the constant trickle of I18N bugs more or less shut off entirely.

To put it another way, if FreeBSD doesn’t do it, its downstream vendors will have to. Everything from file names to filesystem names needs to be UTF-8 just so they display properly and Japanese users can put their files on /mnt/ホーム/私の会社/.

Downstream here in FreeNAS-land, we get I18N requests all the time, and it’s sure a PITA when the various technologies are all *mostly* able to speak UTF-8 but there are routines here and there that just don’t. Trying to make everything work from a userland agent like Samba or Netatalk all the way down the stack is painful, but we still have to do it because it’s not an all-English-all-the-time world we live in.

I’d also not even worry about wide characters. They are a historical artifact and not the direction everyone is going in. UTF-8 FTW.

- Jordan