Date:      Tue, 12 Jun 2001 13:35:32 +0400
From:      "Andrey A. Chernov" <ache@nagual.pp.ru>
To:        Joerg Wunsch <joerg_wunsch@uriah.heep.sax.de>
Cc:        freebsd-current@FreeBSD.ORG, i18n@FreeBSD.ORG
Subject:   Re: HEADS UP: locale names reorganization
Message-ID:  <20010612133532.B56905@nagual.pp.ru>
In-Reply-To: <20010612073257.B2752@uriah.heep.sax.de>; from j@uriah.heep.sax.de on Tue, Jun 12, 2001 at 07:32:57AM +0200
References:  <20010610163853.A1166@nagual.pp.ru> <200106101537.f5AFavo33433@mail.uic-in.net> <200106102136.f5ALawu94200@uriah.heep.sax.de> <20010611020547.A1379@nagual.pp.ru> <20010611082525.F94133@uriah.heep.sax.de> <20010611160108.A34164@nagual.pp.ru> <20010611212016.K94133@uriah.heep.sax.de> <20010611233423.A48057@nagual.pp.ru> <20010611235223.A48405@nagual.pp.ru> <20010612073257.B2752@uriah.heep.sax.de>

On Tue, Jun 12, 2001 at 07:32:57 +0200, Joerg Wunsch wrote:

> I thought of two components, like ru_RU, en_US, or de_DE.  AFAICT,
> that's the common practice i've seen in the other Unices.

Let's switch to the i18n list; I've added it to Cc: here for the first time to notify -current readers.

> > But then we need to rewrite all old programs which parse LANG
> > directly to use nl_langinfo(CODESET).
> 
> I didn't know that programs parse the name of the locale directly, i
> always thought they just use the contents of files pointed to by this
> name (under /usr/share/locale/), i. e. the actual locale name would be
> opaque to the application.

> Do you have an example of a program parsing the name?  I can't imagine
> right now who does it and why...

I can't remember many of them at the moment, but here are a few examples:

Look at the libreadline code in contrib, for example. The current readline
version does not parse the locale name if setlocale() is present, but the
old one does. Readline uses it to determine whether the codeset is 8-bit;
I have seen other uses too.
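
To make the problem concrete, here is a minimal sketch of the kind of
name parsing I mean. It is not readline's actual code; the function name
codeset_from_name() is hypothetical. It takes whatever follows the "." in
the locale name, up to an optional "@modifier", so with a short name like
"ru_RU" it has nothing to work with:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the codeset part of a locale name such as
 * "ru_RU.KOI8-R" or "de_DE.ISO8859-1@euro" into buf.
 * Returns 0 on success, -1 if the name carries no codeset part or buf
 * is too small to hold it.
 */
int
codeset_from_name(const char *name, char *buf, size_t bufsz)
{
	const char *dot, *end;
	size_t len;

	assert(name != NULL && buf != NULL);

	dot = strchr(name, '.');
	if (dot == NULL)
		return (-1);		/* short name, e.g. "ru_RU" */
	dot++;				/* skip the '.' itself */

	end = strchr(dot, '@');		/* optional modifier follows '@' */
	len = (end != NULL) ? (size_t)(end - dot) : strlen(dot);
	if (len == 0 || len >= bufsz)
		return (-1);

	memcpy(buf, dot, len);
	buf[len] = '\0';
	return (0);
}
```

A caller would pass getenv("LANG") as the name; the -1 return for
"ru_RU" is exactly the breakage short names would cause.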

Moreover, all perl scripts that parse LANG are automatically affected (i.e.
they can't get the codeset), because perl5 has no nl_langinfo() at all.

Example: catman (ours)

Moreover, all shell scripts that parse LANG are affected in the same way.

Example: neqn, nroff (3rd party)

Since there is no built-in nl_langinfo() in shell or perl, an external
binary utility must be used to get the codeset; commonly it is the call
	locale charmap
We don't have this utility, so I object to short names until it is
present (perhaps in a reduced form that accepts only the charmap argument,
as a call to nl_langinfo(CODESET)).
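
The core of such a reduced utility could look roughly like this. It is
only a sketch under my own assumptions: the function name print_charmap()
is hypothetical, and a real utility would wrap it in a main() that checks
for the "charmap" argument.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>

/*
 * Hypothetical core of a reduced "locale charmap": print the codeset
 * of the locale selected by the environment, followed by a newline.
 * Returns 0 on success, -1 on failure.
 */
int
print_charmap(FILE *out)
{
	const char *codeset;

	assert(out != NULL);

	/*
	 * Pick up LC_ALL / LC_CTYPE / LANG from the environment.  If this
	 * fails, the default "C" locale simply stays in effect.
	 */
	(void)setlocale(LC_CTYPE, "");

	codeset = nl_langinfo(CODESET);
	if (codeset == NULL || *codeset == '\0')
		return (-1);

	return (fprintf(out, "%s\n", codeset) < 0 ? -1 : 0);
}
```

Shell and perl scripts could then capture the output of the utility
instead of cutting the codeset out of LANG themselves.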

> Apart from that, those old applications would fall over the locale
> name change anyway (e. g. they expect "de_DE.ISO_8859-1" but find
> "de_DE.ISO8859-1" which they are not prepared to handle), so that's
> really a good reason to introduce the lang_TERRITORY shorthands by the
> same time (i'm speaking of -current only!).

Most parsing code compacts the codeset first, i.e. removes all "_" and
"-" characters from it and lowercases it, then compares.
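
As an illustration of that compaction (a sketch, not taken from any
particular program; compact_codeset() and same_codeset() are hypothetical
names), "ISO_8859-1" and "ISO8859-1" both reduce to "iso88591" and so
compare equal:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: write a compacted copy of a codeset name to out,
 * dropping '_' and '-' and lowercasing everything else.  Output longer
 * than outsz - 1 characters is silently truncated.
 */
void
compact_codeset(const char *in, char *out, size_t outsz)
{
	size_t n = 0;

	assert(in != NULL && out != NULL && outsz > 0);

	for (; *in != '\0' && n + 1 < outsz; in++) {
		if (*in == '_' || *in == '-')
			continue;
		out[n++] = (char)tolower((unsigned char)*in);
	}
	out[n] = '\0';
}

/* Return nonzero if two codeset names are equal after compaction. */
int
same_codeset(const char *a, const char *b)
{
	char ca[64], cb[64];

	compact_codeset(a, ca, sizeof(ca));
	compact_codeset(b, cb, sizeof(cb));
	return (strcmp(ca, cb) == 0);
}
```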

-- 
Andrey A. Chernov
http://ache.pp.ru/

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message