Date:      Tue, 28 Apr 2009 17:40:37 +0530
From:      Prashant Vaibhav <pvaibhav@freebsd.org>
To:        Gabor Kovesdan <gabor@freebsd.org>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: SoC 2009: BSD-licensed libiconv in base system
Message-ID:  <66b068eb0904280510g7a1e50dfm455d96fd49c6eae@mail.gmail.com>
In-Reply-To: <49F6C7A1.6070708@FreeBSD.org>
References:  <aa9f273a8313c6436e76fa9f5d587ef4.squirrel@webmail.kovesdan.org> <20090427183836.GA10793@zim.MIT.EDU> <49F5FE45.2090101@freebsd.org> <20090427193326.GA7654@britannica.bec.de> <20090427194904.GA11137@zim.MIT.EDU> <49F6C7A1.6070708@FreeBSD.org>

>
> My ex-girlfriend is working in Nepal [...] Even this country's encoding is
> supported.


Probably because the Nepali language doesn't have a separate script; it uses
Devanagari!  :-)
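
To make that concrete, here is a minimal Python sketch (my own illustration,
not anything from Citrus or the proposed libiconv). It assumes the word
"Nepali" spelled in Devanagari as a sample string, checks that every code
point falls in the Unicode Devanagari block (U+0900 to U+097F), and shows
that the UCS-4 (UTF-32) form is just those code points as 4-byte units. The
variable names are made up for the example.

```python
# Hypothetical sample: the word "Nepali" in Devanagari, written with escapes
# so the exact code points are unambiguous.
sample = "\u0928\u0947\u092a\u093e\u0932\u0940"

# Bounds of the Unicode Devanagari block.
DEVANAGARI_FIRST, DEVANAGARI_LAST = 0x0900, 0x097F

# Every character of the sample should sit inside that block.
code_points = [ord(ch) for ch in sample]
in_block = all(DEVANAGARI_FIRST <= cp <= DEVANAGARI_LAST for cp in code_points)

# UCS-4 / UTF-32 (big-endian, no BOM): one 4-byte unit per code point.
ucs4 = sample.encode("utf-32-be")
units = [int.from_bytes(ucs4[i:i + 4], "big") for i in range(0, len(ucs4), 4)]

print(in_block)
print([f"U+{cp:04X}" for cp in units])
```

So a UCS-4 based converter needs no Nepal-specific handling here; support for
the script comes along with Devanagari.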




On Tue, Apr 28, 2009 at 2:38 PM, Gabor Kovesdan <gabor@freebsd.org> wrote:

> David Schultz wrote:
>
>> On Mon, Apr 27, 2009, Joerg Sonnenberger wrote:
>>
>>
>>> On Mon, Apr 27, 2009 at 11:49:41AM -0700, Tim Kientzle wrote:
>>>
>>>
>>>> David Schultz wrote:
>>>>
>>>>
>>>>> ... whether it would make more sense to standardize on something like
>>>>> UCS-4 for the internal representation.
>>>>>
>>>>>
>>>> YES.  Without this, wchar_t is useless.
>>>>
>>>>
>>> I strongly disagree. "Everything can be represented as UCS-4" is a bad
>>> assumption, but one that Americans and Europeans naturally don't have
>>> to care about.
>>>
>>>
>>
>> ...but isn't this moot at present because there are no
>> widely-accepted encodings that include characters that
>> aren't supported by UCS-4? Citrus doesn't seem to support
>> any such encodings in any case.
>>
>>
> Citrus is based on UCS-4 as an internal encoding, just like the other
> BSD-licensed iconv library. This is a barrier to supporting encodings that
> aren't supported by UCS-4.
>
>> If this ever really becomes an issue, we could always stuff
>> locale-dependent encodings into unused UCS-4 code pages.
>> However, it doesn't seem worthwhile to deliberately burden
>> programmers over concerns that are presently, and for the
>> foreseeable future, hypothetical.
>>
>>
> I'm not a Unicode expert, but isn't the reason for the periodic standard
> reviews and changes to cover more and more human languages? We could just
> support the latest Unicode standard and let the Unicode workgroups map those
> new characters into unused code points. The Latin-based, Cyrillic,
> Devanagari and CJK encodings are well-supported, I think. I don't know too
> much about CJK encodings, though, or whether the thousands of ideographs are
> all supported. But I'd say the most significant languages that are used
> on the Internet are supported; the rest might have other problems...
>
> [OFF]
> It's possible that there are small, poor countries with their own writing
> system, but probably their writing system is unsupported because
> starvation, poverty and lack of water and electricity are more serious
> problems there. My ex-girlfriend is working in Nepal in a cooperation
> program (it's kind of a scholarship) and she told me that they only have
> electricity for 8 hours a day, 4 during the night and 4 during the day. There
> are no sidewalks for pedestrians, they walk alongside the cars on the street
> and the pollution is extremely high. Even this country's encoding is
> supported. What I am trying to say is that countries with unsupported
> languages probably won't really care about character encodings if they
> rarely have computers... I can just hope that their living conditions will
> get better and their language will be supported. I can also hope that the
> Unicode people will focus more on these countries instead of fucking up the
> time with fictional languages from fairy tales... [1]
> Probably I'll go to visit her in Nepal in January, it will be an
> interesting experience. I'll check if I can help the IT world there with
> anything.
> [ON]
>
> Another idea to consider. Are all of our utilities wchar-clean? What about
> library functions? (regex is surely not) Do we lack any important utility or
> library? (we still do lack iconv and gettext and what else...?) What about
> standards, like C99 wchar functions? Is there something missing? What about
> POSIX if it has something related? Personally, I think that these are more
> important questions than support of some extremely rare languages. It's
> worth considering how to deal with them later but the basic problems need a
> higher priority.
>
>
> [1] http://en.wikipedia.org/wiki/Tengwar#Unicode
>
>
> Cheers,
>
> --
> Gabor Kovesdan
> FreeBSD Volunteer
>
> EMAIL: gabor@FreeBSD.org .:|:. gabor@kovesdan.org
> WEB:   http://people.FreeBSD.org/~gabor .:|:. http://kovesdan.org
>
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>


