Date: Tue, 28 Apr 2009 17:40:37 +0530
From: Prashant Vaibhav <pvaibhav@freebsd.org>
To: Gabor Kovesdan <gabor@freebsd.org>
Cc: freebsd-hackers@freebsd.org
Subject: Re: SoC 2009: BSD-licensed libiconv in base system
Message-ID: <66b068eb0904280510g7a1e50dfm455d96fd49c6eae@mail.gmail.com>
In-Reply-To: <49F6C7A1.6070708@FreeBSD.org>
References: <aa9f273a8313c6436e76fa9f5d587ef4.squirrel@webmail.kovesdan.org>
 <20090427183836.GA10793@zim.MIT.EDU> <49F5FE45.2090101@freebsd.org>
 <20090427193326.GA7654@britannica.bec.de> <20090427194904.GA11137@zim.MIT.EDU>
 <49F6C7A1.6070708@FreeBSD.org>

> My ex-girlfriend is working in Nepal [...] Even this country's encoding is
> supported.

Probably because the Nepali language doesn't have a separate script; they use
Devanagari! :-)

On Tue, Apr 28, 2009 at 2:38 PM, Gabor Kovesdan <gabor@freebsd.org> wrote:

> David Schultz wrote:
>
>> On Mon, Apr 27, 2009, Joerg Sonnenberger wrote:
>>
>>> On Mon, Apr 27, 2009 at 11:49:41AM -0700, Tim Kientzle wrote:
>>>
>>>> David Schultz wrote:
>>>>
>>>>> ... whether it would make more sense to standardize on something like
>>>>> UCS-4 for the internal representation.
>>>>>
>>>> YES. Without this, wchar_t is useless.
>>>>
>>> I strongly disagree. "Everything can be represented as UCS-4" is a bad
>>> assumption, but something Americans and Europeans naturally don't have
>>> to care about.
>>>
>>
>> ...but isn't this moot at present because there are no
>> widely-accepted encodings that include characters that
>> aren't supported by UCS-4? Citrus doesn't seem to support
>> any such encodings in any case.
>>
> Citrus is based on UCS-4 as an internal encoding, just like the other
> BSD-licensed iconv library. This is a barrier to supporting encodings that
> aren't supported by UCS-4.
>
>> If this ever really becomes an issue, we could always stuff
>> locale-dependent encodings into unused UCS-4 code pages.
>> However, it doesn't seem worthwhile to deliberately burden
>> programmers over concerns that are presently, and for the
>> foreseeable future, hypothetical.
>>
> I'm not a Unicode expert, but isn't the reason for the periodic standard
> reviews and changes to cover more and more human languages? We could just
> support the latest Unicode standard and let the Unicode workgroups map those
> new characters into unused code points. The Latin-based, Cyrillic,
> Devanagari and CJK encodings are well-supported, I think. I don't know too
> much about CJK encodings, though, or whether the thousands of ideographs are
> all supported or not.
> But I'd say the most significant languages that are used
> on the Internet are supported; the rest might have other problems...
>
> [OFF]
> It's possible that there are small, poor countries with their own writing
> system, but probably their writing system is unsupported because
> starvation, poverty and lack of water and electricity are more serious
> problems there. My ex-girlfriend is working in Nepal in a cooperation
> program (it's kind of a scholarship) and she told me that they only have
> electricity for 8 hours a day, 4 during the night and 4 during the day. There
> are no sidewalks for pedestrians, they walk along with the cars on the street
> and the pollution is extremely high. Even this country's encoding is
> supported. What I am trying to say is that countries with unsupported
> languages probably won't really care about character encodings if they
> rarely have computers... I can just hope that their living conditions will
> get better and their language will be supported. I can also hope that the
> Unicode people will focus more on these countries instead of fucking up the
> time with fictional languages from fairy tales... [1]
> Probably I'll go to visit her in Nepal in January, it will be an
> interesting experience. I'll check if I can help the IT world there with
> anything.
> [ON]
>
> Another idea to consider. Are all of our utilities wchar-clean? What about
> library functions? (regex is surely not) Do we lack any important utility or
> library? (we still do lack iconv and gettext and what else...?) What about
> standards, like C99 wchar functions? Is there something missing? What about
> POSIX if it has something related? Personally, I think that these are more
> important questions than support of some extremely rare languages. It's
> worth considering how to deal with them later, but the basic problems need a
> higher priority.
>
> [1] http://en.wikipedia.org/wiki/Tengwar#Unicode
>
> Cheers,
>
> --
> Gabor Kovesdan
> FreeBSD Volunteer
>
> EMAIL: gabor@FreeBSD.org .:|:. gabor@kovesdan.org
> WEB: http://people.FreeBSD.org/~gabor .:|:. http://kovesdan.org
>
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>