Date: Tue, 4 Apr 2000 21:41:49 -0500
From: "gh" <grasshacker@linkfast.net>
To: <freebsd-hackers@freebsd.org>
Subject: Re: Unicode on FreeBSD
Message-ID: <002101bf9ea8$7e6e10a0$fc69a0d0@linkfast.net.linkfast.net>
References: <Pine.LNX.4.20.0004041827290.11214-100000@phobos.illtel.denver.co.us>

Regardless of how you feel about Unicode--whatever, just think of how
horribly terrible things would be if people actually had to *speak* to one
another. gah, the torture. ;-)

Dan
gh

> On Tue, 4 Apr 2000, G. Adam Stanislav wrote:
>
> > On Tue, Apr 04, 2000 at 05:05:05PM -0700, Alex Belits wrote:
> > > The existing "market" of multilingual applications is so small, and it's
> > > based on such simplistic requirements (to be able to display and print
> > > characters, and make multilingual "web pages"), that even a solution as
> > > flawed as standardization on Unicode can survive. Unicode is positioned as
> > > the _replacement_ for languages/charsets handling infrastructure -- "we
> > > know all the characters, so we can write all the words, right?".
> >
> > Not so. Unicode is a character map. One of many. It just happens to be
> > the most inclusive one in existence.
>
> It is. However, if you look at the current efforts of its "adoption", it
> is not used as one. It's touted as the solution to all language-related
> problems, as a replacement of language/charset labeling infrastructure
> and as the necessary prerequisite for any multilingual text processing.
>
> [skipped]
>
> > It does not, for example, provide sorting order. It cannot. Unicode is
> > not about linguistics, it is about mapping characters regardless of their
> > use in specific languages. And different languages sort characters
> > differently. For example, in Slovak, "ch" is considered a character
> > which belongs after the "h". In other languages it is sorted differently.
> > And in most languages, it is just two unrelated characters.
>
> This is the kind of work that currently nonexistent language support
> infrastructure should do -- when some language is encountered in
> "multilingual" document/protocol/... its name can be used to load the
> procedures (in this case sorting but it may be hyphenation, phonetic
> match, etc.) for that particular language, and if no matched language is
> known or supported, data should be just left alone. The same
> infrastructure can be designed to support charsets and encodings, doing
> conversion between them (and unicode) only where possible and necessary,
> and providing the text in either "original" or "preferred", "supported",
> etc. encoding for the language for the particular operation that should be
> performed on the text. If such a thing is implemented, all existing
> charset-specific routines that now exist in various places can be reused,
> and compatibility with existing software can be achieved without any
> significant pain.
>
> > Unicode is not simplistic. It does what its stated goal is, and it does
> > it well. How we use it, is up to us.
> >
> > Cheers,
> > Adam
> >
> > P.S. Hmmm... Interesting. I noticed my random quote contains a C-caron.
> > I wonder how it is going to be handled. :)
>
> It was handled pretty well for such a primitive system as pine in
> xterm. Since your charset was iso 8859-2, it was marked as such in
> Content-Type header of the message. pine gave me a warning:
>
> ---8<---
> [ The following text is in the "iso-8859-2" character set. ]
> [ Your display is set for the "koi8-r" character set. ]
> [ Some characters may be displayed incorrectly. ]
> --->8---
>
> and displayed the text. xterm used the default font that happened to be in
> koi8-r charset, displaying C-caron as cyrillic ha. I have read the
> warning, manually switched xterm to a font in iso 8859-2 charset, and text
> was displayed correctly. If I used a gui-based MUA such as Netscape (which
> I didn't because Netscape Messenger sucks for reasons that have nothing to
> do with its charsets support), it would just display the message in the
> charset defined in the header.
>
> --
> Alex
>
> ----------------------------------------------------------------------
> Excellent.. now give users the option to cut your hair you hippie!
> -- Anonymous Coward
>
>
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-hackers" in the body of the message
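
The quoted proposal (look up per-language procedures by language name, and
leave the data alone when the language is not supported) can be sketched
roughly as below. This is only an illustration, not anything from the thread:
the names slovak_key, SORT_KEYS and sort_words and the "sk" tag are
hypothetical, and the collation handles nothing but the "ch" digraph
mentioned above. Real Slovak collation has more rules.

```python
# Hypothetical sketch: per-language procedures looked up by language name.
# Assumption: only the Slovak "ch" digraph is handled; accented letters and
# other digraphs are ignored here.

def slovak_key(word):
    """Sort key that treats "ch" as one letter placed right after "h"."""
    key = []
    lowered = word.lower()
    i = 0
    while i < len(lowered):
        if lowered.startswith("ch", i):
            key.append((ord("h"), 1))  # after plain "h", before "i"
            i += 2
        else:
            key.append((ord(lowered[i]), 0))
            i += 1
    return key


# Registry of sorting procedures keyed by language name (illustrative only).
SORT_KEYS = {"sk": slovak_key}


def sort_words(words, language):
    """Sort with the language's procedure; leave data alone if unsupported."""
    key = SORT_KEYS.get(language)
    if key is None:
        return list(words)  # unknown language: order is not touched
    return sorted(words, key=key)


words = ["chata", "ihla", "cesta", "hora"]
print(sort_words(words, "sk"))  # ['cesta', 'hora', 'chata', 'ihla']
print(sort_words(words, "xx"))  # same order as the input
```

A plain codepoint sort would put "chata" before "hora"; the language-specific
key moves it after. Hyphenation or phonetic matching could hang off the same
kind of registry.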
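
The display mix-up described in the message (C-caron showing up as cyrillic
ha) can be reproduced with a few lines of Python. The snippet is a small
sketch and assumes the capital letter; the message does not say which case
was involved. The helper name misread is made up for the example, and
"iso8859_2" and "koi8_r" are Python's standard codec names.

```python
def misread(text, actual="iso8859_2", assumed="koi8_r"):
    """Encode text in its real charset, then decode it as the display charset."""
    return text.encode(actual).decode(assumed)


c_caron = "\u010c"  # LATIN CAPITAL LETTER C WITH CARON

# In ISO 8859-2 the letter is the single byte 0xC8.
print(c_caron.encode("iso8859_2"))  # b'\xc8'

# The same byte read as KOI8-R is CYRILLIC SMALL LETTER HA (U+0445).
print(misread(c_caron) == "\u0445")  # True

# Decoding with the charset named in Content-Type restores the letter,
# which is what switching the xterm font achieved by hand.
print(misread(c_caron, assumed="iso8859_2") == c_caron)  # True
```

The lowercase letter behaves the same way: it is byte 0xE8 in ISO 8859-2,
which KOI8-R maps to the capital ha.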
