Date:      Tue, 4 Apr 2000 21:41:49 -0500
From:      "gh" <grasshacker@linkfast.net>
To:        <freebsd-hackers@freebsd.org>
Subject:   Re: Unicode on FreeBSD
Message-ID:  <002101bf9ea8$7e6e10a0$fc69a0d0@linkfast.net.linkfast.net>
References:  <Pine.LNX.4.20.0004041827290.11214-100000@phobos.illtel.denver.co.us>

Regardless of how you feel about Unicode, just think of how terrible
things would be if people actually had to *speak* to one another.
Gah, the torture.

;-)

Dan
gh

> On Tue, 4 Apr 2000, G. Adam Stanislav wrote:
>
> > On Tue, Apr 04, 2000 at 05:05:05PM -0700, Alex Belits wrote:
> > >  The existing "market" of multilingual applications is so small, and
> > >it's based on such simplistic requirements (to be able to display and
> > >print characters, and make multilingual "web pages"), that even a
> > >solution as flawed as standardization on Unicode can survive. Unicode
> > >is positioned as the _replacement_ for languages/charsets handling
> > >infrastructure -- "we know all the characters, so we can write all the
> > >words, right?".
> >
> > Not so. Unicode is a character map. One of many. It just happens to be
> > the most inclusive one in existence.
>
>   It is. However, if you look at the current efforts toward its
> "adoption", it is not used as one. It's touted as the solution to all
> language-related problems, as a replacement for the language/charset
> labeling infrastructure, and as the necessary prerequisite for any
> multilingual text processing.
>
> [skipped]
>
> > It does not, for example, provide sorting order. It cannot. Unicode is
> > not about linguistics, it is about mapping characters regardless of
> > their use in specific languages. And different languages sort characters
> > differently. For example, in Slovak, "ch" is considered a character
> > which belongs after the "h". In other languages it is sorted
> > differently. And in most languages, it is just two unrelated characters.
>
>   This is the kind of work that the currently nonexistent language
> support infrastructure should do -- when some language is encountered in
> a "multilingual" document/protocol/..., its name can be used to load the
> procedures (in this case sorting, but it may be hyphenation, phonetic
> matching, etc.) for that particular language, and if no matching language
> is known or supported, the data should just be left alone. The same
> infrastructure can be designed to support charsets and encodings, doing
> conversion between them (and Unicode) only where possible and necessary,
> and providing the text in either the "original" or the "preferred",
> "supported", etc. encoding for the language for the particular operation
> that should be performed on the text. If such a thing is implemented, all
> the charset-specific routines that now exist in various places can be
> reused, and compatibility with existing software can be achieved without
> any significant pain.
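A rough sketch of the lookup described above, assuming a plain in-memory
table. HANDLERS, register and run are hypothetical names, not an existing
API. Procedures are found by language name and operation, and data in an
unknown language is handed back untouched.

```python
# Hypothetical registry: language name -> operation name -> procedure.
HANDLERS = {}


def register(language, operation, func):
    """Make func the procedure for this operation in this language."""
    HANDLERS.setdefault(language, {})[operation] = func


def run(language, operation, data):
    """Apply the language's procedure, or leave data alone if none is known."""
    func = HANDLERS.get(language, {}).get(operation)
    return data if func is None else func(data)


# Placeholder procedure; a real one would wrap existing charset- or
# language-specific routines instead of the builtin sorted().
register("en", "sort", sorted)

known = run("en", "sort", ["pear", "apple"])
unknown = run("xx", "sort", ["pear", "apple"])   # no such language registered
print(known, unknown)
```

The fallback in run() is the "just left alone" case: no guess is made about
text whose language has no registered support.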
>
> > Unicode is not simplistic. It does what its stated goal is, and it does
> > it well. How we use it, is up to us.
> >
> > Cheers,
> > Adam
> >
> > P.S. Hmmm... Interesting. I noticed my random quote contains a C-caron.
> > I wonder how it is going to be handled. :)
>
>   It was handled pretty well for such a primitive system as pine in
> xterm. Since your charset was ISO 8859-2, it was marked as such in the
> Content-Type header of the message. pine gave me a warning:
>
> ---8<---
>     [ The following text is in the "iso-8859-2" character set. ]
>     [ Your display is set for the "koi8-r" character set.  ]
>     [ Some characters may be displayed incorrectly. ]
> --->8---
>
> and displayed the text. xterm used the default font, which happened to be
> in the koi8-r charset, displaying C-caron as Cyrillic ha. I read the
> warning, manually switched xterm to a font in the ISO 8859-2 charset, and
> the text was displayed correctly. If I had used a GUI-based MUA such as
> Netscape (which I didn't, because Netscape Messenger sucks for reasons
> that have nothing to do with its charset support), it would have just
> displayed the message in the charset defined in the header.
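The C-caron/ha mix-up is easy to reproduce with Python's standard codecs.
A small sketch (the variable names are only illustrative): the byte that
means C-caron in ISO 8859-2 is read back under KOI8-R.

```python
# Capital C with caron (U+010C) is a single byte in ISO 8859-2.
raw = "\u010c".encode("iso8859_2")

# The same byte interpreted by a display set up for KOI8-R.
as_koi8r = raw.decode("koi8_r")

# The same byte interpreted with the charset named in Content-Type.
as_latin2 = raw.decode("iso8859_2")

print(raw, ascii(as_koi8r), ascii(as_latin2))
```

The byte is 0xC8 in both cases. Only the charset label says whether it is
C-caron or Cyrillic ha, which is why the Content-Type header mattered here.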
>
> --
> Alex
>
> ----------------------------------------------------------------------
>  Excellent.. now give users the option to cut your hair you hippie!
>                                                   -- Anonymous Coward
>
>
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-hackers" in the body of the message
>






