FreeBSD Mail Archives

Date:      Wed, 5 Apr 2000 11:25:56 -0700 (PDT)
From:      "Eugene M. Kim" <ab@astralblue.com>
To:        Alex Belits <abelits@phobos.illtel.denver.co.us>
Cc:        Anatoly Vorobey <mellon@pobox.com>, hackers@FreeBSD.ORG
Subject:   Re: Unicode on FreeBSD
Message-ID:  <Pine.BSF.4.20.0004051101140.10555-100000@home.astralblue.com>
In-Reply-To: <Pine.LNX.4.20.0004050950210.11214-100000@phobos.illtel.denver.co.us>

On Wed, 5 Apr 2000, Alex Belits wrote:

| On Wed, 5 Apr 2000, Anatoly Vorobey wrote:
| 
| > > that the way that TeX handles such a text is even more inconvenient,
| > > however even now it's most likely that TeX would be used for this kind of
| > > typesetting.
| > 
| > But we're *not* talking about typesetting -- rather about multilingual 
| > text handling. TeX, indeed, does typesetting and thus solves the wrong 
| > problem.
| 
|   It solves exactly the same problem -- displaying information. Unicode
| does NOTHING to support any other functionality that is required for true
| multilingual text processing. You can't even hyphenate Unicode
| text -- you will have to guess which language's rules should apply.

Yet again, Unicode is _not_ a multilingual solution; it's rather an aid.
There are literally tons of problems, such as case folding, hyphenation
(as you mentioned), Simplified Chinese character support, directionality,
ligature formation and so on, which must be ironed out by a true
multilingual solution.  The current Unicode standard tries to address
these problems, but it doesn't claim that it solves or will solve them.
(Read the Unicode spec carefully, please, and you will see what I mean.)

And this is in fact the consensus, as can be seen in any serious
i18n discussion group.
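
To make the case-folding and directionality points concrete, here is a
minimal Python sketch.  It assumes Python 3 and its standard unicodedata
module; turkish_lower() is a hypothetical helper written only for this
illustration, not part of any library.  The character data carries a
directionality property, but nothing in the text says which language's
casing rules apply, so the caller has to supply that knowledge.

```python
import unicodedata


def naive_lower(text: str) -> str:
    """Language-blind lowercasing, as done by str.lower()."""
    return text.lower()


def turkish_lower(text: str) -> str:
    """Hypothetical helper: lowercase using Turkish dotted/dotless i rules."""
    # Handle the two Turkish-specific pairs first, then use default rules.
    return text.replace("\u0130", "i").replace("I", "\u0131").lower()


word = "ISPARTA"
print(naive_lower(word))    # default rules: capital I becomes dotted i
print(turkish_lower(word))  # Turkish rules: capital I becomes dotless i

# Unicode does record per-character directionality ...
print(unicodedata.bidirectional("a"))       # 'L' (left-to-right)
print(unicodedata.bidirectional("\u05d0"))  # 'R' (Hebrew alef)
# ... but a code point has no language attached: the same "I" is used by
# English and Turkish, which is why casing needs outside information.
print(unicodedata.name("I"))
```

The same kind of out-of-band language information would be needed for
hyphenation.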

| 
| > In "real life" someone who needs to handle text with Russian 
| > and French in it -- type it, send it, read it, study it, etc. -- not 
| > *typeset* it -- won't use TeX for it, but will rather walk over to the 
| > Windows machine and fire up Word. This is the solution that's used in 
| > "real life" right now
| 
|   This only happens because those people use Word, and Word happens to use
| Unicode. Well, Word uses a lot of things that I consider to be stupid and
| poorly designed -- its popularity is based definitely not on technical
| merit.

We are speaking about the `real world' here -- the world as it is now.
In other words, it's an installed base that we must consider in any
effort to envision a new i18n scheme.

| 
| > -- and incidentally, one of the reasons it's 
| > become so annoyingly common to email Word files as some kind of 
| > universal text standard.
| 
|   Word is not a standard, it's a format forced on a lot of people by some
| pretty shady practices of a certain company that has been mentioned often
| enough in the past few days to make describing it again pointless.
| 
| > I don't like this, but currently the Unix 
| > world doesn't have a good alternative to offer. UTF-8 changes that,
| > and I think that's a wonderful thing.
| 
|   UTF-8 provides a way to display a lot of characters -- that's all.

And that's exactly one of the things that we want as part of a
multilingual solution.
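
As a small illustration of that one piece, the following Python sketch
(assuming Python 3; the sample string and the can_encode() helper are made
up for this example) shows Russian and French text sharing a single UTF-8
byte stream, which neither KOI8-R nor ISO 8859-1 can represent alone.

```python
def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of text is representable in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical sample: a Russian word followed by a French one.
text = "\u041f\u0440\u0438\u0432\u0435\u0442, caf\u00e9"

data = text.encode("utf-8")
round_trip = data.decode("utf-8")
print(round_trip == text)           # UTF-8 round-trips the mixed text

print(can_encode(text, "koi8_r"))   # KOI8-R lacks the accented e
print(can_encode(text, "latin_1"))  # ISO 8859-1 lacks Cyrillic
print(can_encode(text, "utf-8"))
```

This covers only representation and display; it says nothing about which
language each word is in.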

| And this is nowhere close to being enough -- if we want to be
| superior to pretty-pictures-oriented Windows software, we need to
| provide advantages over it, not absorb its weaknesses.  We need to
| provide multilingual functionality, not just multilingual display --
| if that is done, half-assed language support in Windows/Word
| will look like a sad joke.

Unicode doesn't necessarily mean bad multilingual functionality.
What makes a good m17n scheme is _how we use_ tools like Unicode (as a
multilingual display tool).  True, there are a lot of inappropriate m17n
approaches using Unicode (most of which directly treat Unicode as
the m17n scheme, as you pointed out), but so far I haven't seen anything
in Unicode itself that would make it hard to envision a good m17n
scheme.

| 
| > It's fine for you to talk about
| > what would happen if MIME were to evolve into a general-purpose text-marking
| > standard powerful enough to handle a Czech word inside a French sentence,
| > but that didn't happen, which means that neither you nor anyone else took
| > it there. Frankly, I don't think MIME would have been up for the task 
| > anyway, but that's a moot point because it just didn't happen.
| 
|   What do you mean, "didn't happen"? Who is here writing software but we
| ourselves? I am trying to explain why the development in that area should
| be done despite stupid decisions made by IETF precisely because I expect
| it to be done as the result -- by myself or by others. I will be happy to
| start this work, however without others' input I am afraid that it will
| become yet another thing based on idiosyncrasy rather than on good design
| ideas -- sad example of Java makes me feel rather uneasy about starting a
| thing that no one seems to understand or care about.

If you don't feel right about the current approach taken by the IETF, and
you're confident about how it should be done, then I strongly suggest you
start the work right away.  I think there are tons of programmers and
other experts, myself included of course, who will gladly give some input
towards this type of effort.  It's more a matter of where to look for
them; for example, there are a couple of m17n/i18n WGs in the IETF where a
lot of people are willing to make their voices heard.

Regards,
Eugene

-- 
Eugene M. Kim <ab@astralblue.com>

"Is your music unpopular?  Make it popular; make music
which people like, or make people who like your music."



To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?Pine.BSF.4.20.0004051101140.10555-100000>
