Date:      Wed, 27 Aug 2008 13:15:52 -0700
From:      "Loren M. Lang" <lorenl@north-winds.org>
To:        Alexander Churanov <alexanderchuranov@gmail.com>
Cc:        freebsd-i18n@freebsd.org
Subject:   Re: Unicode-based FreeBSD
Message-ID:  <1219868153.6962.37.camel@habakkuk.aloha.tallye.com>
In-Reply-To: <3cb459ed0808221700w335b0906g6901d8b8bec4dad9@mail.gmail.com>
References:  <3cb459ed0808221700w335b0906g6901d8b8bec4dad9@mail.gmail.com>

On Sat, 2008-08-23 at 04:00 +0400, Alexander Churanov wrote:
> Hi folks!
>
> I am interested in FreeBSD internationalization and unicode support. I
> already spent some time examining the source of syscons. I think that
> syscons is the main problem in bringing full UTF-8 support to FreeBSD out of
> box. It seems that I am ready with the solution. That's why I am writing to
> this list.
>
> I have following questions:
>
> 0) Is moving to UTF-8 from 8-bit codepages desired for FreeBSD?

I would assume that the answer is "most definitely," but that's just my
assumption.

>
> 1) Is unicode support in character-mode (I mean plain tty, not Xorg) FreeBSD
> human interface already implemented?

There are several levels at which I can see Unicode support being improved
in FreeBSD.  First of all, text-based Unicode applications do work over a
pseudo-TTY, such as via SSH from another machine or inside an X terminal.
And, of course, GUI applications in Xorg have Unicode support.  Unicode
applications that run on the console (aka syscons) cannot use anything
outside of 7-bit US ASCII because of assumptions made by syscons.  Syscons
assumes a plain single-byte 8-bit character set and a one-to-one mapping
from a byte value to a character in the VGA font.  This also means that
syscons cannot use the full 256-character font the way DOS could.  Syscons
will need to be rewritten to interpret UTF-8 sequences
and store them internally, probably using UTF-16 or UTF-32 for
efficiency in lookups.  It will also need a more complex translator from
characters to font glyphs, ideally supporting a many-to-one table so that
combining characters and similar-looking characters like ß (German sharp
s) and β (Greek beta) can be shown with the same glyph on the console.  The
current font format used by syscons is effectively a raw dump of the
font with no header information at all.  The font size (8x8, 8x14, 8x16)
is determined by the file size which only works if it's a full 256
character font.  I recommend using the .psf font format used by the Linux
console, as it is a much more feature-complete format with full support
for the previously mentioned Unicode character to font glyph mappings.
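
As a rough sketch of the decoding and lookup steps described above (not
syscons code; the names utf8_decode, glyph_for, struct glyph_map and
example_map are made up for illustration), the console could turn each
UTF-8 sequence into a code point and then look that code point up in a
sorted many-to-one table.  The glyph indices below assume a font laid out
like code page 437, where position 0xE1 holds the sharp s shape:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: one Unicode code point mapped to one glyph. */
struct glyph_map {
    uint32_t codepoint;
    uint8_t  glyph;         /* index into a 256-glyph VGA font */
};

/* Illustrative many-to-one table, kept sorted by code point. */
static const struct glyph_map example_map[] = {
    { 0x00DF, 0xE1 },       /* LATIN SMALL LETTER SHARP S */
    { 0x03B2, 0xE1 },       /* GREEK SMALL LETTER BETA, same glyph */
    { 0x2592, 0xB1 },       /* MEDIUM SHADE */
};

/*
 * Decode one UTF-8 sequence from s (at most n bytes) into *cp.
 * Returns the number of bytes consumed, or 0 if the input is
 * truncated or malformed.
 */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    /* Smallest code point that legitimately needs a given length. */
    static const uint32_t min_cp[] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t len, i;
    uint32_t c;

    if (n == 0)
        return 0;
    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {
        len = 2;
        c = s[0] & 0x1F;
    } else if ((s[0] & 0xF0) == 0xE0) {
        len = 3;
        c = s[0] & 0x0F;
    } else if ((s[0] & 0xF8) == 0xF0) {
        len = 4;
        c = s[0] & 0x07;
    } else {
        return 0;           /* stray continuation or invalid lead byte */
    }
    if (n < len)
        return 0;
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;
        c = (c << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates, and values past U+10FFFF. */
    if (c < min_cp[len] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;
    *cp = c;
    return len;
}

/* Map a code point to a glyph: ASCII directly, the rest by binary search. */
uint8_t glyph_for(uint32_t cp)
{
    size_t lo = 0, hi = sizeof(example_map) / sizeof(example_map[0]);

    if (cp < 0x80)
        return (uint8_t)cp;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (example_map[mid].codepoint == cp)
            return example_map[mid].glyph;
        if (example_map[mid].codepoint < cp)
            lo = mid + 1;
        else
            hi = mid;
    }
    return '?';             /* no glyph available for this code point */
}
```

With this table, the two-byte sequences C3 9F and CE B2 decode to
different code points but resolve to the same glyph index, which is the
kind of sharing a .psf Unicode table is meant to express.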

The second area where FreeBSD's Unicode support needs improving is the TTY
driver itself.  When the TTY driver is in canonical mode, it is the
kernel that handles how backspace and other simple editing functions
work.  Currently, it does not understand UTF-8 and makes a similar
assumption of 8-bit characters.  This does not affect applications that
use the TTY in raw mode, such as libreadline-based applications like bash
or (n)curses/slang-based applications.  Simpler applications like the
basic Bourne shell (sh), and applications that don't offer an editing
interface of their own, like grep, awk, and sed when reading from the TTY,
cannot handle backspace correctly over multibyte characters.
This affects all canonical-mode TTY applications on FreeBSD, in or out of X.
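
To illustrate what a UTF-8-aware erase would have to do, here is a
minimal sketch.  It is not the FreeBSD TTY code; the function name
utf8_erase_last and the flat line-buffer model are assumptions made for
the example.  A byte-oriented erase drops one byte per backspace, which
leaves the lead byte of a multibyte character behind; the version below
steps back over continuation bytes (bit pattern 10xxxxxx) as well:

```c
#include <stddef.h>

/*
 * Return the new length of the line buffer after erasing its last
 * UTF-8 character.  The final byte is always dropped; if it was a
 * continuation byte, keep stepping back until the byte that started
 * the sequence has been removed too.
 */
size_t utf8_erase_last(const unsigned char *buf, size_t len)
{
    if (len == 0)
        return 0;
    len--;                                      /* drop the final byte */
    while (len > 0 && (buf[len] & 0xC0) == 0x80)
        len--;                                  /* it was a continuation */
    return len;
}
```

A real line discipline would also need to erase the right number of
columns on screen, since one multibyte character does not always occupy
one cell; that part is left out here.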

The third area where FreeBSD might need some improvement is libc.  I
am less familiar with this area, so my information may be incorrect.
Basic locale and Unicode support exists in libc, but more advanced
functionality like character classes and collation needs work.  The
commands mklocale and colldef are used to create the appropriate binary
data files from source and, if I remember correctly, use a format that
is too simplified to fully support a modern Unicode specification.
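
For reference, this is roughly how an application consumes the character
class and collation data through the standard C interfaces.  It is only
a sketch; count_alpha and collate_before are hypothetical helper names.
The results depend entirely on the locale data installed, and on the
program having selected a locale first, for example with
setlocale(LC_ALL, "") under a UTF-8 locale:

```c
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Count the alphabetic characters in a multibyte string, as judged by
 * the current LC_CTYPE locale.  Returns -1 if the string is not valid
 * in the current locale's encoding.
 */
int count_alpha(const char *s)
{
    mbstate_t st;
    size_t left = strlen(s);
    int count = 0;

    memset(&st, 0, sizeof(st));
    while (left > 0) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, s, left, &st);

        if (n == (size_t)-1 || n == (size_t)-2)
            return -1;          /* invalid or truncated sequence */
        if (n == 0)
            break;              /* NUL; not reachable within strlen() */
        if (iswalpha((wint_t)wc))
            count++;
        s += n;
        left -= n;
    }
    return count;
}

/* Nonzero if a sorts before b under the current LC_COLLATE locale. */
int collate_before(const char *a, const char *b)
{
    return strcoll(a, b) < 0;
}
```

Without a setlocale() call the program runs in the "C" locale, where
only ASCII letters are classified as alphabetic and strcoll() compares
like strcmp().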

>
> 2) Is somebody working on that?
>
> 3) What is the correct branch to check out source code? From what
> repository?
>
> 4) What is the process of submitting changes?
>
> Alexander Churanov
> _______________________________________________
> freebsd-i18n@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-i18n
> To unsubscribe, send any mail to "freebsd-i18n-unsubscribe@freebsd.org"

-- 
Loren M. Lang
lorenl@north-winds.org
http://www.north-winds.org/


Public Key: ftp://ftp.north-winds.org/pub/lorenl_pubkey.asc
Fingerprint: 10A0 7AE2 DAF5 4780 888A  3FA4 DCEE BB39 7654 DE5B





