Date: Wed, 27 Aug 2008 13:15:52 -0700
From: "Loren M. Lang" <lorenl@north-winds.org>
To: Alexander Churanov <alexanderchuranov@gmail.com>
Cc: freebsd-i18n@freebsd.org
Subject: Re: Unicode-based FreeBSD
Message-ID: <1219868153.6962.37.camel@habakkuk.aloha.tallye.com>
In-Reply-To: <3cb459ed0808221700w335b0906g6901d8b8bec4dad9@mail.gmail.com>
References: <3cb459ed0808221700w335b0906g6901d8b8bec4dad9@mail.gmail.com>

On Sat, 2008-08-23 at 04:00 +0400, Alexander Churanov wrote:
> Hi folks!
> 
> I am interested in FreeBSD internationalization and unicode support. I
> already spent some time examining the source of syscons. I think that
> syscons is the main problem in bringing full UTF-8 support to FreeBSD out of
> box. It seems that I am ready with the solution. That's why I am writing to
> this list.
> 
> I have following questions:
> 
> 0) Is moving to UTF-8 from 8-bit codepages desired for FreeBSD?

I would assume that the answer is "most definitely," but that's just my
assumption.

> 
> 1) Is unicode support in character-mode (I mean plain tty, not Xorg) FreeBSD
> human interface already implemented?

There are several levels at which I can see Unicode support being
improved in FreeBSD. First of all, text-based Unicode applications do
work when using a pseudo-TTY, such as via SSH from another machine or
inside an X terminal. And, of course, GUI applications in Xorg have
Unicode support. Unicode applications running on the console (aka
syscons) cannot use anything outside of 7-bit US ASCII due to
assumptions made by syscons.

Syscons assumes a plain single-byte 8-bit character set and a
one-to-one mapping from a byte value to a character in the VGA font.
This also means that syscons cannot utilize the full 256-character font
palette like DOS could. Syscons will need to be rewritten to interpret
UTF-8 sequences and store them internally, probably using UTF-16 or
UTF-32 for efficiency in lookups. It will also need a more complex
translator from characters to font glyphs, ideally supporting a
many-to-one table so that combining characters, and similar-looking
characters such as ß (German sharp s) and β (Greek beta), can be shown
with the same glyph on the console. The current font format used by
syscons is effectively a raw dump of the font with no header
information at all.
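
To make the headerless format concrete, here is a minimal sketch of how
a loader has to guess the glyph height from nothing but the length of
the file. It assumes 8-pixel-wide glyphs stored one byte per row and
exactly 256 glyphs per file. The names guess_font_height and glyph_rows
are hypothetical and are not syscons functions.

```python
# Hypothetical helpers, not part of syscons: work with a raw,
# headerless font dump where the only metadata is the file length.

GLYPH_COUNT = 256            # assumption: a full 256-glyph font
KNOWN_HEIGHTS = (8, 14, 16)  # the 8x8, 8x14 and 8x16 fonts


def guess_font_height(file_size: int) -> int:
    """Return the glyph height implied by file_size, or raise ValueError.

    Assumes glyphs are 8 pixels wide, so each glyph row is one byte and
    a font of height h occupies exactly 256 * h bytes.
    """
    height, remainder = divmod(file_size, GLYPH_COUNT)
    if remainder != 0 or height not in KNOWN_HEIGHTS:
        raise ValueError(
            f"not a 256-glyph, 8-pixel-wide font: {file_size} bytes")
    return height


def glyph_rows(font: bytes, code: int) -> bytes:
    """Slice out the row bitmaps for one byte value (0..255)."""
    height = guess_font_height(len(font))
    return font[code * height:(code + 1) * height]
```

A 128-glyph 8x16 font would also be 2048 bytes long, so this kind of
guess would misread it as a full 8x8 font. A format with a real header
avoids that ambiguity.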
The font size (8x8, 8x14, 8x16) is determined by the file size, which
only works if it's a full 256-character font. I recommend using the
.psf font format used by the Linux console, as it is a much more
feature-complete format with full support for the previously mentioned
Unicode character to font glyph mappings.

The second area where FreeBSD's Unicode support needs improving is the
TTY driver itself. When the TTY driver is in canonical mode, it is the
kernel that handles how backspace and other simple editing functions
work. Currently, it does not understand UTF-8 and makes a similar
assumption of 8-bit characters. This does not affect applications that
use the TTY in raw mode, such as libreadline-based applications like
bash or (n)curses/slang-based applications. Simpler applications like
the basic Bourne shell (sh), and applications that don't offer an
interface of their own, like grep, awk, and sed when reading from the
TTY, cannot handle backspace correctly over multibyte characters. This
affects these applications on any FreeBSD TTY, in or out of X.

The third area where FreeBSD might need some improvement is libc. I am
less familiar with this area, so my information may be incorrect. Basic
locale and Unicode support exists in libc, but more advanced
functionality like character classes and collation needs work. The
commands mklocale and colldef are used to create the appropriate binary
data files from source and, if I remember correctly, use a format which
is too simplified to fully support a modern Unicode specification.

> 
> 2) Is somebody working on that?
> 
> 3) What is the correct branch to check out source code? From what
> repository?
> 
> 4) What is the process of submitting changes?
> 
> Alexander Churanov
> _______________________________________________
> freebsd-i18n@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-i18n
> To unsubscribe, send any mail to "freebsd-i18n-unsubscribe@freebsd.org"
-- 
Loren M. Lang
lorenl@north-winds.org
http://www.north-winds.org/

Public Key: ftp://ftp.north-winds.org/pub/lorenl_pubkey.asc
Fingerprint: 10A0 7AE2 DAF5 4780 888A 3FA4 DCEE BB39 7654 DE5B
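
For the canonical-mode editing problem described in the message, the
following is a rough sketch of what a UTF-8-aware erase could look like.
It is an illustration only and is not taken from the FreeBSD TTY code.
The name erase_last_char is hypothetical, and the sketch assumes the
line buffer holds well-formed UTF-8.

```python
# Hypothetical sketch, not FreeBSD code: erase one whole UTF-8
# character from the end of a line buffer instead of one byte.

def erase_last_char(line: bytearray) -> int:
    """Remove the last UTF-8 encoded character from line in place.

    Returns the number of bytes removed (0 if the line was empty).
    No validation is done: the buffer is assumed to be well-formed.
    """
    removed = 0
    while line:
        byte = line.pop()
        removed += 1
        # Continuation bytes look like 10xxxxxx. Keep removing until
        # the byte just removed is a lead byte or a plain ASCII byte.
        if (byte & 0xC0) != 0x80:
            break
    return removed
```

A driver that erases a single byte per backspace would leave the lead
byte of "ß" behind in the buffer. A real implementation would also have
to work out how many screen columns to blank, which this sketch does
not attempt.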