From: Lars Eighner
Date: Sun, 22 Apr 2012 09:29:30 -0500 (CDT)
To: Matthew Seaman
Cc: freebsd-questions@freebsd.org
Subject: Re: converting UTF-8 to HTML

On Sun, 22 Apr 2012, Matthew Seaman wrote:

> On 22/04/2012 10:17, Erik Nørgaard wrote:
>> UTF-8 is variable width, ascii characters are stored as single bytes (not
>> sure about iso-8859-1) while other characters are stored as two byte chars.
>
> ascii uses the low 128 values that you can assign to an unsigned char,
> ie. those where the high-order bit is zero.
>
> Programming a text-only display to assume
> everything is UTF-8 would be quite viable, and backwardly compatible
> with ascii-only displays.

The hardware doesn't exist to display UTF-8 characters in text MODE. The
whole point of avoiding GUIs is that rasterized and GUI fonts cannot put
4000 characters on a screen as legibly as VGA does (not to mention the
performance hit the rasterization and GUIs deliver). One look at recent
Linux distributions, which make it all but impossible to reach text MODE
because they had the thought that sticking a rasterized white-on-black
font on the screen (via yet another kernel module) would be "just as good"
as VGA, should amply demonstrate the point. Yeah, you need that crap if you
are running a server in Outer Fubaristan where there are 38 languages
written in 49 different alphabets -- but crippling text mode is not
worthwhile for most people, especially people who work in text.

-- 
Lars Eighner
http://www.larseighner.com/index.html
8800 N IH35 APT 1191 AUSTIN TX 78753-5266