Date: Fri, 3 Aug 2012 16:15:39 +0200
From: Ulrich Spörlein <uqs@FreeBSD.org>
To: "Simon L. B. Nielsen" <simon@FreeBSD.org>
Cc: doc@FreeBSD.org, Gabor Kovesdan <gabor@FreeBSD.org>, www@FreeBSD.org
Subject: Re: RFC: doc/www cleanup
Message-ID: <20120803141538.GG1202@acme.spoerlein.net>
In-Reply-To: <CAC8HS2E2ekMKJgY04qPrQGbEe_tPJ%2BHrf5_ToERptf0yawYoQA@mail.gmail.com>
References: <501BAFBD.3010008@FreeBSD.org> <CAC8HS2E2ekMKJgY04qPrQGbEe_tPJ%2BHrf5_ToERptf0yawYoQA@mail.gmail.com>

On Fri, 2012-08-03 at 14:33:04 +0100, Simon L. B. Nielsen wrote:
> On Fri, Aug 3, 2012 at 12:02 PM, Gabor Kovesdan <gabor@freebsd.org> wrote:
> > 2, Relaxing character entity usage: To be able to read non-ASCII characters
> > on ASCII-only systems, we have been using character entities, like &aacute;.
> > But in CJK languages, Greek and Russian every character is non-ASCII so
> > practically they cannot be used nor were they used. So they are only used in
> > ISO-8859 encodings (except Greek, which is also from this family). In fact,
> > displaying these Latin-based characters nowadays isn't that problematic any
> > more. Furthermore, if you edit text in a given language then we can suppose
> > that you understand the language so you know what you should see and you
> > know how to configure your system if you don't see the desired result. As a
> > result, these entities nowadays don't have any real advantage any more but
> > they highly "pollute" the text and make it much harder to edit and read. One
>
> I agree that the entities should generally not be used. I think we
> should just switch to UTF-8 and character set wherever possible to
> simplify it even more.
>
> And on that note, kill the useless character-set part of all our
> language directories which generate horrible paths with no additional
> value.
>
> > exception is using characters in a specific language that aren't present
> > there, e.g. a non-English developer name in the English documentation, etc.
>
> UTF-8 would fix that.

Last time I brought this up (trying to get rid of the silly entities and
the bogus charset name in the directories), I was told that our toolchain
didn't fully grok UTF-8 yet, which was the reason we still had this
de_DE.ISO8859-1 nonsense.

The move to XML should really, really convert all files to UTF-8, drop
the charset from the directory names, and get rid of entities like
&auml; or &eacute;, etc.

Just my two cents
Uli
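
For illustration, the conversion described above could look roughly like
the following. This is a minimal sketch and not the project's actual
tooling: the function names (expand_entities, convert_bytes,
strip_charset) are hypothetical, the source encoding is assumed to be
ISO-8859-1, and the entity list is assumed to be the HTML 4 set that
ships with Python's html.entities module. Entities that are significant
to the markup itself, and any names the table does not know (such as
project-specific ones), are left untouched.

```python
import re
from html.entities import name2codepoint

# Entities that must stay escaped because they are significant to SGML/XML markup.
KEEP = {"amp", "lt", "gt", "quot", "apos"}

# Matches a named entity reference such as &auml; and captures the name.
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def expand_entities(text):
    """Replace named character entities (e.g. &auml;) with literal characters."""
    def repl(match):
        name = match.group(1)
        if name in KEEP or name not in name2codepoint:
            # Leave markup-significant and unknown entities exactly as written.
            return match.group(0)
        return chr(name2codepoint[name])
    return ENTITY_RE.sub(repl, text)


def convert_bytes(data, source_encoding="iso-8859-1"):
    """Decode legacy-encoded bytes, expand entities, and re-encode as UTF-8."""
    return expand_entities(data.decode(source_encoding)).encode("utf-8")


def strip_charset(dirname):
    """Drop the charset suffix from a directory name such as de_DE.ISO8859-1."""
    return dirname.split(".", 1)[0]
```

Applied to a tree, each file's bytes would go through convert_bytes()
and each language directory name through strip_charset(); walking the
tree and handling the repository rename are deliberately left out.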
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120803141538.GG1202>