Date:      Fri, 3 Aug 2012 15:19:44 +0100
From:      "Simon L. B. Nielsen" <simon@FreeBSD.org>
To:        Ulrich Spörlein <uqs@freebsd.org>
Cc:        doc@freebsd.org, Gabor Kovesdan <gabor@freebsd.org>, www@freebsd.org
Subject:   Re: RFC: doc/www cleanup
Message-ID:  <CAC8HS2F=6=j58xG-AgErFvW=dtusBpa6RVyeUakH=XSfRxN3aA@mail.gmail.com>
In-Reply-To: <20120803141538.GG1202@acme.spoerlein.net>
References:  <501BAFBD.3010008@FreeBSD.org> <CAC8HS2E2ekMKJgY04qPrQGbEe_tPJ%2BHrf5_ToERptf0yawYoQA@mail.gmail.com> <20120803141538.GG1202@acme.spoerlein.net>


On Fri, Aug 3, 2012 at 3:15 PM, Ulrich Spörlein <uqs@freebsd.org> wrote:
> On Fri, 2012-08-03 at 14:33:04 +0100, Simon L. B. Nielsen wrote:
>> On Fri, Aug 3, 2012 at 12:02 PM, Gabor Kovesdan <gabor@freebsd.org> wrote:
>> > 2, Relaxing character entity usage: To be able to read non-ASCII characters
>> > on ASCII-only systems, we have been using character entities, like &aacute;.
>> > But in CJK languages, Greek and Russian every character is non-ASCII, so
>> > in practice entities cannot be used there, nor were they ever. So entities
>> > are only used in ISO-8859 encodings (except Greek, which is also from this
>> > family). In fact,
>> > displaying these Latin-based characters nowadays isn't that problematic any
>> > more. Furthermore, if you edit text in a given language then we can suppose
>> > that you understand the language so you know what you should see and you
>> > know how to configure your system if you don't see the desired result. As a
>> > result, these entities nowadays don't have any real advantage any more but
>> > they highly "pollute" the text and make it much harder to edit and read. One
>>
>> I agree that the entities should generally not be used. I think we
>> should just switch to UTF-8 as the character set wherever possible to
>> simplify it even more.
>>
>> And on that note, kill the useless character-set part of all our
>> language directories which generate horrible paths with no additional
>> value.
>>
>> > exception is using characters in a specific language that aren't present
>> > there, e.g. a non-English developer name in the English documentation, etc.
>>
>> UTF-8 would fix that.
>
> Last time I brought this up (trying to get rid of silly entities and
> the bogus charset name of the directories), I was told that our
> toolchain didn't fully grok UTF-8 yet, which was the reason we still had
> this de_DE.ISO8859-1 nonsense.

Ah, ok.

> The move to XML should really, really convert all files to UTF-8, drop
> that from the directories, and get rid of entities like &auml; or
> &eacute;, etc.

Unfortunately I can only agree 100% ;-).

-- 
Simon L. B. Nielsen



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAC8HS2F=6=j58xG-AgErFvW=dtusBpa6RVyeUakH=XSfRxN3aA>