Date:      Fri, 3 Aug 2012 15:19:44 +0100
From:      "Simon L. B. Nielsen" <simon@FreeBSD.org>
To:        Ulrich Spörlein <uqs@freebsd.org>
Cc:        doc@freebsd.org, Gabor Kovesdan <gabor@freebsd.org>, www@freebsd.org
Subject:   Re: RFC: doc/www cleanup
Message-ID:  <CAC8HS2F=6=j58xG-AgErFvW=dtusBpa6RVyeUakH=XSfRxN3aA@mail.gmail.com>
In-Reply-To: <20120803141538.GG1202@acme.spoerlein.net>
References:  <501BAFBD.3010008@FreeBSD.org> <CAC8HS2E2ekMKJgY04qPrQGbEe_tPJ%2BHrf5_ToERptf0yawYoQA@mail.gmail.com> <20120803141538.GG1202@acme.spoerlein.net>

On Fri, Aug 3, 2012 at 3:15 PM, Ulrich Spörlein <uqs@freebsd.org> wrote:
> On Fri, 2012-08-03 at 14:33:04 +0100, Simon L. B. Nielsen wrote:
>> On Fri, Aug 3, 2012 at 12:02 PM, Gabor Kovesdan <gabor@freebsd.org> wrote:
>> > 2, Relaxing character entity usage: To be able to read non-ASCII characters
>> > on ASCII-only systems, we have been using character entities, like &aacute;.
>> > But in CJK languages, Greek and Russian every character is non-ASCII so
>> > practically they cannot be used nor were they used. So they are only used in
>> > ISO-8859 encodings (except Greek, which is also from this family). In fact,
>> > displaying these Latin-based characters nowadays isn't that problematic any
>> > more. Furthermore, if you edit text in a given language then we can suppose
>> > that you understand the language so you know what you should see and you
>> > know how to configure your system if you don't see the desired result. As a
>> > result, these entities nowadays don't have any real advantage any more but
>> > they highly "pollute" the text and make it much harder to edit and read. One
>>
>> I agree that the entities should generally not be used. I think we
>> should just switch to UTF-8 as the character set wherever possible to
>> simplify it even more.
>>
>> And on that note, kill the useless character-set part of all our
>> language directories which generate horrible paths with no additional
>> value.
>>
>> > exception is using characters in a specific language that aren't present
>> > there, e.g. a non-English developer name in the English documentation, etc.
>>
>> UTF-8 would fix that.
>
> Last time I brought this up (trying to get rid of silly entities and
> the bogus charset name of the directories), I was told that our
> toolchain didn't fully grok UTF-8 yet, which was the reason we still had
> this de_DE.ISO8859-1 nonsense.

Ah, ok.

> The move to XML should really, really convert all files to UTF-8, drop
> that from the directories, and get rid of entities like &auml; or
> &eacute;, etc.

Unfortunately I can only agree 100% ;-).
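
To make the entity cleanup concrete, here is a rough sketch of what replacing
entities like &auml; with literal UTF-8 characters could look like. It is only
an illustration under assumptions: the function name entities_to_utf8 and the
KEEP set are hypothetical, Python's standard HTML 4 entity table stands in for
whatever entity sets the doc tree really uses, and this is not the project's
actual conversion tooling.

```python
import re
from html.entities import name2codepoint

# Assumption: these entities must stay escaped so the markup remains well-formed.
KEEP = {"amp", "lt", "gt", "quot", "apos"}

# Matches named entity references such as &auml; or &eacute;.
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def entities_to_utf8(text: str) -> str:
    """Replace known named character entities with the literal character."""

    def repl(match: re.Match) -> str:
        name = match.group(1)
        # Leave markup-significant and unknown (e.g. project-defined) entities alone.
        if name in KEEP or name not in name2codepoint:
            return match.group(0)
        return chr(name2codepoint[name])

    return ENTITY_RE.sub(repl, text)


if __name__ == "__main__":
    converted = entities_to_utf8("Ulrich Sp&ouml;rlein &amp; friends")
    # Writing the result back as UTF-8 is then an ordinary encode step.
    print(converted.encode("utf-8"))
```

Entities the table does not know are passed through untouched, so anything
defined locally in the doc tree would still need its own handling.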

-- 
Simon L. B. Nielsen


