Date: Wed, 19 May 1999 21:40:22 +0100
From: Nik Clayton <nik@nothing-going-on.demon.co.uk>
To: Nik Clayton <nik@nothing-going-on.demon.co.uk>
Cc: doc@freebsd.org, freebsd-translate@ngo.org.uk
Subject: Re: FDP Directory Reorganisation
Message-ID: <19990519214022.D60921@catkin.nothing-going-on.org>
In-Reply-To: <19990513211458.B70767@catkin.nothing-going-on.org>; from Nik Clayton on Thu, May 13, 1999 at 09:14:58PM +0100
References: <19990513211458.B70767@catkin.nothing-going-on.org>
Folks,
Here's v2.0 of the re-org plan. I would've just sent a diff, but the diff
is almost as big anyway.
The main change is replacing lang/encoding with lang.encoding, based on
the contents of /usr/share/locale. The articles/books/man split is
still there, and will remain unless someone can come up with something
better.
I'm away from my mailbox from Friday night through to Tuesday night, so
I expect to come back to a large discussion thread :-) If a consensus
has been reached then I'll start sorting out the fine details of
implementing it when I get back.
N
FDP Directory Reorganisation
Nik Clayton
nik@FreeBSD.ORG
This is an attempt to set down all my thoughts about my plans for an
FDP directory reorganisation so they can be critiqued. Comments
welcome to the <doc@freebsd.org> mailing list please.
_________________________________________________________________
Overview
The FreeBSD Documentation Project (FDP) directory structure has grown
haphazardly over time. This was tolerable when the FDP repository only
contained English versions of the documentation. However, as more
translations are added to the repository it becomes important to have
a consistent directory naming scheme followed by each translation.
A consistent directory naming scheme will make it easier to write
software that can automatically process FDP documentation without
needing to be configured as to exactly where that documentation is in
the tree; automated tools will be able to deduce this. Moving existing
content that conflicts with this scheme will make automated tools
simpler, as they will not need to handle exceptions to the rules.
Finally, a consistent approach is much easier to document and to
learn. Anything that can reduce the learning curve required before
people can contribute to the FDP is a good thing.
_________________________________________________________________
Current situation
At the time of writing, the doc/ repository contains the following
directories (ignoring empty directories);
    doc/
        FAQ/
        en/
            handbook/
            tutorials/
                docproj-primer/
                fonts/
                ...
            share/
                sgml/
        es/
            FAQ/
        ja/
            FAQ/
            handbook/
            man/
        ru/
            FAQ/
        share/
            sgml/
            mk/
        zh/
            FAQ/
There are a number of anomalies and potential problems with this
structure. It also gets a few things right.
* doc/FAQ is out of place. It is the English version of the FreeBSD
FAQ, and is a holdover from when the repository only contained the
English documentation.
* The English tutorials are one level lower in the tree than the
English Handbook. Any commands to process the documentation that
rely on relative paths will need to ensure that this is
compensated for before running the command. See the current
DOC_PREFIX kludge for an example of this.
* Some of the documentation in tutorials/ should not be considered
to be tutorials. A more neutral term would better describe the
content.
* No attempt is made to specify the character set used to write the
documentation. While this is not a problem for the English
translation, other languages, such as Japanese, Korean, and
Chinese, have multiple character sets that could be used to encode
the documentation. Some way of differentiating between these
character sets should be provided, as should a mechanism for
allowing multiple translations to the same language differing only
in the choice of character set.
* There is a proposed plan to split the Handbook up, and replace it
with a number of smaller books with a tighter focus. The existing
layout does not support this approach at all.
* The use of share/ directories to contain files that are language
neutral (in the first case) or can be used by all translations to
a specific language (in the second case) is a good idea.
_________________________________________________________________
The change
Migrate to a new directory structure that follows this layout;
    doc/
        lang.charset/
            articles/
                fonts/
                ...
            books/
                FAQ/
                FDP-primer/
                printing/
                ...
            man/
                ...
            share/
                sgml/
        share/
            sgml/
                ...
            mk/
                ...
The first top level directory represents both the language and the
character set code used for this translation. These directory names
come direct from /usr/share/locale, and examples include
en_US.ISO_8859-1 and zh_CN.EUC.
The language codes are defined in ISO639, which can be found in
/usr/share/misc/iso639 on a relatively recent FreeBSD system.
Yes, this is slight overkill. English, for example, could simply be
left as en. However, this brings some benefits which I think are
worthwhile.
Firstly, the directory names will be completely consistent, both with
one another, and with another, established directory hierarchy within
FreeBSD. Continuing senseless incompatibility is a bad idea.
Secondly, for non-English users, the directory name should match their
setting for the LANG environment variable.
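A minimal sketch of how a tool might exploit this, assuming the
proposed layout is adopted. It takes a LANG value and returns the
matching documentation directory. The function name doc_dir_for_lang,
the default root, and the fallback to the English tree are all
illustrative assumptions, not part of the plan.

```python
import os


def doc_dir_for_lang(lang, doc_root="doc", fallback="en_US.ISO_8859-1"):
    """Return the documentation directory for a LANG value.

    Hypothetical helper: assumes translations live in doc_root/<LANG>/,
    as the proposed layout describes. Falls back to the English tree
    when no matching directory exists (the fallback is an assumption).
    """
    candidate = os.path.join(doc_root, lang) if lang else None
    if candidate and os.path.isdir(candidate):
        return candidate
    return os.path.join(doc_root, fallback)
```

A caller would typically pass os.environ.get("LANG", "") as the first
argument, so a user with LANG set to ja_JP.EUC is pointed at
doc/ja_JP.EUC/ without any per-language configuration.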
The second top level directory is share/, which will contain language
neutral files.
Below these directories, the documentation is categorised further.
There are three categories that each document might be in;
articles/
An article is a short piece of documentation (although
``short'' is a relative term). In general, if the documentation
does not contain any chapters then it is an article, and should
be placed in a subdirectory of this directory.
``article'' is a neutral term that does not convey information
about the nature of the information contained within the
article (unlike ``tutorials'').
Examples of existing documentation that would fall in to this
category are;
+ Using FreeBSD with other Operating Systems
+ ``Making the world'' your own
+ This document.
books/
Books are longer sets of documentation, characterised by their
organisation in to multiple chapters.
Examples of existing documentation that would fall in to this
category are;
+ FreeBSD FAQ
+ FreeBSD Handbook
+ FDP Primer
man/
The system manual pages, translated to the target language.
While it is feasible that the English manual pages could move
out of the src/ repository and in to doc/, I don't see this
actually happening any time soon (certainly not within my
lifetime). The historical pressure to keep them in src/ is too
great.
This directory will have the traditional manN directories (man1,
man2, and so on), to further categorise the manual pages into
their appropriate sections.
share/
Content that can be shared between different documentation, but
is language and character set specific.
For example, as a translation team translates the documentation
there will be sections that haven't been translated yet. You
can put the translation of the phrase ``This section has not
been translated yet'' into a file in this directory, and then
use a general entity to include it in all the documentation
where it is necessary.
Why bother with the distinction between books and articles?
We need something to distinguish between manual pages and everything
else. Otherwise we would have this directory filled with one directory
for each piece of documentation, and then one more directory which would
contain all the manual pages. This is not an appealing idea.
So we need at least one directory to lump all the non-manual pages in
to. Finding a useful name for this one directory is hard. tutorials is
wrong, as many of them are not tutorials. docs is too non-specific
(after all, the manual pages are ``docs'' as well).
Trying to classify the documentation by its content is practically
impossible. For example, would ``printing'' come under a hypothetical
system administration section, or a using FreeBSD section, or would it
have a section in its own right? The discussions would rage for days,
and in many ways each point of view would be equally valid.
This approach neatly sidesteps all that, and provides a simple test to
determine where a piece of documentation belongs. If it has chapters
then it is a book, if it does not then it is an article.
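The test is simple enough to automate. The following is a rough
sketch only, assuming the document is DocBook source held in a single
string; the names CHAPTER_TAG and category_for are made up for
illustration.

```python
import re

# Matches an opening <chapter> tag, with or without attributes.
# Assumption: DocBook SGML source, where tag names are case-insensitive.
CHAPTER_TAG = re.compile(r"<chapter[\s>]", re.IGNORECASE)


def category_for(source_text):
    """Return "books" if the markup contains a chapter, else "articles"."""
    return "books" if CHAPTER_TAG.search(source_text) else "articles"
```

A book whose chapters are pulled in from separate files would need
those files scanned as well, which this sketch does not attempt.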
I am prepared to replace this with just one directory if someone can
come up with a good name for it.
Based on the current doc/, the converted directory structure will look
like this.
    doc/
        en_US.ISO_8859-1/
            articles/
                writing-device-drivers/
                programming-tools/
                formatting-media/
                ...
            books/
                FAQ/
                FDP-primer/
                handbook/
                ...
            share/
                sgml/
        ja_JP.EUC/
            books/
                FAQ/
                handbook/
                ...
            man/
                ...
            share/
                sgml/
        zh_CN.EUC/
            books/
                FAQ/
            share/
                sgml/
        zh_TW.BIG5/
            books/
                FAQ/
            share/
                sgml/
        fr_FR.ISO_8859-1/
            books/
                handbook/
            share/
                sgml/
        ...
        share/
            sgml/
            mk/
That might change slightly. For example, if the French translations
(which I'll be committing as soon as this directory re-org is out of
the way) use Latin2 as the character set then the directory name
becomes fr_FR.ISO_8859-2 instead.
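With a layout this regular, a tool can work out what a document is
from its path alone. The sketch below assumes the converted structure
shown above; DocLocation, CATEGORIES, and parse_doc_path are
hypothetical names, not existing FDP tools.

```python
from collections import namedtuple

# Hypothetical record describing where a document sits in the tree.
DocLocation = namedtuple("DocLocation", "language charset category name")

# The four per-translation categories described in this proposal.
CATEGORIES = ("articles", "books", "man", "share")


def parse_doc_path(path):
    """Split a path such as 'doc/ja_JP.EUC/books/handbook' into its parts.

    Returns None for paths that do not fit the per-translation layout,
    such as the language neutral 'doc/share/sgml'.
    """
    parts = path.strip("/").split("/")
    if len(parts) < 4 or parts[0] != "doc":
        return None
    locale, category, name = parts[1], parts[2], parts[3]
    language, dot, charset = locale.partition(".")
    if not dot or category not in CATEGORIES:
        return None
    return DocLocation(language, charset, category, name)
```

For example, parse_doc_path("doc/ja_JP.EUC/books/handbook") gives a
record with language ja_JP, charset EUC, category books, and name
handbook, with no exceptions to special-case.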
_________________________________________________________________
Making the change
This is quite a large change, and will need careful thought about how
to carry it out. In particular, we want to avoid bloating the CVS
repository any more than we have to.
How files are moved will depend on their current DTD.
All documentation that is already marked up according to the DocBook
DTD (and the manual pages) can be moved within the repository by the
repository managers (Peter Wemm and John Polstra). Some of the
Makefiles will then need small changes made to them to reflect the
directory names, but that should be about all.
All documentation that is marked up according to the LinuxDoc DTD is
treated differently. The original files are left where they are. Then,
when the documentation is converted to DocBook the original LinuxDoc
files are left, and the new DocBook files will be stored in the new
directories as appropriate. We will then have two versions of the
document in the repository, one marked up in LinuxDoc, one in DocBook.
The Makefiles can continue to point to the LinuxDoc version until the
DocBook conversion has completed. When the DocBook conversion has been
completed the LinuxDoc version can be removed.
The conversion will be complete when the last piece of LinuxDoc
documentation has been removed from the tree.
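The two cases above can be summarised as a small decision function.
This is only a sketch of the rule as described; the DTD labels and the
name migration_action are hypothetical, and nothing here says how a
tool would detect a document's DTD.

```python
def migration_action(dtd):
    """Return the step the plan above calls for, given a document's DTD.

    The labels "docbook", "linuxdoc", and "man" are assumptions made
    for this sketch.
    """
    if dtd in ("docbook", "man"):
        # Already in the target markup: a repository-level move suffices.
        return "move within the repository"
    if dtd == "linuxdoc":
        # Leave the original in place; the DocBook conversion is created
        # in the new directory, and the LinuxDoc copy is removed later.
        return "leave in place, add DocBook version in new directory"
    raise ValueError("unknown DTD: %r" % (dtd,))
```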
_________________________________________________________________
Additional resources
I've found the following links useful while trying to find out more
information about i18n and l10n.
http://czyborra.com/charsets/
Lots of information about different character sets, the
iso8859* characters, and so on.
http://www.ora.com/people/authors/lunde/cjk_inf.html
The Chinese, Japanese, Korean information page has lots of
information about how to encode these languages.
http://www.vlsivie.tuwien.ac.at/mike/8bit/FAQ-ISO-8859-1
The ISO8859-1 FAQ contains useful information.
--
There's some milk in the fridge about to go off. . . and there it goes.
