Date:      Wed, 19 May 1999 21:40:22 +0100
From:      Nik Clayton <nik@nothing-going-on.demon.co.uk>
To:        Nik Clayton <nik@nothing-going-on.demon.co.uk>
Cc:        doc@freebsd.org, freebsd-translate@ngo.org.uk
Subject:   Re: FDP Directory Reorganisation
Message-ID:  <19990519214022.D60921@catkin.nothing-going-on.org>
In-Reply-To: <19990513211458.B70767@catkin.nothing-going-on.org>; from Nik Clayton on Thu, May 13, 1999 at 09:14:58PM +0100
References:  <19990513211458.B70767@catkin.nothing-going-on.org>

Folks,

Here's v2.0 of the re-org plan.  I would've just sent a diff, but the diff
is almost as big anyway.

The main change is replacing lang/encoding with lang.encoding, based on
the contents of /usr/share/locale.  The articles/books/man split is 
still there, and will remain unless someone can come up with something
better.

I'm away from my mailbox from Friday night through to Tuesday night, so 
I expect to come back to a large discussion thread :-)  If a consensus
has been reached then I'll start sorting out the fine details of 
implementing it when I get back.

N

FDP Directory Reorganisation

Nik Clayton

   nik@FreeBSD.ORG
   
   This is an attempt to set down all my thoughts about my plans for an
   FDP directory reorganisation so they can be critiqued. Comments
   welcome to the <doc@freebsd.org> mailing list please.
     _________________________________________________________________
   
Overview

   The FreeBSD Documentation Project (FDP) directory structure has grown
   haphazardly over time. This was tolerable when the FDP repository only
   contained English versions of the documentation. However, as more
   translations are added to the repository it becomes important to have
   a consistent directory naming scheme followed by each translation.
   
   A consistent directory naming scheme will make it easier to write
   software that can automatically process FDP documentation without
   needing to be configured as to exactly where that documentation is in
   the tree; automated tools will be able to deduce this. Moving existing
   content that conflicts with this scheme will make automated tools
   simpler, as they will not need to handle exceptions to the rules.
   
   Finally, a consistent approach is much easier to document and to
   learn. Anything that can reduce the learning curve required before
   people can contribute to the FDP is a good thing.
     _________________________________________________________________
   
Current situation

   At the time of writing, the doc/ repository contains the following
   directories (ignoring empty directories);
    doc/
        FAQ/
        en/
           handbook/
           tutorials/
                     docproj-primer/
                     fonts/
                     ...
           share/
                 sgml/
        es/
           FAQ/
        ja/
           FAQ/
           handbook/
           man/
        ru/
           FAQ/
        share/
              sgml/
              mk/
        zh/
           FAQ/

   There are a number of anomalies and potential problems with this
   structure. It also gets a few things right.
     * doc/FAQ is out of place. It is the English version of the FreeBSD
       FAQ, and is a holdover from when the repository only contained the
       English documentation.
     * The English tutorials are one level lower in the tree than the
       English Handbook. Any commands to process the documentation that
       rely on relative paths will need to ensure that this is
       compensated for before running the command. See the current
       DOC_PREFIX kludge for an example of this.
     * Some of the documentation in tutorials/ should not be considered
       to be tutorials. A more neutral term would better describe the
       content.
     * No attempt is made to specify the character set used to write the
       documentation. While this is not a problem for the English
       translation, other languages, such as Japanese, Korean, and
       Chinese, have multiple character sets that could be used to encode
       the documentation. Some way of differentiating between these
       character sets should be provided, as should a mechanism for
       allowing multiple translations to the same language differing only
       in the choice of character set.
     * There is a proposed plan to split the Handbook up, and replace it
       with a number of smaller books with a tighter focus. The existing
       layout does not support this approach at all.
     * The use of share/ directories to contain files that are language
       neutral (doc/share/) or that can be used by all translations to a
       specific language (for example, en/share/) is a good idea.
     _________________________________________________________________
   
The change

   Migrate to a new directory structure that follows this layout;
    doc/
        lang.charset/
                     articles/
                              fonts/
                              ...
                     books/
                           FAQ/
                           FDP-primer/
                           printing/
                           ...
                     man/
                         ...
                     share/
                           sgml/
        share/
              sgml/
                   ...
              mk/
                 ...

   The first top level directory represents both the language and the
   character set code used for this translation. These directory names
   come direct from /usr/share/locale, and examples include
   en_US.ISO_8859-1 and zh_CN.EUC.
   
   The language codes are defined in ISO639, which can be found in
   /usr/share/misc/iso639 on a relatively recent FreeBSD system.
   
   Yes, this is slight overkill. English, for example, could simply be
   left as en. However, this brings some benefits which I think are
   worthwhile.
   
   Firstly, the directory names will be completely consistent, both with
   one another, and with another, established directory hierarchy within
   FreeBSD. Continuing senseless incompatibility is a bad idea.
   
   Secondly, for non-English users, the directory name should match their
   setting for the LANG environment variable.
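   
   As a rough illustration of what the naming scheme above buys an
   automated tool, here is a minimal Python sketch. It assumes the
   directory names always have the form lang_TERRITORY.charset, as in the
   two examples given; the names DocLocale, parse_locale_dir and
   doc_dir_for_lang are made up for this example and are not part of any
   existing FDP tool.

```python
import os
from pathlib import Path
from typing import NamedTuple, Optional


class DocLocale(NamedTuple):
    language: str   # ISO 639 language code, e.g. "en"
    territory: str  # the part after the underscore, e.g. "US"
    charset: str    # character set name, e.g. "ISO_8859-1"


def parse_locale_dir(name: str) -> DocLocale:
    """Split a name such as 'en_US.ISO_8859-1' into its three parts."""
    # Split on the first "." only, so charset names keep any later dots.
    lang_territory, dot, charset = name.partition(".")
    language, underscore, territory = lang_territory.partition("_")
    if not (dot and underscore and language and territory and charset):
        raise ValueError(f"not a lang_TERRITORY.charset name: {name!r}")
    return DocLocale(language, territory, charset)


def doc_dir_for_lang(doc_root: Path, lang: Optional[str] = None) -> Optional[Path]:
    """Return doc_root/<LANG> if that directory exists, otherwise None.

    Assumption: the user's LANG value is spelled exactly like the
    directory name, which is what the proposal is aiming for.
    """
    if lang is None:
        lang = os.environ.get("LANG", "")
    # Reject empty values and anything that could escape doc_root.
    if not lang or "/" in lang or lang.startswith("."):
        return None
    candidate = doc_root / lang
    return candidate if candidate.is_dir() else None
```

   For example, parse_locale_dir("zh_CN.EUC") gives language "zh",
   territory "CN" and charset "EUC", while a name such as "share" raises
   ValueError, which is one way a tool could tell the language neutral
   directory apart from the translations.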
   
   The second top level directory is share/, which will contain language
   neutral files.
   
   Below these directories, the documentation is categorised further.
   There are three categories that each document might be in;
   
   articles/
          An article is a short piece of documentation (although
          ``short'' is a relative term). In general, if the documentation
          does not contain any chapters then it is an article, and should
          be placed in a subdirectory of this directory.
          
          ``article'' is a neutral term that does not convey information
          about the nature of the information contained within the
          article (unlike ``tutorials'').
          
          Examples of existing documentation that would fall in to this
          category are;
          
          + Using FreeBSD with other Operating Systems
          + ``Making the world'' your own
          + This document.
            
   books/
          Books are longer sets of documentation, characterised by their
          organisation in to multiple chapters.
          
          Examples of existing documentation that would fall in to this
          category are;
          
          + FreeBSD FAQ
          + FreeBSD Handbook
          + FDP Primer
            
   man/
          The system manual pages, translated to the target language.
          
          While it is feasible that the English manual pages could move
          out of the src/ repository and in to doc/, I don't see this
          actually happening any time soon (certainly not within my life
          time). The historical pressure to keep them in src/ is too
          great.
          
          This directory will have the traditional manN directories
          (man1, man2, and so on), to further categorise the manual
          pages into their appropriate sections.
          
   share/
          Content that can be shared between different documentation, but
          is language and character set specific.
          
          For example, as a translation team translates the documentation
          there will be sections that haven't been translated yet. You
          can put the translation of the phrase ``This section has not
          been translated yet'' into a file in this directory, and then
          use a general entity to include it in all the documentation
          where it is necessary.
          
   Why bother with the distinction between books and articles?
   
   We need something to distinguish between manual pages and everything
   else. Otherwise we would have this directory filled with one directory
   for each piece of documentation, and then one more directory which would
   contain all the manual pages. This is not an appealing idea.
   
   So we need at least one directory to lump all the non-manual pages in
   to. Finding a useful name for this one directory is hard. tutorials is
   wrong, as many of them are not tutorials. docs is too non-specific
   (after all, the manual pages are ``docs'' as well).
   
   Trying to classify the documentation by its content is practically
   impossible. For example, would ``printing'' come under a hypothetical
   system administration section, or a using FreeBSD section, or would it
   have a section in its own right? The discussions would rage for days,
   and in many ways each point of view would be equally valid.
   
   This approach neatly sidesteps all that, and provides a simple test to
   determine where a piece of documentation belongs. If it has chapters
   then it is a book, if it does not then it is an article.
   
   I am prepared to replace this with just one directory if someone can
   come up with a good name for it.
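   
   The chapter test described above is simple enough to sketch in a few
   lines of Python. This is only an illustration: the function name
   category_for is hypothetical, and the sketch assumes the whole DocBook
   source is available as one string. A document whose chapters live in
   separate files pulled in through entities would need a real SGML
   parser rather than a pattern match.

```python
import re

# DocBook SGML tag names are case-insensitive. The \b stops the pattern
# from matching longer element names such as <chapterinfo>.
_CHAPTER_TAG = re.compile(r"<chapter\b", re.IGNORECASE)


def category_for(docbook_source: str) -> str:
    """Return 'books' if the source has a chapter element, else 'articles'."""
    return "books" if _CHAPTER_TAG.search(docbook_source) else "articles"
```

   The returned string doubles as the name of the directory the document
   would live in under this proposal.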
   
   Based on the current doc/, the converted directory structure will look
   like this.
    doc/
        en_US.ISO_8859-1/
                        articles/
                                 writing-device-drivers/
                                 programming-tools/
                                 formatting-media/
                                 ...
                        books/
                              FAQ/
                              FDP-primer/
                              handbook/
                              ...
                        share/
                              sgml/
        ja_JP.EUC/
                  books/
                        FAQ/
                        handbook/
                        ...
                  man/
                      ...
                  share/
                        sgml/
        zh_CN.EUC/
                  books/
                        FAQ/
                  share/
                        sgml/
        zh_TW.BIG5/
                   books/
                         FAQ/
                   share/
                         sgml/
        fr_FR.ISO_8859-1/
                         books/
                               handbook/
                         share/
                               sgml/
        ...
        share/
              sgml/
              mk/

   That might change slightly. For example, if the French translations
   (which I'll be committing as soon as this directory re-org is out of
   the way) use Latin2 as the character set then the directory name
   becomes fr_FR.ISO_8859-2 instead.
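   
   To show how a tool could deduce where documentation lives from a
   layout like the one above, here is a hedged Python sketch that walks a
   tree and reports each document as a (locale, category, name) triple.
   The function name iter_documents and the decision to skip CVS
   administrative directories are assumptions made for this example. The
   man/ directories are left out because they hold section directories
   rather than one directory per document.

```python
from pathlib import Path
from typing import Iterator, Tuple

CATEGORIES = ("articles", "books")
# Top-level share/ holds language neutral files, not documents. Skipping
# CVS/ is an assumption about what a checked-out tree contains.
NOT_LOCALES = {"share", "CVS"}


def iter_documents(doc_root: Path) -> Iterator[Tuple[str, str, str]]:
    """Yield (locale, category, name) for every document directory."""
    for locale_dir in sorted(doc_root.iterdir()):
        if not locale_dir.is_dir() or locale_dir.name in NOT_LOCALES:
            continue
        for category in CATEGORIES:
            category_dir = locale_dir / category
            if not category_dir.is_dir():
                continue  # e.g. a translation with no articles yet
            for doc in sorted(category_dir.iterdir()):
                if doc.is_dir() and doc.name != "CVS":
                    yield (locale_dir.name, category, doc.name)
```

   Because every translation follows the same scheme, the function needs
   no per-language configuration and no special cases.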
     _________________________________________________________________
   
Making the change

   This is quite a large change, and will need careful thought about how
   to carry it out. In particular, we want to avoid bloating the CVS
   repository any more than we have to.
   
   How files are moved will depend on their current DTD.
   
   All documentation that is already marked up according to the DocBook
   DTD (and the manual pages) can be moved within the repository by the
   repository managers (Peter Wemm and John Polstra). Some of the
   Makefiles will then need small changes made to them to reflect the
   directory names, but that should be about all.
   
   All documentation that is marked up according to the LinuxDoc DTD is
   treated differently. The original LinuxDoc files are left where they
   are. When the documentation is converted to DocBook, the new DocBook
   files will be stored in the new directories as appropriate. We will
   then have two versions of the document in the repository, one marked
   up in LinuxDoc, one in DocBook. The Makefiles can continue to point to
   the LinuxDoc version until the DocBook conversion has been completed,
   at which point the LinuxDoc version can be removed.
   
   The conversion will be complete when the last piece of LinuxDoc
   documentation has been removed from the tree.
     _________________________________________________________________
   
Additional resources

   I've found the following links useful while trying to find out more
   information about i18n and l10n.
   
   http://czyborra.com/charsets/
          Lots of information about different character sets, the
          iso8859* characters, and so on.
          
   http://www.ora.com/people/authors/lunde/cjk_inf.html
          The Chinese, Japanese, Korean information page has lots of
          information about how to encode these languages.
          
   http://www.vlsivie.tuwien.ac.at/mike/8bit/FAQ-ISO-8859-1
          The ISO8859-1 FAQ contains useful information.
-- 
    There's some milk in the fridge about to go off. . . and there it goes.





