Date: Wed, 19 May 1999 21:40:22 +0100
From: Nik Clayton
To: Nik Clayton
Cc: doc@freebsd.org, freebsd-translate@ngo.org.uk
Subject: Re: FDP Directory Reorganisation

Folks,

Here's v2.0 of the re-org plan. I would've just sent a diff, but the diff is almost as big anyway.

The main change is replacing lang/encoding with lang.encoding, based on the contents of /usr/share/locale.

The articles/books/man split is still there, and will remain unless someone can come up with something better.

I'm away from my mailbox from Friday night through to Tuesday night, so I expect to come back to a large discussion thread :-) If a consensus has been reached then I'll start sorting out the fine details of implementing it when I get back.

N

FDP Directory Reorganisation

Nik Clayton
nik@FreeBSD.ORG

This is an attempt to put down all my thoughts about my plans for an FDP directory reorganisation so that they can be critiqued. Comments welcome to the mailing list please.
_________________________________________________________________

Overview

The FreeBSD Documentation Project (FDP) directory structure has grown haphazardly over time. This was tolerable when the FDP repository only contained English versions of the documentation. However, as more translations are added to the repository it becomes important to have a consistent directory naming scheme followed by each translation.

A consistent directory naming scheme will make it easier to write software that can automatically process FDP documentation without needing to be told exactly where that documentation is in the tree; automated tools will be able to deduce this.

Moving existing content that conflicts with this scheme will make automated tools simpler, as they will not need to handle exceptions to the rules.

Finally, a consistent approach is much easier to document and to learn. Anything that can reduce the learning curve required before people can contribute to the FDP is a good thing.

_________________________________________________________________

Current situation

At the time of writing, the doc/ repository contains the following directories (ignoring empty directories):

    doc/
        FAQ/
        en/
            handbook/
            tutorials/
                docproj-primer/
                fonts/
                ...
            share/
                sgml/
        es/
            FAQ/
        ja/
            FAQ/
            handbook/
            man/
        ru/
            FAQ/
        share/
            sgml/
            mk/
        zh/
            FAQ/

There are a number of anomalies and potential problems with this structure. It also gets a few things right.

* doc/FAQ is out of place. It is the English version of the FreeBSD FAQ, and is a holdover from when the repository only contained the English documentation.

* The English tutorials are one level lower in the tree than the English Handbook. Any command that processes the documentation and relies on relative paths has to compensate for this before it runs. See the current DOC_PREFIX kludge for an example of this.

* Some of the documentation in tutorials/ should not be considered to be tutorials.
  A more neutral term would better describe the content.

* No attempt is made to specify the character set used to write the documentation. While this is not a problem for the English translation, other languages, such as Japanese, Korean, and Chinese, have multiple character sets that could be used to encode the documentation. Some way of differentiating between these character sets should be provided, as should a mechanism for allowing multiple translations to the same language differing only in the choice of character set.

* There is a proposed plan to split the Handbook up, and replace it with a number of smaller books with a tighter focus. The existing layout does not support this approach at all.

* The use of share/ directories to contain files that are language neutral (in the first case) or can be used by all translations to a specific language (in the second case) is a good idea.

_________________________________________________________________

The change

Migrate to a new directory structure that follows this layout:

    doc/
        lang.charset/
            articles/
                fonts/
                ...
            books/
                FAQ/
                FDP-primer/
                printing/
                ...
            man/
                ...
            share/
                sgml/
        share/
            sgml/
                ...
            mk/
                ...

The first top level directory represents both the language and the character set code used for this translation. These directory names come directly from /usr/share/locale, and examples include en_US.ISO_8859-1 and zh_CN.EUC. The language codes are defined in ISO 639, which can be found in /usr/share/misc/iso639 on a relatively recent FreeBSD system.

Yes, this is slight overkill. English, for example, could simply be left as en. However, this brings some benefits which I think are worthwhile.

Firstly, the directory names will be completely consistent, both with one another, and with another, established directory hierarchy within FreeBSD. Continuing senseless incompatibility is a bad idea.

Secondly, for non-English users, the directory name should match their setting for the LANG environment variable.
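To make the naming rule concrete, here is a minimal sketch of how a tool might split one of these directory names into its parts and pick a directory from a user's LANG setting. The function names, the regular expression, and the English fallback are all assumptions made for illustration; nothing like this exists in the tree.

```python
import re

# Assumed pattern for names such as en_US.ISO_8859-1 or zh_CN.EUC:
# two-letter language, underscore, two-letter territory, dot, charset.
LOCALE_DIR = re.compile(r"([a-z]{2})_([A-Z]{2})\.([A-Za-z0-9_-]+)")


def parse_locale_dir(name):
    """Split a lang.charset directory name into (language, territory, charset)."""
    match = LOCALE_DIR.fullmatch(name)
    if match is None:
        raise ValueError("not a lang.charset directory name: %r" % (name,))
    return match.groups()


def pick_doc_dir(lang_env, available, fallback="en_US.ISO_8859-1"):
    """Return the directory that matches LANG exactly, else the fallback.

    The fallback to the English directory is an assumption of this sketch,
    not something the proposal specifies.
    """
    if lang_env in available:
        return lang_env
    return fallback
```

A caller would pass os.environ.get("LANG") and the list of directories found under doc/. Because the directory name and the LANG value are meant to be identical, the lookup needs no mapping table.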
The second top level directory is share/, which will contain language neutral files.

Below these directories, the documentation is categorised further. There are three categories that each document might be in:

articles/
    An article is a short piece of documentation (although ``short'' is a relative term). In general, if the documentation does not contain any chapters then it is an article, and should be placed in a subdirectory of this directory.

    ``article'' is a neutral term that does not convey information about the nature of the information contained within the article (unlike ``tutorials'').

    Examples of existing documentation that would fall in to this category are:

    + Using FreeBSD with other Operating Systems
    + ``Making the world'' your own
    + This document.

books/
    Books are longer sets of documentation, characterised by their organisation in to multiple chapters.

    Examples of existing documentation that would fall in to this category are:

    + FreeBSD FAQ
    + FreeBSD Handbook
    + FDP Primer

man/
    The system manual pages, translated to the target language.

    While it is feasible that the English manual pages could move out of the src/ repository and in to doc/, I don't see this actually happening any time soon (certainly not within my life time). The historical pressure to keep them in src/ is too great.

    This directory will have the traditional manN directories (man1, man2, and so on), to further categorise the manual pages into their appropriate sections.

share/
    Content that can be shared between different documentation, but is language and character set specific.

    For example, as a translation team translates the documentation there will be sections that haven't been translated yet. You can put the translation of the phrase ``This section has not been translated yet'' into a file in this directory, and then use a general entity to include it in all the documentation where it is necessary.

Why bother with the distinction between books and articles?
We need something to distinguish between manual pages and everything else. Otherwise we would have this directory filled with one directory for each piece of documentation, and then one more directory which would contain all the manual pages. This is not an appealing idea.

So we need at least one directory to lump all the non-manual pages in to. Finding a useful name for this one directory is hard. tutorials is wrong, as many of them are not tutorials. docs is too non-specific (after all, the manual pages are ``docs'' as well).

Trying to classify the documentation by its content is practically impossible. For example, would ``printing'' come under a hypothetical system administration section, or a using FreeBSD section, or would it have a section in its own right? The discussions would rage for days, and in many ways each point of view would be equally valid.

This approach neatly sidesteps all that, and provides a simple test to determine where a piece of documentation belongs. If it has chapters then it is a book; if it does not then it is an article.

I am prepared to replace this with just one directory if someone can come up with a good name for it.

Based on the current doc/, the converted directory structure will look like this.

    doc/
        en_US.ISO_8859-1/
            articles/
                writing-device-drivers/
                programming-tools/
                formatting-media/
                ...
            books/
                FAQ/
                FDP-primer/
                handbook/
                ...
            share/
                sgml/
        ja_JP.EUC/
            books/
                FAQ/
                handbook/
                ...
            man/
                ...
            share/
                sgml/
        zh_CN.EUC/
            books/
                FAQ/
            share/
                sgml/
        zh_TW.BIG5/
            books/
                FAQ/
            share/
                sgml/
        fr_FR.ISO_8859-1/
            books/
                handbook/
            share/
                sgml/
        ...
        share/
            sgml/
            mk/

That might change slightly. For example, if the French translations (which I'll be committing as soon as this directory re-org is out of the way) use Latin2 as the character set then the directory name becomes fr_FR.ISO_8859-2 instead.
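With a layout this regular, a tool can work out what a directory holds from its path alone. The following is only a sketch of that idea: the describe() function, its return format, and the error handling are hypothetical, and paths are assumed to be given relative to doc/.

```python
from pathlib import PurePosixPath

# The per-language categories named in the proposal.
CATEGORIES = {"articles", "books", "man", "share"}


def describe(path):
    """Deduce locale, category and document name from a path relative to doc/.

    Hypothetical helper for illustration; it assumes the proposed layout
    doc/<lang.charset>/<category>/<name>/ plus the top level doc/share/.
    """
    parts = PurePosixPath(path).parts
    if not parts:
        raise ValueError("empty path")
    name_at = 2
    if parts[0] == "share":
        # Top level share/ holds language neutral files, so no locale.
        locale, category, name_at = None, "share", 1
    elif len(parts) >= 2 and parts[1] in CATEGORIES:
        locale, category = parts[0], parts[1]
    else:
        raise ValueError("path does not follow the proposed layout: %r" % (path,))
    name = parts[name_at] if len(parts) > name_at else None
    return {"locale": locale, "category": category, "name": name}
```

For instance, describe("en_US.ISO_8859-1/books/handbook") identifies the English Handbook as a book, while a path from the old layout such as "en/handbook" is rejected, which is the kind of exception the reorganisation is meant to remove.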
_________________________________________________________________

Making the change

This is quite a large change, and will need careful thought about how to carry it out. In particular, we want to avoid bloating the CVS repository any more than we have to.

How files are moved will depend on their current DTD.

All documentation that is already marked up according to the DocBook DTD (and the manual pages) can be moved within the repository by the repository managers (Peter Wemm and John Polstra). Some of the Makefiles will then need small changes made to them to reflect the directory names, but that should be about all.

All documentation that is marked up according to the LinuxDoc DTD is treated differently. The original files are left where they are. When the documentation is converted to DocBook, the new DocBook files are stored in the new directories as appropriate.

We will then have two versions of the document in the repository, one marked up in LinuxDoc, one in DocBook. The Makefiles can continue to point to the LinuxDoc version until the DocBook conversion has completed. When the DocBook conversion has been completed the LinuxDoc version can be removed.

The conversion will be complete when the last piece of LinuxDoc documentation has been removed from the tree.

_________________________________________________________________

Additional resources

I've found the following links useful while trying to find out more information about i18n and l10n.

http://czyborra.com/charsets/
    Lots of information about different character sets, the iso8859* characters, and so on.

http://www.ora.com/people/authors/lunde/cjk_inf.html
    The Chinese, Japanese, Korean information page has lots of information about how to encode these languages.

http://www.vlsivie.tuwien.ac.at/mike/8bit/FAQ-ISO-8859-1
    The ISO8859-1 FAQ contains useful information.

--
There's some milk in the fridge about to go off. . . and there it goes.