Date: Tue, 3 Aug 1999 16:33:16 +0100
From: Nik Clayton <nik@freebsd.org>
To: Jeroen Ruigrok/Asmodai <asmodai@wxs.nl>
Cc: Tim Vanderhoek <vanderh@ecf.utoronto.ca>, Greg Lehey <grog@lemis.com>, Mike Pritchard <mpp@mpp.pro-ns.net>, Bruce Evans <bde@zeta.org.au>, rnordier@nordier.com, doc@freebsd.org, nik@freebsd.org
Subject: Re: cvs commit: src/sbin/disklabel disklabel.8
Message-ID: <19990803163315.D39416@kilt.nothing-going-on.org>
In-Reply-To: <19990803083741.B58351@daemon.ninth-circle.org>; from Jeroen Ruigrok/Asmodai on Tue, Aug 03, 1999 at 08:37:41AM +0200
References: <199908010038.KAA16506@godzilla.zeta.org.au> <199908011141.GAA02125@mpp.pro-ns.net> <19990803113759.J62948@freebie.lemis.com> <19990802225533.A19050@mad> <19990803083741.B58351@daemon.ninth-circle.org>

On Tue, Aug 03, 1999 at 08:37:41AM +0200, Jeroen Ruigrok/Asmodai wrote:
> > Because using strictly mdoc will make it easier to change all the
> > manpages to DocBook?
>
> Now this is interesting.

I was hoping this particular can of worms wouldn't come up -- at least not
for a while anyway. No such luck :-)

[...]

> I would like to hear some comments on this because it might truly be the
> way to proceed on, but I still have some doubts and blanks about how to
> realise a few goals.

Here's my current take on manpages-in-DocBook -- this is not too well
organised, as it's something I tend to think about for three or four
minutes at a time, and then I start thinking about something (anything)
more interesting instead.

First of all, there is a precedent for having system manual pages marked
up in DocBook (or DocBook-lite, or whatever). Sun's Solaris uses a
variant of DocBook (called SolBook) which at least some of their manual
pages are written in. I don't know what proportion of the pages this is
though, or why Sun chose to do this.

That said, I remain to be convinced that it would be a good idea for
FreeBSD -- certainly for the near to middle future (over the next year or
so, at least).

One of the 'killer-apps' that's missing from DocBook is a good, standard
mechanism to go from DocBook to *roff based markup. We have DocBook to
HTML, plain text, PostScript, PDF, and RTF, but not *roff.

There are a number of approaches that could be used to tackle this
problem. In increasing order of complexity (and increasing order of
desirability, at least to my mind) these are:

1. Write a program that does this, and nothing else. This might be in
   C, Python, Perl, or similar. The formatting rules would be embedded
   in the program, making it pretty useless as a general purpose
   formatter. This is the simplest approach, and also the least
   expandable.
   There already exist Perl implementations of this approach -- a
   DocBook RefEntry -> man page converter can be found on the web,
   probably somewhere under <URL:http://www.oasis-open.org/docbook/>

2. Try and find a program that can apply its own proprietary stylesheet
   or other formatting language to DocBook documents, and then write a
   stylesheet in this language, to go from DocBook to *roff.

   For example, instant (ports/textproc/instant) can do this. I'm not
   at a machine I can check this on, but I'm fairly certain there are
   instant(1) 'translation specifications' to go from DocBook to *roff.
   If not, it's pretty easy to write one.

3. Try and find a 'standard' stylesheet language, along with a processor
   for these stylesheets, and then write your stylesheets in this
   standard language, and hope for the best.

There are three contenders for this approach.

The first is Jade, which is what we're using to do the conversion to the
other formats at the moment. Jade is moderately big, and (here's the
kicker) cannot currently produce *roff output, which pretty much removes
it from consideration at the moment. Jade's stylesheets are written in a
Lisp/Scheme-ish language called DSSSL, which some people find to be a
turn off.

I don't think Jade (or its successor, OpenJade) is likely to be able to
do this. People have been talking about writing a *roff backend for Jade
(which is written in C++, big, and not very well internally documented)
for a year or more now, and no one's actually stepped up and done the
work.

The second contender uses XML and XSL. For the purposes of this
discussion XML == SGML-lite, and XSL is a procedural stylesheet language
(unlike DSSSL).

In theory, we could convert the DocBook documents to XML (that's a
no-brainer, and easy to do). We would then have to write some XSL
stylesheets, and then run a hypothetical processor over the XML and the
XSL to produce *roff.
For this we need the XSL processor, which doesn't really exist yet --
also, the XSL language is in a state of flux at the moment.

The third contender uses XML and XSLT. XSLT is a companion to XSL -- XSL
is a 'style and formatting' language, it takes things like

    <sect1>
      <title>This is a title</title>
      <para>This is a para...</para>

and converts that into formatting instructions targeted at whatever
output you're producing (*roff commands, PostScript code, and so on).

XSLT, on the other hand, is used to transform (that's the 'T') documents
from one DTD to another.

So, if you wanted to go from DocBook to PostScript you'd write a
stylesheet in XSL, but if you wanted to go from DocBook to HTML you'd
write a stylesheet in XSLT (because DocBook and HTML are both DTDs, so
going from one to the other is a transformation -- in the case of DocBook
to HTML it's a lossy transformation).

What I think we should have is a RoffDTD. This would be markup designed
to capture the ins and outs of *roff markup. This DTD should be designed
so that converting from RoffDTD to actual *roff markup is as simple as
possible.

This might be a bit too ambitious, so maybe an MDOCDTD, or similar
instead -- whatever. The aim is to have a final DTD which can be used to
mark up documents which can then be easily converted to *roff.

Then going from DocBook to *roff becomes a two-step process: first you
convert the document from the DocBook DTD to the RoffDTD (or MDOCDTD, or
whatever). This transformation is carried out by a stylesheet written in
XSLT (or, possibly, using Jade, which has an extension to DSSSL to
support transforming from one DTD to another, which is how the DocBook to
HTML conversion is carried out).

Then you convert from the RoffDTD down to *roff markup, and process from
there.
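
To make that final step concrete, here is a minimal sketch in Python of
going from an intermediate DTD down to mdoc. Everything about the input
is an assumption: no such DTD exists yet, so the <manpage>, <sh> and <pp>
elements and their attributes are hypothetical names invented purely for
illustration, and the output covers only a handful of mdoc macros.

```python
import xml.etree.ElementTree as ET


def roff_escape(text):
    """Collapse whitespace and protect characters that roff treats specially."""
    line = " ".join(text.split()).replace("\\", "\\e")
    # A leading '.' or "'" would be read as a roff request, so guard it with \&.
    if line.startswith((".", "'")):
        line = "\\&" + line
    return line


def to_mdoc(xml_source):
    """Convert a document in the hypothetical intermediate DTD to mdoc markup.

    Assumed (invented) input shape:
      <manpage name="..." section="..." date="...">
        <sh title="..."> <pp>text</pp> ... </sh> ...
      </manpage>
    """
    root = ET.fromstring(xml_source)
    out = [
        ".Dd " + root.get("date"),
        ".Dt {} {}".format(root.get("name").upper(), root.get("section")),
        ".Os",
    ]
    for sh in root.findall("sh"):
        out.append(".Sh " + sh.get("title").upper())
        for i, pp in enumerate(sh.findall("pp")):
            if i > 0:
                out.append(".Pp")  # paragraph break between consecutive <pp>
            out.append(roff_escape(pp.text or ""))
    return "\n".join(out) + "\n"
```

Each hypothetical element maps onto exactly one macro, so the converter
carries almost no formatting knowledge of its own. That is the property
the intermediate DTD is meant to buy: the hard decisions live in the
DocBook -> RoffDTD transformation, and this last step stays mechanical.
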
This last approach is the most flexible -- it allows you to transform
from arbitrary DTDs to RoffDTD, using any software that implements the
standard XSLT language, and then to go to *roff with a final step that's
hopefully quite simple.

However, there are one or two problems with this:

1. No one's written RoffDTD yet -- I don't know *roff at all well, and a
   lot of the above is handwaving on my part -- I assume that it's
   possible to write a DTD that (a) accurately captures *roff
   formatting, and (b) is easy to mechanically convert to the *roff
   formatting codes, but I have no actual proof of this.

2. I haven't found a lightweight XSLT parser yet. All the ones I've
   looked at are written in Java, and I'm not for one moment suggesting
   that we bring the hulking behemoth that is Java into the base
   system. As I type this, the 30MB or so of source code that's
   required as dependencies for the textproc/lotusxsl port is
   downloading in another window, and I wouldn't want to force that on
   anyone.[1]

And, on top of that, I'm not really sure we need the system man pages in
anything other than mdoc at the moment. About the only real benefit that
I can think of is that it would make conversion to HTML a little simpler.
But the toolchain really isn't there to support it yet.

FWIW, Chuck Robey is working on a lightweight DocBook -> *roff formatter
which probably falls into category (1) above. I know he's very busy at
the moment -- he'll probably drop a note in on this discussion if he's
got the time, but if he doesn't then it's probably best not to bother him
at the moment. This might well fill the gap sufficiently that starting to
contemplate a migration from mdoc to DocBook would be worthwhile, but
it's certainly a few months away from completion.

So, to sum up -- man pages in DocBook is a nice idea, but I don't think
it's of overwhelming importance yet. The toolchain isn't there, and there
are lots of other things to do on the documentation that are (IMHO) more
pressing.

N

[1] Just in case anyone's wondering why; I purchased a Palm Pilot
    recently, and I'm spending a little bit of spare time investigating
    how hard it would be to get documents like the FAQ on to the Pilot.
    More news as and when.
-- 
 [intentional self-reference] can be easily accommodated using
 a blessed, non-self-referential dummy head-node whose own object
 destructor severs the links.
    -- Tom Christiansen in <375143b5@cs.colorado.edu>

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-doc" in the body of the message