Date: Fri, 5 Nov 1999 08:18:42 +0000
From: Nik Clayton <nik@freebsd.org>
To: Wolfram Schneider <wosch@cs.tu-berlin.de>
Cc: doc@freebsd.org, wosch@freebsd.org
Subject: Re: HTML to XML converter.
Message-ID: <19991105081842.A88120@kilt.nothing-going-on.org>
In-Reply-To: <19991104182818.A9400@freno.cs.tu-berlin.de>; from Wolfram Schneider on Thu, Nov 04, 1999 at 06:28:18PM +0100
References: <19991104182818.A9400@freno.cs.tu-berlin.de>
On Thu, Nov 04, 1999 at 06:28:18PM +0100, Wolfram Schneider wrote:
> I'm seeking a HTML to XML converter. Is this possible with the
> FreeBSD sgml tools (jade, tiny etc.)?

First off, that's not quite what you want. You want a converter to translate documents marked up in one DTD (HTML) to another DTD (which you haven't shown us, but which I assume will be described in XML).

There are three approaches you could use to do this.

If you stick with the tools installed by the textproc/docproj port then Jade can translate files between two DTDs. That's how the DocBook to HTML conversion is done. Of course, you need to describe the mapping between the two DTDs, and in Jade you do that using a very Scheme-ish syntax. See all the files in $PREFIX/share/sgml/docbook/dsssl/modular/html/ for a (quite complicated) example.

The second approach is to use a language designed for this, called XSLT (XSL Transformations; XSL is the Extensible Stylesheet Language). You will still need to write the mapping between the two DTDs, but this time you write it in XSLT, which reads much more like a procedural language. I haven't played with this much myself, and the textproc/docproj port won't install an XSLT processor by default. However, if you've got space to burn then look at textproc/lotusxsl (I don't have a ports tree to hand, so I might have got that reference wrong; grep for 'lotus' in /usr/ports/INDEX to be sure). The only snag with this approach is that most of the XSLT processors (including this one) are written in Java, so you're going to need the JDK (all 17MB of it, or thereabouts) installed first.

The first two approaches have the advantage of being reasonably standard. You could migrate your XSLT stylesheets between different XSLT processors on different platforms, for example.

If that's not important to you then investigate instant, which is probably part of textproc/sgmlformat. This was how we used to do DocBook to HTML conversions, and it has a very simple language designed to do this and not much else. The snag is that the syntax is specific to instant, but it'll be by far the simplest approach.

N
-- 
A different "distribution" of Linux is really a different operating system.
They just refuse to call it that because it's bad press. But that's what
the shoe fits.
    -- Tom Christiansen, <199910211639.KAA18701@jhereg.perl.com>
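To make the "mapping between the two DTDs" idea concrete, here is a minimal sketch in Python. It is not one of the three approaches described above (DSSSL with Jade, XSLT, or instant), only an illustration of what such a mapping amounts to. The target element names in TAG_MAP are invented for the example, since the target DTD was never shown, and the sketch assumes well-nested HTML in which every element is explicitly closed. Attributes are simply dropped.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical mapping from HTML element names to a made-up target DTD.
TAG_MAP = {
    "html": "document",
    "body": "content",
    "h1": "title",
    "p": "para",
    "em": "emphasis",
}


class Remapper(HTMLParser):
    """Rebuild well-nested HTML as an XML tree with renamed elements."""

    def __init__(self):
        super().__init__()
        self.builder = ET.TreeBuilder()

    def handle_starttag(self, tag, attrs):
        # Elements without a mapping keep their HTML name.
        # Attributes are dropped in this sketch.
        self.builder.start(TAG_MAP.get(tag, tag), {})

    def handle_endtag(self, tag):
        self.builder.end(TAG_MAP.get(tag, tag))

    def handle_data(self, data):
        self.builder.data(data)


def convert(html_text):
    """Return the remapped document as an XML string."""
    parser = Remapper()
    parser.feed(html_text)
    parser.close()
    root = parser.builder.close()
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(convert("<html><body><h1>Hi</h1><p>Some <em>text</em>.</p></body></html>"))
```

Real HTML is rarely this tidy (unclosed <p> and <br> elements, for instance), which is a large part of why the dedicated tools are worth the trouble.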
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?19991105081842.A88120>