Date: Mon, 23 Feb 2004 14:26:25 -0500
From: Chuck Swiger <cswiger@mac.com>
To: Dag-Erling Smørgrav <des@des.no>
Cc: Alex Dupre <ale@FreeBSD.org>
Subject: Re: Validating docbook articles...
Message-ID: <403A53E1.2040305@mac.com>
In-Reply-To: <xzpd686huyw.fsf@dwp.des.no>
References: <8D03FA54-4BA6-11D8-8D97-003065ABFD92@pkix.net> <20040216130659.GC617@submonkey.net> <4031364A.2070708@pkix.net> <20040222181114.GB32524@graf.pompo.net> <40390248.1060104@pkix.net> <4039D0FE.3010905@FreeBSD.org> <xzpd686huyw.fsf@dwp.des.no>

Dag-Erling Smørgrav wrote:
> Alex Dupre <ale@FreeBSD.org> writes:
>> [ ...talking about -preserve in tidy... ]
> This reminds me of the many good reasons to convert the doc tree to
> XML.  One of these is that xmllint can both validate input files and
> clean up output files, and it does a far better job of it than tidy.

An interesting idea.  I took a quick look at converting an existing SGML
document into XML in order to gain some idea of the work involved.
Given an SGML prologue of:

<!DOCTYPE article PUBLIC "-//FreeBSD//DTD DocBook V4.1-Based Extension//EN" [
<!ENTITY % man PUBLIC "-//FreeBSD//ENTITIES DocBook Manual Page Entities//EN">
%man;
<!ENTITY % freebsd PUBLIC "-//FreeBSD//ENTITIES DocBook Miscellaneous FreeBSD Entities//EN">
%freebsd;
<!ENTITY % trademarks PUBLIC "-//FreeBSD//ENTITIES DocBook Trademark Entities//EN">
%trademarks;
]>

...from doc/en_US.ISO8859-1/articles/filtering-bridges (written by ale@,
of course :-), it's easy to add an XML prologue (this could be done
automatically), and "make lint" works just fine with an XML declaration
in place.  So far, so good.

How does one generate proper SystemLiterals per:

|4.2.2 External Entities
|
|[Definition: If the entity is not internal, it is an external entity,
|declared as follows:]
|
|External Entity Declaration
|
|[75] ExternalID ::= 'SYSTEM' S SystemLiteral
|                  | 'PUBLIC' S PubidLiteral S SystemLiteral

69-sec% xmllint article.sgml
article.sgml:3: parser error : SystemLiteral " or ' expected
<!DOCTYPE article PUBLIC "-//FreeBSD//DTD DocBook V4.1-Based Extension//EN" [
^
article.sgml:3: parser error : SYSTEM or PUBLIC, the URI is missing
<!DOCTYPE article PUBLIC "-//FreeBSD//DTD DocBook V4.1-Based Extension//EN" [
^
article.sgml:4: parser error : Space required after the Public Identifier
<!ENTITY % man PUBLIC "-//FreeBSD//ENTITIES DocBook Manual Page Entities//EN">
^
article.sgml:4: parser error : SystemLiteral " or ' expected
<!ENTITY % man PUBLIC "-//FreeBSD//ENTITIES DocBook Manual Page Entities//EN">
^
article.sgml:4: parser error : SYSTEM or PUBLIC, the URI is missing
<!ENTITY % man PUBLIC "-//FreeBSD//ENTITIES DocBook Manual Page Entities//EN">
^
article.sgml:5: parser warning : PEReference: %man; not found
%man;
^
[ ... ]

Are these entities published via a URI, or does one need to refer to a
local path?  Is there a tool to update (normalize?) these ENTITY
declarations automatically?  Using "xmllint --catalogs --loaddtd" didn't
seem to help.

Maybe this seems trivial, but there are several hundred SGML source files
which would all need to be updated this way...

-- 
-Chuck
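
As a rough illustration of the kind of automatic update asked about above, here is a minimal Python sketch that appends a system literal to every PUBLIC identifier that lacks one. It is not an existing tool. The function name add_system_ids, the SYSTEM_IDS table, and the file names in that table are all assumptions made for the example; the real system identifiers (URI or local path) would have to come from wherever the DTD and entity sets actually live.

```python
import re

# Hypothetical mapping from public identifier to system identifier.
# The right-hand values are placeholders, not confirmed locations.
SYSTEM_IDS = {
    "-//FreeBSD//DTD DocBook V4.1-Based Extension//EN": "freebsd.dtd",
    "-//FreeBSD//ENTITIES DocBook Manual Page Entities//EN": "man-refs.ent",
    "-//FreeBSD//ENTITIES DocBook Miscellaneous FreeBSD Entities//EN": "freebsd.ent",
    "-//FreeBSD//ENTITIES DocBook Trademark Entities//EN": "trademarks.ent",
}

# PUBLIC "pubid" that is NOT followed by a quoted system literal.
# The lookahead requires the next non-blank character to be something
# other than a quote (for example "[" or ">").
PUBLIC_WITHOUT_SYSTEM = re.compile(r'(PUBLIC\s+"([^"]+)")(?=\s*[^\s"\'])')


def add_system_ids(text, system_ids=SYSTEM_IDS):
    """Return text with a system literal added after each known public ID."""

    def repl(match):
        sysid = system_ids.get(match.group(2))
        if sysid is None:
            # Unknown public identifier: leave the declaration untouched.
            return match.group(1)
        return f'{match.group(1)} "{sysid}"'

    return PUBLIC_WITHOUT_SYSTEM.sub(repl, text)


SAMPLE = """<!DOCTYPE article PUBLIC "-//FreeBSD//DTD DocBook V4.1-Based Extension//EN" [
<!ENTITY % man PUBLIC "-//FreeBSD//ENTITIES DocBook Manual Page Entities//EN">
%man;
]>
"""

if __name__ == "__main__":
    print(add_system_ids(SAMPLE))
```

Declarations that already carry a system literal are skipped, so running the function twice over the same file should leave it unchanged. Whether the added identifiers resolve at validation time still depends on how the catalogs are set up; the sketch only makes the declarations syntactically valid XML per production [75].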