From: Chuck Swiger
Date: Mon, 23 Feb 2004 17:07:18 -0500
To: Dag-Erling Smørgrav
Cc: freebsd-doc@FreeBSD.org, Thierry Thomas, Alex Dupre
Subject: Re: Validating docbook articles...
Message-ID: <403A7996.1090602@mac.com>

Dag-Erling Smørgrav wrote:
> Chuck Swiger writes:
>>How does one generate proper SystemLiterals per:
>>[...]
>>Are these entities published via a URI, or does one need to refer to a
>>local path?
>
> The system literal can be anything as long as you have a catalog that
> reveals the real location of the external entity.  The usual practice
> for entities that rarely change is to create an online repository and
> let the system literal point to that.  In this case though you might
> as well use an empty or intentionally meaningless string.

Hmm.  Thanks for the response, which is helpful but seems incomplete from
the standpoint of compatibility with the existing SGML build using nsgmls.

Specifically, I can add a "" pair as the system literal, but xmllint
complains about an invalid URI, and nsgmls isn't any happier.  Using file
URIs works for xmllint, but not for nsgmls; using raw pathnames almost works
for both, i.e. something like:

%man; %freebsd; %trademarks; ]>

...(rather than "file:///usr/doc...") results in:

170-sec% make lint
/usr/local/bin/nsgmls -wempty -wunclosed -s
    -D /usr/obj/usr/doc/en_US.ISO8859-1/articles/fb
    -c /usr/doc/en_US.ISO8859-1/share/sgml/catalog
    -c /usr/doc/share/sgml/catalog
    -c /usr/local/share/sgml/iso8879/catalog
    -c /usr/local/share/sgml/jade/catalog
    -c /usr/local/share/sgml/catalog.ports
    /usr/doc/en_US.ISO8859-1/articles/fb/article.sgml
/usr/local/bin/nsgmls:/usr/doc/en_US.ISO8859-1/articles/fb/article.sgml:173:17:E : element "DEVICENAME" undefined
/usr/local/bin/nsgmls:/usr/doc/en_US.ISO8859-1/articles/fb/article.sgml:175:27:E : element "DEVICENAME" undefined
[ ... ]

...whereas not using a SystemLiteral with the DOCTYPE declaration works fine
with nsgmls, but xmllint refuses to parse the document.  Am I wrong in
concluding that by requiring a SystemLiteral for a document that is valid
SGML, XML fails design goal #3, aka "XML shall be compatible with SGML"...?
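
As a quick illustration of the SystemLiteral requirement discussed above,
here is a minimal Python sketch.  It assumes only the standard library's
expat-based xml.etree.ElementTree parser rather than xmllint or nsgmls, and
the public identifier, "placeholder.dtd", and the parses() helper are
hypothetical names chosen for the example, not taken from the FreeBSD doc
tree.  A non-validating parser never fetches the DTD, so this only shows what
the XML grammar accepts in the DOCTYPE declaration.

```python
import xml.etree.ElementTree as ET

# Hypothetical public identifier; NOT the real FreeBSD DocBook identifier.
PUBID = "-//EXAMPLE//DTD Placeholder//EN"

# XML form: PUBLIC identifier followed by a system literal (any string here).
WITH_SYSTEM = f'<!DOCTYPE article PUBLIC "{PUBID}" "placeholder.dtd"><article/>'

# SGML-style form: PUBLIC identifier only, no system literal.
WITHOUT_SYSTEM = f'<!DOCTYPE article PUBLIC "{PUBID}"><article/>'


def parses(doc: str) -> bool:
    """Return True if the expat-based parser accepts doc as well-formed XML."""
    try:
        ET.fromstring(doc)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    for label, doc in (("with system literal", WITH_SYSTEM),
                       ("without system literal", WITHOUT_SYSTEM)):
        print(f"{label}: {'accepted' if parses(doc) else 'rejected'}")
```

Under these assumptions the first document should be accepted and the second
rejected, because the XML grammar requires a system literal after a public
identifier in the DOCTYPE declaration, whereas SGML lets it be omitted.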

Anyway, using explicit SLs with xmllint gives me:

180-sec% xmllint article.sgml
/usr/doc/share/sgml/freebsd.ent:26: parser error : Entity value required
^
/usr/doc/share/sgml/freebsd.ent:26: parser error : Space required before 'NDATA'
^
/usr/doc/share/sgml/freebsd.ent:26: parser error : xmlParseEntityDecl: entity rel.current not terminated
^
[ ... ]

I can edit freebsd.ent to use the " syntax, or else remove the CDATA
declaration entirely, which gives me:

Entity: line 5: parser error : Entity 'trade' not defined
designations have been followed by the or the
^
Entity: line 6: parser error : Entity 'reg' not defined
® symbol.
^
Entity: line 6: parser error : chunk is not well balanced
® symbol.
^
article.sgml:33: parser error : chunk is not well balanced
&tm-attrib.general;
^
article.sgml:210: parser error : Entity 'prompt.root' not defined
&prompt.root; sysctl net.link.ether.bridge.config=fxp0:0,
^
article.sgml:211: parser error : Entity 'prompt.root' not defined
&prompt.root; sysctl net.link.ether.bridge.ipfw=1
^
article.sgml:212: parser error : Entity 'prompt.root' not defined
&prompt.root; sysctl net.link.ether.bridge.enable=1If you have &os; 5.1-RELEASE or previous the sysctl variables
^

> You'll want to generate a catalog that looks like this:

[ ...thanks for the example, which I will investigate further... ]

This has been interesting, but it's demonstrably non-trivial to convert SGML
docbook articles into XML.  More specifically, I don't see how to do so for a
particular article without making non-local changes to the .ent files
referenced by the article in order to make the XML version work at all, and
I don't see how to make both nsgmls and xmllint happy at the same time.

Are these conclusions valid, or am I wrong?  :-)

-- 
-Chuck

PS: The problem I want to solve is simply that I want the DocBook system to
output valid XHTML according to the W3C validator tool.
I'm willing to accept that using xmllint on an XML source document to get XHTML content is probably more straightforward than using nsgmls+tidy on an SGML source document, but that's not very useful if the conversion to XML breaks existing SGML documents until they also are converted to XML...