Date: Mon, 30 Aug 2004 20:25:18 -0500
From: Jacques Vidrine <nectar@FreeBSD.org>
To: Jun Kuriyama <kuriyama@imgsrc.co.jp>
Cc: freebsd-vuxml@freebsd.org
Subject: Re: vuln.xml *is* XML (was Re: vuln.xml is not XML)
Message-ID: <9E499E76-FAEC-11D8-84D2-000A95BC6FAE@FreeBSD.org>
In-Reply-To: <7mk6vg2m15.wl@black.imgsrc.co.jp>
References: <20040830133416.X35009@xeon.unixathome.org> <CDBCA21A-FAE2-11D8-A99F-000A95BC6FAE@celabo.org> <7mk6vg2m15.wl@black.imgsrc.co.jp>

On Aug 30, 2004, at 7:23 PM, Jun Kuriyama wrote:

> Both are correct.  In good old XML world, we should use CDATA section
> to quote external markup.  On the other hand, VuXML lives in XML +
> Namespace world (see related recommendations).

If you want to quote external markup as *text*, then sure: CDATA is one way of doing that (character and entity references are another). In that case it is just *text*, not markup: it looks like markup, but it isn't as far as XML processors are concerned. But if you want to do something with that markup (e.g. validation, XSLT), then you really must use real XML and namespaces.

I guess you are probably bringing this up from the perspective of DocBook, but it just happens that DocBook (like some other XML applications such as XML-RPC and RSS) was born before namespaces and has not adopted support for them (yet). So we're left with the CDATA workaround that we had to use with SGML. This should never be done in new XML applications. It is finally being addressed in some versions of DocBook (e.g. DocBook 4.3 + SVG).

>> I saw your earlier message about XML::Node, but since I am not
>> familiar with that (or XML::Parser), I did not understand what
>> problem you were having.  Could you try to describe it differently?
>
> I'm not sure XML::Parser can handle namespace correctly.  If it cannot
> do such, parser will confuse when it reads markups with namespace.

I don't believe that is correct. Tools that do not grok namespaces will just not see the namespaces. They will still parse the content just fine. Since we use default namespace declarations by convention in vuln.xml, it is particularly unobtrusive: a parser will just see "xmlns" attribute nodes, but otherwise continue just fine.
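
One way to check this claim is to drive expat directly in both modes. What follows is only a minimal sketch: it assumes Python's xml.parsers.expat bindings rather than the Perl modules discussed in this thread, and DOC is a made-up fragment in the style of vuln.xml, with a placeholder cite URL.

```python
import xml.parsers.expat

# Hypothetical fragment in the style of vuln.xml (not copied from the real file).
DOC = (
    '<description xmlns="http://www.vuxml.org/app/vuxml-1/">'
    '<body xmlns="http://www.w3.org/1999/xhtml">'
    '<blockquote cite="http://example.org/advisory">text</blockquote>'
    '</body>'
    '</description>'
)

def start_events(namespace_aware):
    """Return the (name, attributes) pairs reported for each start tag."""
    if namespace_aware:
        # With a separator, expat reports element names as "URI<sep>localname"
        # and no longer passes the xmlns declarations along as attributes.
        parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    else:
        # Without one, names are reported as written and xmlns is an
        # ordinary attribute.
        parser = xml.parsers.expat.ParserCreate()
    events = []
    parser.StartElementHandler = (
        lambda name, attrs: events.append((name, dict(attrs)))
    )
    parser.Parse(DOC, True)
    return events

for aware in (True, False):
    print("namespace-aware:" if aware else "no namespace support:")
    for name, attrs in start_events(aware):
        print("  start element", name, attrs)
```

Because this fragment declares the VuXML namespace on <description> itself, the non-namespace run also reports an xmlns attribute there; in both runs the document parses without error.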

Basically, a namespace-aware processor will see events like these:

    start element (http://www.vuxml.org/app/vuxml-1/, description)
        attributes []
    start element (http://www.w3.org/1999/xhtml, body)
        attributes []
    start element (http://www.w3.org/1999/xhtml, blockquote)
        attributes [(cite, "http://...")]
    ...
    end element (http://www.w3.org/1999/xhtml, blockquote)
    end element (http://www.w3.org/1999/xhtml, body)
    end element (http://www.vuxml.org/app/vuxml-1/, description)

while an old XML processor with no support for namespaces will see events like these:

    start element description
        attributes []
    start element body
        attributes [(xmlns, "http://www.w3.org/1999/xhtml")]
    start element blockquote
        attributes [(cite, "http://...")]
    ...
    end element blockquote
    end element body
    end element description

You can even ignore the namespaces if you like. You just need to "remember" when you are processing stuff inside a <description> element versus not.

AFAIK, XML::Node is based on XML::Parser which is based on expat. expat supports namespaces perfectly well, so it is surprising if the Perl modules built on top of it do not.

Cheers,
-- 
Jacques A Vidrine / NTT/Verio
nectar@celabo.org / jvidrine@verio.net / nectar@freebsd.org
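
The "remember when you are inside <description>" approach mentioned above could look roughly like the sketch below. It again assumes Python's expat bindings instead of XML::Parser, and both the sample document and the helper name text_inside_description are hypothetical, chosen only for illustration. Namespaces are ignored entirely; a depth counter records whether the parser is currently somewhere under a <description> element.

```python
import xml.parsers.expat

# Hypothetical input: <p> appears twice, but only once under <description>.
DOC = (
    '<vuln>'
    '<topic><p>title text</p></topic>'
    '<description>'
    '<body xmlns="http://www.w3.org/1999/xhtml"><p>quoted advisory</p></body>'
    '</description>'
    '</vuln>'
)

def text_inside_description(document):
    """Collect character data that occurs anywhere under <description>."""
    depth = 0      # greater than 0 while inside a <description> element
    chunks = []

    def start(name, attrs):
        nonlocal depth
        # Begin counting at <description>, then count every nested element.
        if name == "description" or depth:
            depth += 1

    def end(name):
        nonlocal depth
        if depth:
            depth -= 1

    def chars(data):
        if depth:
            chunks.append(data)

    parser = xml.parsers.expat.ParserCreate()  # no namespace processing
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(document, True)
    return "".join(chunks)

print(text_inside_description(DOC))
```

The xmlns attribute on <body> is simply passed to the start handler and ignored, so only the text under <description> is collected.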