From owner-freebsd-vuxml@FreeBSD.ORG  Tue Aug 31 01:25:41 2004
Return-Path: <owner-freebsd-vuxml@FreeBSD.ORG>
Delivered-To: freebsd-vuxml@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 6A41916A4CE
	for <freebsd-vuxml@freebsd.org>; Tue, 31 Aug 2004 01:25:41 +0000 (GMT)
Received: from gw.celabo.org (gw.celabo.org [208.42.49.153])
	by mx1.FreeBSD.org (Postfix) with ESMTP id DEAEF43D45
	for <freebsd-vuxml@freebsd.org>; Tue, 31 Aug 2004 01:25:40 +0000 (GMT)
	(envelope-from nectar@FreeBSD.org)
Received: from localhost (localhost [127.0.0.1])
	by gw.celabo.org (Postfix) with ESMTP
	id 685E65487F; Mon, 30 Aug 2004 20:25:40 -0500 (CDT)
Received: from gw.celabo.org ([127.0.0.1])
 by localhost (hellblazer.celabo.org [127.0.0.1]) (amavisd-new, port 10024)
 with SMTP id 09437-02; Mon, 30 Aug 2004 20:25:29 -0500 (CDT)
Received: from [10.0.1.107] (lum.celabo.org [10.0.1.107])
	(using TLSv1 with cipher RC4-SHA (128/128 bits))
	(Client did not present a certificate)
	by gw.celabo.org (Postfix) with ESMTP
	id ECCC354861; Mon, 30 Aug 2004 20:25:28 -0500 (CDT)
In-Reply-To: <7mk6vg2m15.wl@black.imgsrc.co.jp>
References: <20040830133416.X35009@xeon.unixathome.org>
	<CDBCA21A-FAE2-11D8-A99F-000A95BC6FAE@celabo.org>
	<7mk6vg2m15.wl@black.imgsrc.co.jp>
Mime-Version: 1.0 (Apple Message framework v619)
Content-Type: text/plain; charset=US-ASCII; format=flowed
Message-Id: <9E499E76-FAEC-11D8-84D2-000A95BC6FAE@FreeBSD.org>
Content-Transfer-Encoding: 7bit
From: Jacques Vidrine <nectar@FreeBSD.org>
Date: Mon, 30 Aug 2004 20:25:18 -0500
To: Jun Kuriyama <kuriyama@imgsrc.co.jp>
X-Mailer: Apple Mail (2.619)
cc: freebsd-vuxml@freebsd.org
Subject: Re: vuln.xml *is* XML (was Re: vuln.xml is not XML)
X-BeenThere: freebsd-vuxml@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Documenting security issues in VuXML <freebsd-vuxml.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-vuxml>,
	<mailto:freebsd-vuxml-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-vuxml>
List-Post: <mailto:freebsd-vuxml@freebsd.org>
List-Help: <mailto:freebsd-vuxml-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-vuxml>,
	<mailto:freebsd-vuxml-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 31 Aug 2004 01:25:41 -0000


On Aug 30, 2004, at 7:23 PM, Jun Kuriyama wrote:
> Both are correct.  In good old XML world, we should use CDATA section
> to quote external markup.  On the other hand, VuXML lives in XML +
> Namespace world (see related recommendations).

If you want to quote external markup as *text*, then sure: CDATA is one 
way of doing that (character and entity references are another).  In 
that case, it is just *text*, not markup: it looks like markup, but it 
isn't as far as XML processors are concerned.  But if you want to do 
something with that markup (e.g. validation, XSLT), then you really must 
use real XML and namespaces.
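
To make the difference concrete, here is a minimal sketch using Python's 
standard xml.etree.ElementTree module.  The two sample documents and the 
variable names are made up for illustration; they are not taken from 
vuln.xml.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# Hypothetical sample: the quoted markup is hidden inside a CDATA section.
as_text = "<description><![CDATA[<p>Quoted <b>text</b></p>]]></description>"

# Hypothetical sample: the same content as real, namespaced XML.
as_markup = (
    '<description><body xmlns="%s">'
    "<p>Quoted <b>text</b></p>"
    "</body></description>" % XHTML
)

text_root = ET.fromstring(as_text)
markup_root = ET.fromstring(as_markup)

# With CDATA the processor sees one string and no child elements.
cdata_children = len(list(text_root))
cdata_text = text_root.text

# With real markup the elements exist and can be selected by namespace.
bold = markup_root.find(".//{%s}b" % XHTML)

print(cdata_children, repr(cdata_text))
print(bold.tag, repr(bold.text))
```

In the first case there is nothing for a validator or an XSLT stylesheet 
to match against; in the second, the <b> element can be addressed like 
any other node.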

I guess you are probably bringing this up from the perspective of 
DocBook, but it just happens that DocBook (like some other XML 
applications such as XML-RPC and RSS) was born before namespaces and 
has not adopted them yet.  So we're left with the CDATA workaround 
that we had to use with SGML.  This should never be done in new XML 
applications.  It is finally being addressed in some versions of 
DocBook (e.g. DocBook 4.3 + SVG).

>> I saw your earlier message about XML::Node, but since I am not
>> familiar with that (or XML::Parser), I did not understand what
>> problem you were having.  Could you try to describe it differently?
>
> I'm not sure XML::Parser can handle namespace correctly.  If it cannot
> do such, parser will confuse when it reads markups with namespace.

I don't believe that is correct.  Tools that do not grok namespaces 
will just not see the namespaces.  They will still parse the content 
just fine.  Since we use default namespace declarations by convention 
in vuln.xml, it is particularly unobtrusive:  such a parser will just 
see "xmlns" attribute nodes, but otherwise continue just fine.

Basically, a namespace-aware processor will see events like these:

     start element (http://www.vuxml.org/app/vuxml-1/, description)
           attributes []
     start element (http://www.w3.org/1999/xhtml, body)
           attributes []
     start element (http://www.w3.org/1999/xhtml, blockquote)
           attributes [(cite, "http://...")]
       ...
     end element (http://www.w3.org/1999/xhtml, blockquote)
     end element (http://www.w3.org/1999/xhtml, body)
     end element (http://www.vuxml.org/app/vuxml-1/, description)

while an old XML processor with no support for namespaces will see 
events like these:

     start element description
           attributes []
     start element body
           attributes [(xmlns, "http://www.w3.org/1999/xhtml")]
     start element blockquote
           attributes [(cite, "http://...")]
       ...
     end element blockquote
     end element body
     end element description
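
Both views can be reproduced with a short sketch built on Python's 
standard xml.parsers.expat binding.  The sample document, the 
collect_events helper, and the example.invalid URL are hypothetical; the 
VuXML namespace string is simply the one shown in the listing above.

```python
import xml.parsers.expat

# Hypothetical sample; the default namespace is declared on the root.
DOC = (
    '<vuxml xmlns="http://www.vuxml.org/app/vuxml-1/">'
    "<description>"
    '<body xmlns="http://www.w3.org/1999/xhtml">'
    '<blockquote cite="http://example.invalid/">text</blockquote>'
    "</body>"
    "</description>"
    "</vuxml>"
)


def collect_events(namespace_aware):
    """Parse DOC and return the start/end element events that were seen."""
    events = []
    # Passing a separator turns on expat's namespace processing; element
    # names are then reported as "<namespace URI><separator><local name>".
    separator = " " if namespace_aware else None
    parser = xml.parsers.expat.ParserCreate(namespace_separator=separator)
    parser.StartElementHandler = (
        lambda name, attrs: events.append(("start", name, dict(attrs)))
    )
    parser.EndElementHandler = lambda name: events.append(("end", name))
    parser.Parse(DOC, True)
    return events


aware = collect_events(True)
plain = collect_events(False)

for event in aware:
    print("namespace-aware:", event)
for event in plain:
    print("plain:          ", event)
```

With namespace processing on, the xmlns declarations are consumed by the 
parser and do not show up as attributes.  With it off, they are reported 
as ordinary attributes and the element names are left bare.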

You can even ignore the namespaces if you like.  You just need to 
"remember" when you are processing stuff inside a <description> element 
versus not.
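
A rough sketch of that "remembering", again in Python with the standard 
xml.parsers.expat module and namespace processing left off.  The function 
name and the sample document are invented for illustration.

```python
import xml.parsers.expat


def names_inside_description(document):
    """Return the element names that occur inside <description> elements."""
    depth = 0   # greater than zero while we are inside a <description>
    names = []

    def start(name, attrs):
        nonlocal depth
        if depth > 0:
            names.append(name)      # embedded (XHTML) content
        if name == "description":
            depth += 1

    def end(name):
        nonlocal depth
        if name == "description":
            depth -= 1

    # No namespace_separator: the parser ignores namespaces entirely.
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(document, True)
    return names


# Hypothetical sample document.
SAMPLE = (
    '<vuxml xmlns="http://www.vuxml.org/app/vuxml-1/">'
    "<vuln><topic>example</topic>"
    '<description><body xmlns="http://www.w3.org/1999/xhtml">'
    "<p>details</p>"
    "</body></description>"
    "</vuln></vuxml>"
)

inside = names_inside_description(SAMPLE)
print(inside)
```

The counter is the only state needed: everything seen while it is 
non-zero is embedded content, and everything else belongs to the 
surrounding vocabulary.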

AFAIK, XML::Node is based on XML::Parser, which is based on expat.  
expat supports namespaces perfectly well, so it would be surprising if 
the Perl modules built on top of it did not.

Cheers,
-- 
Jacques A Vidrine / NTT/Verio
nectar@celabo.org / jvidrine@verio.net / nectar@freebsd.org