Date:      Thu, 12 Sep 1996 19:25:40 -0500 (EST)
From:      John Fieber <jfieber@indiana.edu>
To:        Darren Davis <darrend@novell.com>
Cc:        doc@freebsd.org
Subject:   Re: SGML and HTML grammar verification.
Message-ID:  <Pine.BSI.3.95.960912190900.1328C-100000@fallout.campusview.indiana.edu>
In-Reply-To: <s2383d73.007@novell.com>

On Thu, 12 Sep 1996, Darren Davis wrote:

> What tools do you use to validate your SGML or HTML grammar (pages)?  I
> see that there is weblint in the ports area, but it does not  seem as strong
> as say the validator from University of Utah (ftp://ftp.math.utah.edu/pub/sgml).
>  I have seen some tools out on the net that will peruse an HTML web structure
> and validate the grammar as well as the links.  Any thoughts?

The One True Way to validate the markup of a web page (or any
SGML) is with a validating SGML parser and a DTD.  FreeBSD
includes the parser (sgmls) and HTML DTDs are available from
http://www.w3.org/pub/WWW/MarkUp/. 

HOWEVER, passing validation doesn't mean a browser will be happy
with the page.  In particular, SGML (and hence HTML) offers a
bunch of markup minimization techniques that are technically
correct but will confuse most browsers.  I've found the various
forms of attribute minimization to be particularly problematic.

A good way to combat this problem is with an SGML normalizer,
which expands all minimized markup to its fully explicit form.
For example,

 <title/This is the title/
 <h1 center/This is the title/

would come out of the normalizer like:

 <HTML>
 <HEAD>
 <TITLE>This is the title</TITLE>
 </HEAD>
 <BODY><H1 ALIGN="CENTER">This is the title</H1></BODY>
 </HTML>

I assure you that many more browsers will correctly render the
normalized version than the un-normalized one, even though the
two are structurally identical!
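
To make "expanding minimization" concrete, here is a toy Python
sketch that handles only the null end tag short form used above,
i.e. <name/content/.  It is an illustration, not an SGML parser:
the function name and the regular expression are made up for this
example, it knows nothing about the DTD, and so it cannot infer
omitted tags (HTML, HEAD, BODY) or expand a bare attribute value
such as "center" into ALIGN="CENTER".  A real normalizer does all
of that by consulting the DTD.  The pattern assumes the document
uses the "/" short form only in the way shown here.

```python
import re

# Matches the null end tag (NET) short form: <name/content/
# Toy pattern: a bare element name (no attributes) and content
# that contains neither "/" nor "<".
NET_FORM = re.compile(r"<([A-Za-z][A-Za-z0-9]*)/([^/<]*)/")


def expand_net(text):
    """Rewrite every <name/content/ as <NAME>content</NAME>."""
    def repl(match):
        name = match.group(1).upper()
        return "<%s>%s</%s>" % (name, match.group(2), name)
    return NET_FORM.sub(repl, text)


print(expand_net("<title/This is the title/"))
```

Markup that already has explicit start and end tags passes
through unchanged, since the pattern requires a "/" right after
the element name.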

James Clark's excellent SP package (http://www.jclark.com/) 
includes an SGML normalizer.

For validating links, MOMspider is pretty good.  I don't have a
URL handy though.
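
For a sense of what a link checker does, here is a minimal
Python sketch.  It is not how MOMspider works internally; the
names LinkCollector, extract_links and is_reachable are invented
for this example, and it only looks at HREF and SRC attributes
on a single page.  It collects the links, resolves them against
the page's own URL, and can then probe each one with a HEAD
request.

```python
import http.client
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collect HREF and SRC values, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute URLs referenced by one HTML page."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.links


def is_reachable(url, timeout=10):
    """Return True if a HEAD request for url succeeds.

    Anything that fails, including non-HTTP schemes such as
    mailto:, is reported as unreachable.
    """
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as resp:
            return resp.status < 400
    except (OSError, ValueError, http.client.HTTPException):
        return False
```

A page would be checked by calling is_reachable() on each URL
returned by extract_links().  Some servers refuse HEAD requests,
so a real checker would likely fall back to GET before declaring
a link dead.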

-john

== jfieber@indiana.edu ===========================================
== http://fallout.campusview.indiana.edu/~jfieber ================



