From owner-freebsd-doc Thu Sep 12 17:25:46 1996
Return-Path: owner-doc
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id RAA15200 for doc-outgoing; Thu, 12 Sep 1996 17:25:46 -0700 (PDT)
Received: from fallout.campusview.indiana.edu (fallout.campusview.indiana.edu [149.159.1.1]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id RAA15193 for ; Thu, 12 Sep 1996 17:25:43 -0700 (PDT)
Received: from localhost (jfieber@localhost) by fallout.campusview.indiana.edu (8.7.5/8.7.3) with SMTP id TAA01454; Thu, 12 Sep 1996 19:25:40 -0500 (EST)
Date: Thu, 12 Sep 1996 19:25:40 -0500 (EST)
From: John Fieber
To: Darren Davis
cc: doc@freebsd.org
Subject: Re: SGML and HTML grammar verification.
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-doc@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

On Thu, 12 Sep 1996, Darren Davis wrote:

> What tools do you use to validate your SGML or HTML grammar (pages)?  I
> see that there is weblint in the ports area, but it does not seem as strong
> as say the validator from University of Utah (ftp://ftp.math.utah.edu/pub/sgml).
> I have seen some tools out on the net that will peruse an HTML web structure
> and validate the grammar as well as the links.  Any thoughts?

The One True Way to validate the markup of a web page (or any SGML)
is with a validating SGML parser and a DTD.  FreeBSD includes the
parser (sgmls) and HTML DTDs are available from
http://www.w3.org/pub/WWW/MarkUp/.

HOWEVER, just because a page passes validation, doesn't mean a
browser will be happy with it.  In particular, SGML (and HTML) offer
a bunch of markup minimization techniques that are technically
correct, but will confuse most browsers.  I've found various forms
of attribute minimization to be particularly problematic.

A good way to combat this problem is with an SGML normalizer which
expands all markup minimization to its fully qualified form.  For
example,

<HEAD>
<TITLE>This is the title

This is the title

I assure you that many more browsers will correctly render the
normalized version than the un-normalized, even though both are
structurally identical!

James Clark's excellent SP package (http://www.jclark.com/) includes
an SGML normalizer.

For validating links, MOMspider is pretty good.  I don't have a URL
handy though.

-john

== jfieber@indiana.edu ===========================================
== http://fallout.campusview.indiana.edu/~jfieber ================
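
To make the attribute minimization point above a little more concrete, here is a rough Python sketch of what expanding minimized attributes looks like. It is only an illustration, not a real SGML normalizer: it uses Python's standard html.parser rather than SP or a DTD, the names AttributeNormalizer and normalize are made up for this example, and it neither validates anything nor supplies omitted end tags. Upper-casing tag and attribute names is a stylistic choice here, not a requirement.

```python
from html.parser import HTMLParser


class AttributeNormalizer(HTMLParser):
    """Re-emit HTML with every attribute written as NAME="value".

    Hypothetical helper for illustration only; it does not consult a DTD.
    """

    def __init__(self):
        # convert_charrefs=False so entity references in text are passed
        # to handle_entityref/handle_charref and can be re-emitted as-is.
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        parts = [tag.upper()]
        for name, value in attrs:
            if value is None:
                # Minimized attribute such as CHECKED: the full form
                # repeats the name as the value (CHECKED="CHECKED").
                value = name.upper()
            # The parser hands us unescaped values, so escape them again.
            value = value.replace("&", "&amp;").replace('"', "&quot;")
            parts.append(f'{name.upper()}="{value}"')
        self.out.append("<" + " ".join(parts) + ">")

    def handle_startendtag(self, tag, attrs):
        # Treat <BR/> like <BR>; no end tag is invented.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        self.out.append(f"</{tag.upper()}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def normalize(markup):
    """Return markup with attributes fully spelled out and quoted."""
    parser = AttributeNormalizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(normalize("<input type=checkbox checked>"))
    print(normalize("<a href=foo.html>A &amp; B</a>"))
```

A real normalizer works from the DTD, so it can also restore omitted start and end tags and default attribute values; this sketch only touches the attribute syntax that was actually written in the input.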