Date: Fri, 25 Jan 2008 09:48:38 +0100
From: Gábor Kövesdán <gabor@FreeBSD.org>
To: Murray Stokely <murray@stokely.org>
Cc: doceng@freebsd.org, freebsd-doc@freebsd.org
Subject: Re: [PATCH] docproj port needs to use tidy-devel
Message-ID: <4799A266.2030900@FreeBSD.org>
In-Reply-To: <2a7894eb0801162124x76d7132y8de9f4a1d314d8aa@mail.gmail.com>
References: <2a7894eb0801162124x76d7132y8de9f4a1d314d8aa@mail.gmail.com>

Murray Stokely wrote:
> Is there any reason not to update the docproj port to use tidy-devel rather
> than tidy? The released version of tidy is nearly 8 years old and produces
> xhtml that doesn't validate. The newer -devel releases produce more correct
> xhtml.

First, sorry for the late answer.

It is not just the XHTML: the HTML output of tidy is incorrect as well and does not validate. (I think www/63552 is related, because such errors don't appear without tidy.) However, the newer tidy versions completely mess up character sets. They certainly break the Hungarian character set, and I suspect others are affected, too. The only reason we don't disable tidy in the Hungarian project is that the builder has an ancient version, which works fine. Besides, different versions of tidy accept different sets of command-line options, which makes our toolchain less portable.

But anyway, why do we really need tidy? I ran some tests without tidy earlier, and the only thing I had to do to generate valid pages was to edit the DTD in place. Because sgmlnorm outputs our custom DTD, the web pages were not valid, but after replacing it with the HTML 4.01 Transitional DTD, everything validated. I'd prefer to see tidy go away. Yes, I know that one reason for tidy is the indentation and line breaking of the HTML code, since the output of sgmlnorm is not meant for human consumption. But can't we do that in a simpler way?

One more idea came to my mind about this. Currently, our web pages are not uniform. We use HTML 4.01 for the pages generated from .sgml and XHTML 1.1 for .xsl output. What do you think about using XHTML 1.1 uniformly? Obviously, sgmlnorm cannot do that, but there are advantages in using XML-based technologies. Well, I'm just an enthusiastic newbie when it comes to XML, but I think it would make sharing data between our pages easier.

Plus, we could simplify our infrastructure, since we would only need the XML tools for building web pages and a single DTD. There would be no more conditional cases in .ent files, like this one in header.ent:

    <![ %xml.features; [
    <!ENTITY header1.meta '
      <meta http-equiv="Content-Type" content="text/html; charset=&xml.encoding;" />
      <meta name="MSSmartTagsPreventParsing" content="TRUE" />
    '>
    ]]>
    <!ENTITY header1.meta '
      <meta http-equiv="Content-Type" content="text/html; charset=&xml.encoding;">
      <meta name="MSSmartTagsPreventParsing" content="TRUE">
    '>

Also, XHTML is easier to validate, and it is stricter without being more difficult to edit. It is also supposed to obsolete HTML (although with the HTML5 draft this is no longer so certain, but that does not affect the advantages discussed here), and it is a newer standard to conform to. As a result, I think it would be a good idea.

Maybe polishing the pages in this way would be a good SoC project for me, as I'm interested, I want to learn more about XML, and I want to participate in the upcoming SoC again. Another item would be to bring the doc repo to DocBook5 / XML.

If this whole XML topic has been discussed before, please forgive me, I missed that.

Regards,

-- 
Gabor Kovesdan
EMAIL: gabor@FreeBSD.org
WWW: http://www.kovesdan.org
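
For illustration only, the DTD replacement step mentioned in the message (swapping the custom DTD emitted by sgmlnorm for the HTML 4.01 Transitional one) might look roughly like the sketch below. This is not the script used in the tests described above; the function name replace_doctype and the sample custom identifier are hypothetical, and the sketch assumes the DOCTYPE declaration has no internal subset.

```python
import re

# Standard DOCTYPE declaration for HTML 4.01 Transitional.
HTML401_TRANSITIONAL = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
    '"http://www.w3.org/TR/html4/loose.dtd">'
)

# Matches a DOCTYPE declaration up to its closing ">".
# Assumption: the declaration contains no internal subset ("[ ... ]").
_DOCTYPE_RE = re.compile(r"<!DOCTYPE[^>]*>", re.IGNORECASE)


def replace_doctype(html: str, doctype: str = HTML401_TRANSITIONAL) -> str:
    """Replace the first DOCTYPE in html, or prepend one if none is present."""
    if _DOCTYPE_RE.search(html):
        # A callable replacement avoids backslash processing in the new text.
        return _DOCTYPE_RE.sub(lambda _match: doctype, html, count=1)
    return doctype + "\n" + html


if __name__ == "__main__":
    # Hypothetical sample standing in for sgmlnorm output with a custom DTD.
    sample = (
        '<!DOCTYPE HTML PUBLIC "-//Example//DTD Custom HTML//EN">\n'
        "<html><head><title>t</title></head><body></body></html>"
    )
    print(replace_doctype(sample))
```

A step like this only changes the declaration; whether the page then validates still depends on the markup itself conforming to the declared DTD.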