Date: Fri, 25 Jan 2008 10:09:18 -0800
From: "Murray Stokely" <murray@stokely.org>
To: "Gábor Kövesdán" <gabor@freebsd.org>
Cc: doceng@freebsd.org, freebsd-doc@freebsd.org
Subject: Re: [PATCH] docproj port needs to use tidy-devel
Message-ID: <2a7894eb0801251009w27463cd4n3f0fbbc9e62938cc@mail.gmail.com>
In-Reply-To: <4799A266.2030900@FreeBSD.org>
References: <2a7894eb0801162124x76d7132y8de9f4a1d314d8aa@mail.gmail.com> <4799A266.2030900@FreeBSD.org>

On 1/25/08, Gábor Kövesdán <gabor@freebsd.org> wrote:
>
> First, sorry for the late answer. Not just the xhtml, but the html
> output of tidy is incorrect as well, it does not validate. (I think
> www/63552 is related, because without tidy, such errors don't appear.)
> But, the newer tidy versions completely mess up character sets. They
> mess the Hungarian characters set surely, but I suspect there are
> others, too. The only reason that we don't disable it in the Hungarian
> project is that builder has an ancient version, which works fine.
> Besides, different versions of tidy have different set of command line
> options, which makes our toolchain less portable.
> But anyway, why we do really need tidy? I made some tests before without
> tidy and the only thing that I had to do for generating valid pages was
> to reinplace-edit the DTD. As sgmlnorm outputs our custom DTD, the
> webpages were not valid, but after replacing them with HTML 4.1
> Transitional DTD, everything validated. I'd prefer see it go away.
> Yes, I know that one reason for tidy is the indenting and line breaking
> in HTML code, the output of sgmlnorm is not for human consumption. But
> cannot we do that in a simpler way?

xsltproc can output nice .html with line breaks and indentation. For
example I use this for the RSS feeds to make it nice and human readable
without going through tidy:

<xsl:output method="xml" indent="yes"/>

> One more idea, which came to my mind about this. Currently, our webpages
> are not uniform. We use HTML 4.1 for our pages generated from .sgml and
> XHTML 1.1 for .xsl output. What do you think about using XHTML 1.1
> uniformly? Obviously, sgmlnorm cannot do that, but there are advantages

Yea, that's a low priority could/should be done sort of item. I would focus
first on any pages that actually don't validate or where you want to add
some xml feature that can't currently be accomplished with the older sgml
based pages.
Updating old content / adding new content to the Handbook or something I
think would be even more useful if you have the time.

> As a result, I think it would be a good idea. Maybe it would be a good
> SoC project for me to polish the pages in this way as I'm interested, I
> want to learn more XML stuff and I want to participate in the upcoming
> SoC again. Another item would be to bring the doc repo to DocBook5 / XML.

Web projects like this I think aren't the main intent of the summer of
code program. We had one project in this area in 2005, and Emily did an
excellent job with it writing a LOT of xslt code for us and completely
redesigning the web site, but converting the remaining sgml to xml isn't
really a good fit with the summer of code program.

But by all means, please do convert any individual SGML files to XML if
that is where your interests lie.

- Murray
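
For reference, the xsl:output setting mentioned above for the RSS feeds can
be tried in isolation with a small stylesheet. This is only a minimal
sketch, not the stylesheet actually used for the feeds: it assumes an
identity transform is enough, and the xsl:strip-space line is an added
assumption so that existing whitespace does not get in the way of
re-indenting (it may not be wanted for mixed-content documents).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Ask the serializer to add line breaks and indentation. -->
  <xsl:output method="xml" indent="yes" encoding="UTF-8"/>

  <!-- Assumption: drop whitespace-only text nodes so the output can be
       re-indented freely. Remove this for mixed-content documents. -->
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy every node and attribute unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Saved as indent.xsl (a hypothetical file name), it could be run as
"xsltproc indent.xsl input.xml > output.xml", where input.xml and
output.xml are placeholder names as well.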