Date:      Mon, 23 Jan 2012 12:39:25 -0700 (MST)
From:      Warren Block <wblock@wonkity.com>
To:        Gabor Kovesdan <gabor@FreeBSD.org>
Cc:        freebsd-doc@FreeBSD.org
Subject:   Re: Tidy and HTML tab spacing
Message-ID:  <alpine.BSF.2.00.1201231145380.90760@wonkity.com>
In-Reply-To: <4F1D93E0.2050709@FreeBSD.org>
References:  <alpine.BSF.2.00.1201181255210.39534@wonkity.com> <alpine.BSF.2.00.1201181520140.40712@wonkity.com> <4F1B4767.5070105@FreeBSD.org> <alpine.BSF.2.00.1201211648030.72083@wonkity.com> <4F1D93E0.2050709@FreeBSD.org>

  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.

---902635197-2098338272-1327347565=:90760
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed

On Mon, 23 Jan 2012, Gabor Kovesdan wrote:

> On 2012.01.22. 1:30, Warren Block wrote:
>> On Sun, 22 Jan 2012, Gabor Kovesdan wrote:
>> 
>>> On 2012.01.18. 23:49, Warren Block wrote:
>>>> 5. Don't tidy HTML files at all (suggested as an option by Benedict
>>>>    Reuschling).  The unprocessed HTML is ugly, but few people are going
>>>>    to look at it directly.  Files that haven't been through tidy are a
>>>>    little larger, about 4% in the case of the Porter's Handbook. 
>>> I also think tidy should be removed. As hrs wrote, new standards should be 
>>> evaluated and probably they are much better. (I think they are.) If there 
>>> are some nits, then we should process it with a custom script or 
>>> something, instead of this crapware.
>> 
>> Tidy does a lot; it would be a lot of work to recreate. 
> Tidy is also the reason that our webpages are not valid HTML.

A new version of Tidy is supposed to be out soonish.  Whether it will 
solve the problems, I don't know.

What about lxml?  It is available in ports (devel/py-lxml) and is reputed 
to be good at parsing problem HTML and producing good XHTML.  A quick test 
showed that it seems to do okay with <pre> elements.
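
For anyone who wants to repeat that kind of quick test, here is a minimal sketch.  It is not the attached script: the sample markup and the variable names are made up for illustration, and it assumes lxml is installed and run under Python 3.

```python
from lxml import etree

# Hypothetical sample: unclosed <p> elements plus a <pre> whose
# whitespace has to survive untouched.
broken_html = "<html><body><p>one<p>two<pre>  a\n    b</pre></body></html>"

# etree.HTML() uses the forgiving HTML parser and returns the root element.
root = etree.HTML(broken_html)

# Serialize with method="xml" so every element is explicitly closed.
xhtml = etree.tostring(root, pretty_print=True, method="xml",
                       encoding="unicode")

# If the output is well-formed, the strict XML parser accepts it.
reparsed = etree.fromstring(xhtml)
paragraphs = reparsed.findall(".//p")
pre_text = reparsed.find(".//pre").text

print(xhtml)
```

Reparsing with etree.fromstring() only shows that the output is well-formed XML.  Whether it is valid XHTML is still a question for the W3C validator.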

A quick script to generate a test file is attached.  The W3C validator 
says the lxml version of the Porter's Handbook has eight errors, versus 
the six errors and five warnings of the Tidy version.  (The ugly special 
case on line 12 of the script, which rewrites compact="COMPACT" as 
compact="compact", drops the lxml version to five errors.)
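
The special case could be generalized to other minimized attributes.  The following is only a sketch of that idea, not part of the attached script: the attribute list is an illustrative subset of the HTML 4 boolean attributes, and the names BOOLEAN_ATTRS and normalize_boolean_attrs are hypothetical.

```python
import re

# Illustrative subset of HTML 4 boolean attributes (an assumption,
# not taken from the attached script).
BOOLEAN_ATTRS = ("compact", "checked", "disabled", "selected", "nowrap")

# Match name="NAME" in any mix of case; with re.IGNORECASE the
# backreference \1 also compares case-insensitively.
_BOOL_ATTR = re.compile(
    r'\b(%s)="\1"' % "|".join(BOOLEAN_ATTRS),
    re.IGNORECASE,
)

def normalize_boolean_attrs(xhtml):
    """Rewrite e.g. compact="COMPACT" as compact="compact"."""
    return _BOOL_ATTR.sub(
        lambda m: '{0}="{0}"'.format(m.group(1).lower()),
        xhtml,
    )

print(normalize_boolean_attrs('<dl compact="COMPACT">'))
```

Like the plain string replace, this works on the serialized text, so it would also rewrite a matching string that happens to appear in element content rather than in a tag.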
---902635197-2098338272-1327347565=:90760
Content-Type: TEXT/PLAIN; charset=US-ASCII; name=tester.py
Content-Transfer-Encoding: 7BIT
Content-ID: <alpine.BSF.2.00.1201231239250.90760@wonkity.com>
Content-Description: 
Content-Disposition: attachment; filename=tester.py

#!/usr/bin/env python

from lxml import etree
import re

inhtml = open('book.html', 'r').read()

tree = etree.HTML(inhtml.replace('\r', ''))
outxhtml = '\n'.join([ etree.tostring(stree, pretty_print=True, method="xml")
		for stree in tree ])

outxhtml = outxhtml.replace('compact="COMPACT"', 'compact="compact"')

f = open('lxml.html', 'w')
f.write('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n')
f.write('<html xmlns="http://www.w3.org/1999/xhtml">\n')
f.write(outxhtml)
f.write('</html>\n')
f.close()

---902635197-2098338272-1327347565=:90760--


