Date:      Thu, 19 Jan 2012 15:57:36 +0900 (JST)
From:      Hiroki Sato <hrs@FreeBSD.org>
To:        wblock@wonkity.com
Cc:        freebsd-doc@FreeBSD.org
Subject:   Re: Tidy and HTML tab spacing
Message-ID:  <20120119.155736.1127622096127250170.hrs@allbsd.org>
In-Reply-To: <alpine.BSF.2.00.1201181748230.42380@wonkity.com>
References:  <alpine.BSF.2.00.1201181520140.40712@wonkity.com> <20120119.084434.926306642968660094.hrs@allbsd.org> <alpine.BSF.2.00.1201181748230.42380@wonkity.com>


Warren Block <wblock@wonkity.com> wrote
  in <alpine.BSF.2.00.1201181748230.42380@wonkity.com>:

wb> > I think this will break the results because a newline just after ">"
wb> > is recognized as CDATA.
wb>
wb> A test run on the Porter's Handbook did seem to work:
wb>   make -C /usr/ports/www/tidy-devel deinstall
wb>   make clean book.html
wb>   perl -0777 -i -pe
wb>   's/CLASS="PROGRAMLISTING"\n\>/CLASS="PROGRAMLISTING"\>\n/g' book.html
wb>   make -C /usr/ports/www/tidy-devel install clean
wb>   tidy -wrap 90 -m -raw -preserve -f /dev/null -asxml  book.html

 Yes, but this only hides the issue, because Tidy calculates columns
 from the literal markup text, not from the result text.
 For example, in the following line

>[tab]foo

 Tidy expands [tab] to spaces based on the length of ">[tab]foo",
 regardless of the fact that ">" is not a character in the result
 text.  So, if we convert this into two lines like the following:

>
[tab]foo

 the expansion of the [tab] will be correct.  However, this trick does
 not always work as intended.  One problem is that a \n just after ">"
 is a literal newline in <pre>; it is not automatically ignored.  So,
 every <programlisting> will have an empty line at the top.  Another
 is that the trick works only for this particular case.  For example:

<programlisting>foo[tab]<emphasis>bar</emphasis>[tab]baz
foo[tab]bar[tab]baz</programlisting>

 The <emphasis> will be converted to <span> in the HTML output, and the
 two lines of "foo bar baz" will not be aligned because Tidy counts the
 <span> tag in the tab expansion.  This cannot be solved by converting
 "\n>" to ">\n" at the end of a <pre> tag in the HTML output.
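
 The misalignment can be reproduced with a small model.  This is only
 a sketch of the behavior described above, not Tidy's actual code: the
 tab width of 8 and the function names are assumptions made for
 illustration.  One function advances the column for every character,
 tags included; the other advances it only for characters outside tags.

```python
import re

TAB = 8  # assumed tab width, for illustration only

TAG = re.compile(r"(<[^>]*>)")


def expand_counting_markup(line, tab=TAB):
    """Expand tabs while every character, tags included, advances the column."""
    return line.expandtabs(tab)


def expand_counting_text(line, tab=TAB):
    """Expand tabs while only characters outside <...> tags advance the column."""
    out = []
    col = 0
    # re.split with a capturing group puts the tags at the odd indices.
    for i, part in enumerate(TAG.split(line)):
        if i % 2 == 1:
            out.append(part)  # a tag is copied through but occupies no column
            continue
        for ch in part:
            if ch == "\t":
                pad = tab - col % tab
                out.append(" " * pad)
                col += pad
            else:
                out.append(ch)
                col += 1
    return "".join(out)


def strip_tags(markup):
    """Rough stand-in for the text a browser would show inside <pre>."""
    return TAG.sub("", markup)


line1 = "foo\t<span>bar</span>\tbaz"
line2 = "foo\tbar\tbaz"

print(strip_tags(expand_counting_markup(line1)))  # tags counted
print(strip_tags(expand_counting_text(line1)))    # tags not counted
print(line2.expandtabs(TAB))                      # the plain second line
```

 Under these assumptions, counting the <span> tags puts "baz" at
 column 19 on the first line, while the plain second line has it at
 column 16.  Skipping the tags while counting restores the alignment.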

 It is difficult to solve this issue completely because the result
 text can be obtained only by a complete HTML processor such as a web
 browser.  I don't have a good solution, but I think it would not be a
 bad idea to modify Tidy so that it keeps the tab character (or
 replaces it with &#09;) in the result text and leaves the processing
 to web browsers.
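
 The substitution itself is simple.  The following is a minimal sketch
 of replacing tabs with &#09; inside <pre> elements as a separate text
 filter; it is not a Tidy patch, and the function name and the regular
 expression are assumptions.  Whether Tidy would pass the character
 reference through unchanged has not been tested here.

```python
import html
import re

# Matches a <pre ...> ... </pre> element; case-insensitive because the
# generated HTML in this thread uses upper-case attribute names.
PRE_BLOCK = re.compile(r"(<pre\b[^>]*>)(.*?)(</pre>)", re.IGNORECASE | re.DOTALL)


def protect_tabs(markup):
    """Replace literal tabs inside <pre> elements with the reference &#09;."""
    def repl(match):
        start, body, end = match.groups()
        return start + body.replace("\t", "&#09;") + end
    return PRE_BLOCK.sub(repl, markup)


sample = '<p>a\tb</p><PRE CLASS="PROGRAMLISTING">foo\t<span>bar</span>\tbaz</PRE>'
protected = protect_tabs(sample)
print(protected)

# An HTML processor decodes &#09; back into a tab character.
print(html.unescape("foo&#09;bar") == "foo\tbar")
```

 Tabs outside <pre> are left alone, since only preformatted text
 depends on them for alignment.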

-- Hiroki




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120119.155736.1127622096127250170.hrs>