From owner-freebsd-doc@FreeBSD.ORG Thu Jan 19 06:59:31 2012
Date: Thu, 19 Jan 2012 15:57:36 +0900 (JST)
Message-Id: <20120119.155736.1127622096127250170.hrs@allbsd.org>
To: wblock@wonkity.com
Cc: freebsd-doc@FreeBSD.org
From: Hiroki Sato
Subject: Re: Tidy and HTML tab spacing
References: <20120119.084434.926306642968660094.hrs@allbsd.org>
X-PGPkey-fingerprint: BDB3 443F A5DD B3D0 A530 FFD7 4F2C D3D8 2793 CF2D
Mime-Version: 1.0
Content-Type: Multipart/Signed; protocol="application/pgp-signature";
 micalg=pgp-sha1;
 boundary="--Security_Multipart(Thu_Jan_19_15_57_36_2012_397)--"
Content-Transfer-Encoding: 7bit

----Security_Multipart(Thu_Jan_19_15_57_36_2012_397)--
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Warren Block wrote:

wb> > I think this will break the results because a newline just after ">"
wb> > is recognized as CDATA.
wb>
wb> A test run on the Porter's Handbook did seem to work:
wb> make -C /usr/ports/www/tidy-devel deinstall
wb> make clean book.html
wb> perl -0777 -i -pe
wb> 's/CLASS="PROGRAMLISTING"\n\>/CLASS="PROGRAMLISTING"\>\n/g' book.html
wb> make -C /usr/ports/www/tidy-devel install clean
wb> tidy -wrap 90 -m -raw -preserve -f /dev/null -asxml book.html

 Yes, but this just covers up the issue, because the column
 calculation by Tidy is based on the literal characters in the markup
 text, not on the result text.  For example, in the following line

 >[tab]foo

 Tidy expands [tab] to spaces based on the length of ">[tab]foo",
 regardless of the fact that ">" is not a character in the result
 text.  So, if we convert this into two lines like the following:

 >
 [tab]foo

 the expansion of the [tab] will be correct.

 However, this trick does not always work as intended.  One problem is
 that \n just after ">" means a newline in
 the preformatted text, not automatically ignored.  So, every program
 listing will have an empty line at the top.  Another is that this
 works only for one particular case.  For example:

foo[tab]bar[tab]baz
foo[tab]bar[tab]baz

 Inline markup in such a listing will be converted to an inline tag
 in the HTML output, and the two lines of "foo bar baz" will not be
 aligned because Tidy counts the characters of that tag for the tab
 expansion.  This cannot be solved by converting "\n>" to ">\n" at
 the end of a start tag in the HTML output.
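
 The misalignment described above can be reproduced with a small
 Python sketch.  It only models the behaviour as described in this
 message and is not Tidy's actual code.  The tab width of 8, the <tt>
 tag, and all function names are assumptions made for illustration.

```python
import re

TAB = 8  # assumed tab width


def strip_tags(text):
    """Return the result text: everything outside <...> tags."""
    return re.sub(r"<[^>]*>", "", text)


def expand_markup_columns(line, tab=TAB):
    """Expand tabs while counting every character, tags included."""
    return line.expandtabs(tab)


def expand_result_columns(line, tab=TAB):
    """Expand tabs while counting only characters outside tags.

    Simplification: character references such as &lt; are counted
    as several columns here, which a real HTML processor would not do.
    """
    out = []
    col = 0          # column in the result text, not in the markup
    in_tag = False
    for ch in line:
        if in_tag:
            out.append(ch)
            if ch == ">":
                in_tag = False
        elif ch == "<":
            in_tag = True
            out.append(ch)
        elif ch == "\t":
            pad = tab - col % tab
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            col += 1
    return "".join(out)


if __name__ == "__main__":
    plain = "foo\tbar\tbaz"
    tagged = "<tt>foo</tt>\tbar\tbaz"  # hypothetical inline tag
    for expand in (expand_markup_columns, expand_result_columns):
        print(expand.__name__)
        print(repr(strip_tags(expand(plain))))
        print(repr(strip_tags(expand(tagged))))
```

 With markup-based counting, "bar" starts at column 7 in the tagged
 line but at column 8 in the plain line.  With result-based counting
 both lines come out identical.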

 It is difficult to solve this issue completely, because the result
 text can be obtained only by a complete HTML processor such as a web
 browser.  I don't have a good solution, but I think it would not be a
 bad idea to modify Tidy so that it keeps the tab character in the
 result text (or replaces it with a character reference such as &#9;)
 and leaves the tab processing to web browsers.
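
 As a rough illustration of the replacement idea, the following Python
 sketch rewrites literal tabs inside preformatted blocks as &#9;.  It
 is a standalone filter, not a change to Tidy.  It assumes the program
 listings are rendered as <pre> elements, and the function name is
 made up for this example.  Whether Tidy then leaves the reference
 untouched has not been checked here.

```python
import re

# Assumption: program listings appear as <pre ...> ... </pre> in the HTML.
PRE_BLOCK = re.compile(r"(<pre\b[^>]*>)(.*?)(</pre>)",
                       re.IGNORECASE | re.DOTALL)


def tabs_to_char_refs(html):
    """Replace literal tabs inside <pre> elements with &#9;.

    Tabs outside <pre> are left alone.  A tab inside an inline tag
    within <pre> (for example between attributes) is replaced too,
    which this sketch does not try to avoid.
    """
    def fix(match):
        start, body, end = match.groups()
        return start + body.replace("\t", "&#9;") + end

    return PRE_BLOCK.sub(fix, html)


if __name__ == "__main__":
    sample = '<PRE CLASS="PROGRAMLISTING">foo\tbar\tbaz</PRE>'
    print(tabs_to_char_refs(sample))
```

 The browser then expands the tab against the result text, which is
 the only place where the correct column is known.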

-- Hiroki

----Security_Multipart(Thu_Jan_19_15_57_36_2012_397)--
Content-Type: application/pgp-signature
Content-Transfer-Encoding: 7bit

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (FreeBSD)

iEYEABECAAYFAk8XvuAACgkQTyzT2CeTzy2J0ACfQwN4NMOKea1HxWqGKyG4EYzB
locAoJltXwkFC83gR0yDFqnLjp1vgW4J
=9AE5
-----END PGP SIGNATURE-----

----Security_Multipart(Thu_Jan_19_15_57_36_2012_397)----