Date: Sat, 21 Apr 2012 10:12:56 +0200
From: Matthias Apitz <guru@unixarea.de>
To: Matthew Seaman <matthew@freebsd.org>
Cc: freebsd-questions@freebsd.org
Subject: Re: converting UTF-8 to HTML
Message-ID: <20120421081256.GA8769@tinyCurrent>
In-Reply-To: <4F925504.4090001@FreeBSD.org>
References: <20120421055823.GA6788@tinyCurrent> <4F925504.4090001@FreeBSD.org>

On Saturday, April 21, 2012 at 07:34:44AM +0100, Matthew Seaman wrote:
> www/tidy-devel
>
> (which is effectively a fork of the original www/tidy project, and has
> quite a lot of new functionality)
>
> If you specify 'ascii' for the output format, it should generate
> appropriate character escapes.
Thanks; it works fine if one specifies utf8 for input and ascii for
output in a config file .tidy like:
$ cat .tidy
output-xhtml: yes
add-xml-decl: no
doctype: strict
input-encoding: utf8
output-encoding: ascii
indent: auto
wrap: 76
repeated-attributes: keep-last
error-file: errs.txt
Then you can run tidy and get valid, pure-ASCII XHTML, for example:
$ echo 'ΜΙΣΟ ΛΙΤΡΟ ΑΘΩΣ ΚΟΚΚΙΝΟ ΠΑΡΑΚΑΛΩ' | tidy -config .tidy
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content=
"HTML Tidy for FreeBSD (vers 7 December 2008), see www.w3.org" />
<title></title>
</head>
<body>
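&Mu;&Iota;&Sigma;&Omicron; &Lambda;&Iota;&Tau;&Rho;&Omicron;
&Alpha;&Theta;&Omega;&Sigma;
&Kappa;&Omicron;&Kappa;&Kappa;&Iota;&Nu;&Omicron;
&Pi;&Alpha;&Rho;&Alpha;&Kappa;&Alpha;&Lambda;&Omega;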
</body>
</html>
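For anyone who wants to see what the ascii output encoding amounts to, here is a minimal Python sketch of the same idea: every character above 127 is replaced by a named HTML entity where one exists, and by a numeric reference otherwise. This is only an illustration and not how tidy is implemented; the function name to_ascii_html is made up for this example, and unlike tidy it does not parse or repair markup, nor does it escape &, < or >.

```python
from html.entities import codepoint2name  # HTML 4 entity table from the standard library


def to_ascii_html(text: str) -> str:
    """Replace each non-ASCII character with an HTML character reference.

    Hypothetical helper for illustration only; ASCII characters
    (including &, < and >) are passed through unchanged.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)                         # plain ASCII stays as it is
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")  # named entity, e.g. &Mu;
        else:
            out.append(f"&#{cp};")                 # numeric fallback
    return "".join(out)


if __name__ == "__main__":
    print(to_ascii_html("ΜΙΣΟ ΛΙΤΡΟ"))
```

If only numeric references are wanted, the built-in text.encode("ascii", "xmlcharrefreplace") does that in one call.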
This is exactly what I was looking for. Thanks
matthias
--
Matthias Apitz
t +49-89-61308 351 - f +49-89-61308 399 - m +49-170-4527211
e <guru@unixarea.de> - w http://www.unixarea.de/
UNIX since V7 on PDP-11 | UNIX on mainframe since ESER 1055 (IBM /370)
UNIX on x86 since SVR4.2 UnixWare 2.1.2 | FreeBSD since 2.2.5
