Date: Mon, 24 Mar 2003 04:07:45 +0200
From: Giorgos Keramidas <keramida@ceid.upatras.gr>
To: Jeroen Ruigrok/asmodai <asmodai@wxs.nl>
Cc: freebsd-doc@FreeBSD.ORG
Subject: Re: docs/50211: [PATCH] Fix textfile creation
Message-ID: <20030324020745.GA22656@gothmog.gr>
In-Reply-To: <200303231710.h2NHAGEb024196@freefall.freebsd.org>
References: <200303231710.h2NHAGEb024196@freefall.freebsd.org>
On 2003-03-23 09:10, Jeroen Ruigrok/asmodai <asmodai@wxs.nl> wrote:
>-On [20030323 15:07], Ceri Davies (ceri@FreeBSD.org) wrote:
>> We discussed this on -doc a month or so ago, and were generally thinking of
>> going back to www/lynx, because this also gets localized text builds working.
>
> Problem I had with lynx was that I was unable to make it parse
> book.html-tex as text/html.
> w3m has a -T flag for this, elinks just looks at the file itself, or
> perhaps just assumes it is HTML.
>
> >Would you happen to know if elinks has this advantage too ?
>
> It does, but I don't know for certain for which languages it all works:
>
> elinks -dump -dump-charset iso-8859-15 http://www.paris.fr/
>
> gives me accent aigus, accent circumflexes, etc.
>
> I would be interested in hearing about non-Latin-based examples and how
> they work out.

Fetching... I'll try it with a Greek document in a while.

    giorgos@gothmog[03:57]/tmp$ elinks -dump-charset ISO-8859-7 -dump 1 lala.html
       AAeec,ieeue eaass`iaaii.
    giorgos@gothmog[03:57]/tmp$ grep charset lala.html
    <meta name="http-equiv" content="text/html; charset="ISO-8859-7">

Hrmf... Doesn't quite work. At least, it doesn't work without tweaking
the ~/.elinks files and stuff. This is bad, because we can't use elinks
for batch-mode conversion of many different languages and charsets
without first configuring it through the curses interface.

There is an -eval command line option that should probably work fine
with non ISO-8859-1 texts, when used as:

    elinks -eval 'set document.codepage.assume = "ISO-8859-7"' \
           -eval 'set terminal.vt220.charset = "ISO-8859-7"' \
           -dump 1 lala.html

but I can't seem to find any good way of making this output raw 8-bit
text for Greek :(

And I even have my locale set up for Greek:

    giorgos@gothmog[04:06]/tmp$ env | grep LC
    LC_COLLATE=el_GR.ISO8859-7
    LC_CTYPE=el_GR.ISO8859-7

- Giorgos

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-doc" in the body of the message
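For the batch case, here is a rough sketch of how the -eval invocation above could be generated per file, rather than configured once through the curses interface. The helper names charset_of and dump_cmd are made up for illustration, and the ISO-8859-1 fallback is an assumption. The sketch only prints the elinks command line, reusing the options already shown above; it does nothing about the raw 8-bit output problem.

```shell
#!/bin/sh
# Hypothetical helper: guess a document's charset from its (lowercase)
# charset= declaration. Falls back to ISO-8859-1, which is an assumption.
charset_of() {
    cs=$(sed -n 's/.*charset="\{0,1\}\([A-Za-z0-9_-]*\).*/\1/p' "$1" | head -n 1)
    echo "${cs:-ISO-8859-1}"
}

# Hypothetical helper: print (not run) the elinks command for one file,
# using the same -eval settings as in the message above.
# Note: file names containing spaces are not quoted in the printed command.
dump_cmd() {
    cs=$(charset_of "$1")
    printf "elinks -eval 'set document.codepage.assume = \"%s\"' -eval 'set terminal.vt220.charset = \"%s\"' -dump 1 %s\n" "$cs" "$cs" "$1"
}
```

Something like `for f in *.html; do dump_cmd "$f"; done` would then list one command per document, which can be inspected before any of them is actually run.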