From owner-freebsd-questions@FreeBSD.ORG Fri May 26 16:45:16 2006
Message-ID: <44773030.6030309@eftel.com>
Date: Sat, 27 May 2006 00:43:28 +0800
From: Adrian Pavone
To: questions@freebsd.org
References: <7.0.1.0.2.20060526180029.0224b418@broadpark.no>
In-Reply-To: <7.0.1.0.2.20060526180029.0224b418@broadpark.no>
Subject: Re: textproc: Typesetting holy content

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Kyrre Nygard wrote:
>
> Hello!
>
> I hope this is not too off topic.
>
> I'm involved in some studies here, on the authority of holy scriptures.
>
> I am trying to transcribe The Noble Qur'an, by some said to be the most
> elegant book ever written, into LaTeX format. That way I can format it
> the way I wish, and study it at my own premises.
>
> I began to wget -m http://www.usc.edu/dept/MSA/quran/
>
> Which gave me 001.qmt.html all the way up to 114.qmt.html.
>
> Next, I ran this:
>
> for i in `find -s . -name "*.html"`; do w3m -dump "$i" > "${i%.html}.txt"; echo "${i%.html}.txt"; done
>
> And ended up with 001.qmt.txt all the way up to 114.qmt.txt.
>
> Then, I took 001.qmt.txt, which looked like this:
>
> --
>
> USC
> USC
> Compendium of Muslim Texts
>
> Fundamentals
> Allah
> Muhammad
> Qur'an
> Sunnah
> Pillars
>
> Special Topics
> Economics
> History
> Human Relations
> Law
> Misconceptions About Islam
> Politics
>
> Tools
> Qur'an Search
> Hadeeth Search
> Glossary
>
> Translations of the Qur'an, Chapter 1:
>
> AL-FATIHA (THE OPENING)
>
> Total Verses: 7
> Revealed At: MAKKA
> Maududi's introduction
>
> -------------------------------------------------------------------------------
>
>
> 001.001
> YUSUFALI: In the name of Allah, Most Gracious, Most Merciful.
> PICKTHAL: In the name of Allah, the Beneficent, the Merciful.
> SHAKIR: In the name of Allah, the Beneficent, the Merciful.
>
> 001.002
> YUSUFALI: Praise be to Allah, the Cherisher and Sustainer of the worlds;
> PICKTHAL: Praise be to Allah, Lord of the Worlds,
> SHAKIR: All praise is due to Allah, the Lord of the Worlds.
>
> 001.003
> YUSUFALI: Most Gracious, Most Merciful;
> PICKTHAL: The Beneficent, the Merciful.
> SHAKIR: The Beneficent, the Merciful.
>
> 001.004
> YUSUFALI: Master of the Day of Judgment.
> PICKTHAL: Master of the Day of Judgment,
> SHAKIR: Master of the Day of Judgment.
>
> 001.005
> YUSUFALI: Thee do we worship, and Thine aid we seek.
> PICKTHAL: Thee (alone) we worship; Thee (alone) we ask for help.
> SHAKIR: Thee do we serve and Thee do we beseech for help.
>
> 001.006
> YUSUFALI: Show us the straight way,
> PICKTHAL: Show us the straight path,
> SHAKIR: Keep us on the right path.
>
> 001.007
> YUSUFALI: The way of those on whom Thou hast bestowed Thy Grace, those whose
> (portion) is not wrath, and who go not astray.
> PICKTHAL: The path of those whom Thou hast favoured; Not the (path) of those
> who earn Thine anger nor of those who go astray.
> SHAKIR: The path of those upon whom Thou hast bestowed favors. Not (the path)
> of those upon whom Thy wrath is brought down, nor of those who go astray.
>
> Sponsored by the MSA.
>
> --
>
> And transformed it into LaTeX format:
>
> --
>
> \documentclass[11pt,a4paper,oneside,english]{book}
> \begin{document}
>
> \title{The Noble Qur'an}
>
> \tableofcontents{}
>
> \chapter{AL-FATIHA (THE OPENING)}
>
> 001.001 In the name of Allah, Most Gracious, Most Merciful.
> 001.002 Praise be to Allah, the Cherisher and Sustainer of the worlds;
> 001.003 Most Gracious, Most Merciful;
> 001.004 Master of the Day of Judgment.
> 001.005 Thee do we worship, and Thine aid we seek.
> 001.006 Show us the straight way,
> 001.007 The way of those on whom Thou hast bestowed Thy Grace, those whose (portion) is not wrath, and who go not astray.
>
> --
>
> Basically what I did manually on the first file is what I intend to do
> automatically with all the other files. The format remains the same,
> however the quantity of text will differ.
>
> The process, to be done on each of my now *.txt files, would look
> something like this:
>
> 1
> Cut out everything before line 27.
>
> 2
> Take line 27, and embody it. So if line 27 says "HELLO", it will become:
>
> \chapter{HELLO}
>
> 3
> Cut out everything following line 27 until a NNN.NNN (verse indication)
> appears.
>
> 4
> Join the NNN.NNN with the below line and cut out "YUSUFALI:"
>
> 5
> Join all lines below the "YUSUFALI:" line ...
>
> 6
> Until the "PICKTHAL:" line appears.
> Then, delete it and all lines below it until the next NNN.NNN appears.
>
> The reason is that the University of Southern California compilation
> displays three different English translations and I'd only be interested
> in the first one.
>
> For instance, this:
>
> --
>
> 004.054
> YUSUFALI: Or do they envy mankind for what Allah hath given them of his bounty?
> but We had already given the people of Abraham the Book and Wisdom, and
> conferred upon them a great kingdom.
> PICKTHAL: Or are they jealous of mankind because of that which Allah of His
> bounty hath bestowed upon them? For We bestowed upon the house of Abraham (of
> old) the Scripture and wisdom, and We bestowed on them a mighty kingdom.
> SHAKIR: Or do they envy the people for what Allah has given them of His grace?
> But indeed We have given to Ibrahim's children the Book and the wisdom, and We
> have given them a grand kingdom.
>
> --
>
> Would simply become this, in one line:
>
> --
>
> 004.054 Or do they envy mankind for what Allah hath given them of his bounty? but We had already given the people of Abraham the Book and Wisdom, and conferred upon them a great kingdom.
>
> --
>
> 7
> When the next NNN.NNN appears, treat it like the rest.
>
> Thank you! Really! For bearing with me so far!
>
> Indeed, this is what I wish to achieve.
>
> I realize now though, after writing all this down, that it might be too
> much for some.
> I hope that is not the case with he who has been endowed with the
> ability to help me.
>
> All the best,
> Kyrre
>
> _______________________________________________
> freebsd-questions@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-questions
> To unsubscribe, send any mail to
> "freebsd-questions-unsubscribe@freebsd.org"
>

Well, sounds to me like the perfect reason to learn how to write a shell
script. You already have your algorithm/method clearly defined; now you
just need something to automate it.
A shell script is the obvious tool for that. If you need any help with
shells and shell scripting, a wealth of information is just a Google
search away ;).

Just my $0.02

Regards,
Adrian

- --
This email address has expired. Please contact me for my new address.
Please obtain my pgp public key from pgp.mit.edu before sending me any
private mail, otherwise your email will likely be filtered incorrectly
and possibly junked.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (FreeBSD)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEdzAw0JHtFv5fxW8RAiikAJ44H+mNLm6bk1409G+3PAcdjnSt9QCfZGDB
+zKOQkEvU3qi4jSKaohyk/4=
=guVJ
-----END PGP SIGNATURE-----
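
For anyone following the thread, the seven steps in the quoted message could be sketched roughly as a shell function wrapping awk. This is only a sketch under a few assumptions that are not in the original post: the function name qmt2tex is made up for illustration, the chapter title is found as the first non-blank line after the "Translations of the Qur'an, Chapter N:" line (instead of counting to line 27), and LaTeX special characters in the verse text are not escaped.

```sh
#!/bin/sh
# qmt2tex (hypothetical name): turn one or more w3m text dumps, such as
# 001.qmt.txt, into a \chapter line followed by one line per verse,
# keeping only the YUSUFALI translation.
qmt2tex() {
    awk '
    # Print the verse collected so far, if any, and reset the buffer.
    function flush() {
        if (verse != "") print verse
        verse = ""
    }
    # Assumption: the chapter title follows this heading line.
    /^Translations of the Qur.an, Chapter/ {
        flush(); keep = 0; want_title = 1; next
    }
    # First non-blank line after the heading becomes the chapter title.
    want_title && NF {
        print "\\chapter{" $0 "}"
        print ""
        want_title = 0
        next
    }
    # A verse number (NNN.NNN) on its own line starts a new verse.
    /^[0-9][0-9][0-9]\.[0-9][0-9][0-9][ \t]*$/ {
        flush(); num = $1; keep = 0; next
    }
    # Join the verse number with the YUSUFALI text, dropping the label.
    /^YUSUFALI:/ {
        sub(/^YUSUFALI: */, "")
        verse = num " " $0
        keep = 1
        next
    }
    # Stop collecting at the other translations and at the page footer.
    /^(PICKTHAL|SHAKIR):/ { keep = 0; next }
    /^Sponsored by/       { keep = 0; next }
    # Continuation lines of the YUSUFALI text are appended to the verse.
    keep && NF { verse = verse " " $0 }
    END { flush() }
    ' "$@"
}
```

Something like `for f in [0-9][0-9][0-9].qmt.txt; do qmt2tex "$f"; done > body.tex` would then collect all chapters into one file, which could be pulled into the LaTeX document with \input{body}. Matching on the heading text rather than on a fixed line number is a deliberate choice here, since the number of menu lines before the title may differ between dumps.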