From owner-freebsd-hackers Sat Feb 15 14:11:26 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id OAA15862 for hackers-outgoing; Sat, 15 Feb 1997 14:11:26 -0800 (PST)
Received: from news.IAEhv.nl (root@news.IAEhv.nl [194.151.64.4]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id OAA15838 for ; Sat, 15 Feb 1997 14:11:16 -0800 (PST)
Received: from LOCAL (uucp@localhost) by news.IAEhv.nl (8.6.13/1.63) with IAEhv.nl; pid 29248 on Sat, 15 Feb 1997 23:11:07 +0100; id XAA29248 efrom: peter@grendel.IAEhv.nl; eto: hackers@FreeBSD.ORG
Received: (from peter@localhost) by grendel.IAEhv.nl (8.8.4/8.8.4) id WAA00487; Sat, 15 Feb 1997 22:43:20 +0100 (MET)
Message-ID:
Date: Sat, 15 Feb 1997 22:43:20 +0100
From: peter@grendel.IAEhv.nl (Peter Korsten)
To: hackers@FreeBSD.ORG
Subject: Re: MIME applications for FreeBSD
References: <199702130839.TAA00435@freebsd1.cimlogic.com.au> <199702121715.KAA00715@phaeton.artisoft.com>
X-Mailer: Mutt 0.58-PL15
Mime-Version: 1.0
In-Reply-To: <199702130839.TAA00435@freebsd1.cimlogic.com.au>; from John Birrell on Feb 13, 1997 19:38:59 +1100
Sender: owner-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

John Birrell shared with us:
>
> We are prevented from reverse engineering by the licence for msword
> (I guess, since other MS products have that clause). MS is unlikely
> to publicly document Word file format.

Actually, someone (a student?) at the Technical University of Berlin
has described the general Microsoft OLE file format. He states that he
isn't sure about the format and all variables, so he calls it the
"LAOLA" file format. (Check it out with Alta Vista!) He also provides
a library in Perl and a program to convert an MS Word document into an
ASCII file.

After some studying (the description wasn't clear on every point) I've
written a converter in C (but during office hours, so I can't publish
it). Writing your own shouldn't take you more than a couple of days,
probably less.

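To give an idea of the first step such a converter has to take, here is
a minimal sketch in Python (not the Perl library or the C converter
mentioned above). It only checks whether a file starts with the OLE
container signature and reads a few header fields. The field offsets
are an assumption taken from the compound file layout that Microsoft
documented later; the LAOLA description may name or interpret them
differently. The names OLE_MAGIC and parse_ole_header are made up for
this example.

```python
import struct

# Eight-byte signature found at the start of OLE compound files.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def parse_ole_header(data: bytes):
    """Return a few header fields, or None if data is not an OLE container.

    Offsets (assumed from the later-published compound file layout):
      30: sector shift (uint16, little-endian)
      32: mini-sector shift (uint16, little-endian)
      48: first directory sector (uint32, little-endian)
    """
    if len(data) < 512 or data[:8] != OLE_MAGIC:
        return None
    sector_shift, mini_shift = struct.unpack_from("<HH", data, 30)
    (first_dir_sector,) = struct.unpack_from("<I", data, 48)
    return {
        "sector_size": 1 << sector_shift,
        "mini_sector_size": 1 << mini_shift,
        "first_dir_sector": first_dir_sector,
    }
```

Feeding it the first 512 bytes of a Word document should give a
dictionary; anything else gives None. Getting at the actual text still
means walking the directory and following the sector chains, which this
sketch doesn't attempt.
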
Though the description talks about the Word 6 format, I could easily
read a Word 97 file. Really funny, 'cause Word 6 can't read Word 97
files.

The Microsoft file format actually has some good points. They put a
whole directory tree into it, it seems. Trouble is, the files get *so*
big. That really is ridiculous. Some 40 characters for test purposes
were converted into 19 kB.

On another note, I once stumbled across a page with many file formats.
On "request" of Microsoft, the Word format was removed. They really
don't want anyone to know.

- Peter

--
Peter Korsten | peter@grendel.IAEhv.nl (UUCP) | peterk@IAEhv.nl
C/C++/Perl/Java hacker