Date: Tue, 9 Aug 2011 15:39:47 +0100
From: Anton Shterenlikht <mexas@bristol.ac.uk>
To: Rod Person <rodperson@rodperson.com>
Cc: Anton Shterenlikht <mexas@bristol.ac.uk>, freebsd-questions@freebsd.org
Subject: Re: extracting text from docx files
Message-ID: <20110809143947.GA39516@mech-cluster241.men.bris.ac.uk>
In-Reply-To: <20110809094026.dea10d7a.rodperson@rodperson.com>
References: <20110809133632.GA37445@mech-cluster241.men.bris.ac.uk> <20110809094026.dea10d7a.rodperson@rodperson.com>
On Tue, Aug 09, 2011 at 09:40:26AM -0400, Rod Person wrote:
> On Tue, 9 Aug 2011 14:36:32 +0100
> Anton Shterenlikht <mexas@bristol.ac.uk> wrote:
>
> > Usually I unzip a docx and then search
> > through all *xml files to find the
> > useful data. However, I can't find any
> > xml styles to use, so I have to convert
> > the relevant xml file(s) to plain text
> > by hand. I wonder if anybody can suggest
> > a better way. Perhaps there's something
> > in ports that can help.
>
> You could try this for just plain text conversion
> http://docx2txt.sourceforge.net/

Thank you

Anton

--
Anton Shterenlikht
Room 2.6, Queen's Building
Mech Eng Dept
Bristol University
University Walk, Bristol BS8 1TR, UK
Tel: +44 (0)117 331 5944
Fax: +44 (0)117 929 4423
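
For reference, the unzip-and-search approach described in the quoted message can be roughly sketched in Python using only the standard library. This is a minimal sketch, not how docx2txt works internally. It assumes the main body text lives in word/document.xml inside the archive, as is usual for docx files, and that one line per paragraph is an acceptable plain-text form. The function name docx_to_text is made up for illustration.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(source):
    """Return the body text of a docx, one line per paragraph.

    `source` may be a file name or a binary file-like object.
    The function name is hypothetical; only the main document part
    is read (no headers, footers, footnotes or comments).
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across runs; each run keeps
        # its characters in one or more <w:t> elements.
        runs = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


if __name__ == "__main__" and len(sys.argv) > 1:
    print(docx_to_text(sys.argv[1]))
```

Saved under a name of your choosing, it would be run with the docx file as its only argument. Tabs, explicit line breaks and tables are not handled specially, and paragraphs nested inside other paragraphs (text boxes, for example) may appear twice, so a dedicated converter is likely the better choice for anything beyond simple documents.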
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20110809143947.GA39516>