Date: Tue, 9 Aug 2011 10:25:30 -0700
From: Kurt Buff <kurt.buff@gmail.com>
To: freebsd-questions@freebsd.org
Subject: Re: extracting text from docx files
Message-ID: <CADy1Ce4bECa4WCpOcAFCywuM3rVcjvbJT5=ZznSOWqCBb7zumg@mail.gmail.com>
In-Reply-To: <20110809133632.GA37445@mech-cluster241.men.bris.ac.uk>
References: <20110809133632.GA37445@mech-cluster241.men.bris.ac.uk>

On Tue, Aug 9, 2011 at 06:36, Anton Shterenlikht <mexas@bristol.ac.uk> wrote:
> I often receive information in *.docx format
> from my MS using colleagues. Sometimes I can
> ask for a pdf (or similar) instead, but not always.
>
> Usually I unzip a docx and then search
> through all *xml files to find the
> useful data. However, I can't find any
> xml styles to use, so I have to convert
> the relevant xml file(s) to plain text
> by hand. I wonder if anybody can suggest
> a better way. Perhaps there's something
> in ports that can help.

My installation of OpenOffice 3.3 on my Win7 machine will open a
Winword 2010 .docx file. I'm guessing it will do the same on FreeBSD,
but I don't have an install with a GUI running at the moment.

Kurt
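
The unzip-and-read-the-XML approach from the quoted message could probably be scripted rather than done by hand. Below is a minimal Python sketch using only the standard library. It assumes the body text lives in word/document.xml inside the archive and that paragraphs and text runs use the usual WordprocessingML elements (w:p, w:t, w:tab, w:br). The function name docx_to_text is made up for illustration; headers, footnotes and text boxes are not handled.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(source):
    """Return the body text of a .docx file, one line per paragraph.

    `source` is a path or a file-like object. The name is hypothetical;
    only word/document.xml is read (an assumption about where the
    useful text is).
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        pieces = []
        for node in para.iter():
            if node.tag == f"{{{W_NS}}}t":        # literal text run
                pieces.append(node.text or "")
            elif node.tag == f"{{{W_NS}}}tab":    # tab character
                pieces.append("\t")
            elif node.tag == f"{{{W_NS}}}br":     # line break inside a paragraph
                pieces.append("\n")
        paragraphs.append("".join(pieces))
    return "\n".join(paragraphs)


if __name__ == "__main__":
    # Print the text of every .docx file named on the command line.
    for path in sys.argv[1:]:
        print(docx_to_text(path))
```

Saved as, say, docx2txt.py (a hypothetical file name), it would be run as `python docx2txt.py report.docx > report.txt`.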