Date: Fri, 28 May 2010 09:00:57 +0200
From: Polytropon
To: Gary Kline
Cc: FreeBSD Mailing List
Subject: Re: any shortcuts to doc to ascii?
Message-Id: <20100528090057.87144ef4.freebsd@edvax.de>
In-Reply-To: <20100527233607.GD19297@thought.org>
References: <20100527013843.GA40751@thought.org> <20100527050302.da39c258.freebsd@edvax.de> <20100527233607.GD19297@thought.org>
Organization: EDVAX

On Thu, 27 May 2010 16:36:08 -0700, Gary Kline wrote:
> i don't see any ascii suffix [for OOo]. i saved as .txt.

This should be right. The .txt extension refers to ASCII text,
at least in standard-compliant operating systems.

> same krap. the \x94, x9d, \x9c... same with catdoc. i'll
> try antiword. [forgot about that.]

This makes me believe that the original DOC file has been
created with a wrong character set or language setting.
"Windows" - as far as I know - does not use standard locales
the way other systems do, but its own code page settings.

Another idea may be that the character that you think should
be an apostrophe isn't an apostrophe. I often see this in
German texts with misplaced apostrophes that are in fact a
grave accent or an acute accent, or some other Unicode
character that just looks like an apostrophe.

For example, if the original document contains

	We don`t

and this ` is not a real ', then conversion tools will of
course use the "escape notation" for this unknown character.

Other characters that may lead to such "escape notation"
replacements are quotation marks (usually typographical
ones), ellipses and hyphens.

I know I'm saying this too often, but you wouldn't have such
problems with LaTeX. :-)

> > I'm not sure in how far conflicting codepages may be involved.
> > It is known that "Windows" does have problems supporting standards,
> > and this applies to character sets and language variations, too.
> >
>
> your words could be emblazoned in 24k gold on some Monument
> of Truth.

It's my job - I'm working for the Ministry of Truth. :-)

> i've been fighting going for mac to OOo and back...

Keep on fighting - I've got a new idea. It's much more
complicated than using OpenOffice for conversion - but it
MIGHT work.

1. Open the DOC file in OpenOffice.

2. Mark all content you want to convert, e. g. Ctrl+A.

3. Copy it into the edit buffer, Ctrl+C.

4. Open KDE's text editor (or any other text editor you
   have installed) and paste the edit buffer, Ctrl+V.

5. Save the file you now have in the editor.

It should be all in ASCII and with correct interpretation of
"special characters".
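
If typographical characters still survive the copy and paste,
they can be mapped to ASCII with a small script. The \x9c and
\x9d in the quoted output happen to be the last bytes of the
UTF-8 encodings of the typographical double quotes (E2 80 9C
and E2 80 9D), and \x94 the last byte of the em dash
(E2 80 94), which would fit the guess above. What follows is
only a sketch in Python, assuming the text is available as
UTF-8; the replacement table and the name to_ascii are made up
for illustration and cover just the kinds of characters
mentioned above.

```python
# Hypothetical helper: map common typographical characters to ASCII.
# The table is an assumption, not a complete list.
REPLACEMENTS = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark (typographical apostrophe)
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00b4": "'",    # acute accent misused as apostrophe
    "`": "'",         # grave accent misused as apostrophe
}

TABLE = str.maketrans(REPLACEMENTS)


def to_ascii(text):
    """Replace known characters; anything else non-ASCII becomes '?'."""
    translated = text.translate(TABLE)
    return translated.encode("ascii", errors="replace").decode("ascii")


print(to_ascii("We don\u2019t \u201cknow\u201d \u2026"))
# prints: We don't "know" ...
```

Characters that are not in the table end up as "?" instead of
being dropped silently, so the places that still need a manual
substitution are easy to find afterwards.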
Because I don't have a test setting here, I cannot predict
whether it will compensate for malformed encodings, but if
OpenOffice shows a character as an apostrophe, it should be
transferred exactly as that through the edit buffer.

> ps: antiword same as catdoc. back to my per substitutions.
> that works, along with vi's Builtin subs.

The joy of modern programs: You start to do everything
manually again. :-)

-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...