From owner-freebsd-questions@FreeBSD.ORG Fri Oct 9 17:01:27 2009
From: Oliver Fromme
Message-Id: <200910091701.n99H19sq028830@lurza.secnetix.de>
To: wblock@wonkity.com (Warren Block)
Cc: freebsd-questions@freebsd.org
Date: Fri, 9 Oct 2009 19:01:09 +0200 (CEST)
Subject: Re: for perl wizards.

Warren Block wrote:
> Oliver Fromme wrote:
> > Gary Kline wrote:
> > >
> > > Whenever I save a word processor file [OOo, say] into a
> > > text file, I get a slew of hex codes to indicate the char to be
> > > used. I'm looking for a perl one-liner or script to translate
> > > hex back into ', ", -- [that's a dash], and so forth.
> > > Why does this fail to translate the hex code to an apostrophe?
> > >
> > > perl -pi.bak -e 's/\xe2\x80\x99/'/g'
> >
> > You need to escape the inner quote character, of course.
> > I think sed is better suited for this task than perl.
>
> That's twice now people have suggested sed instead of perl. Why? For
> many uses, perl is a better sed than sed. The regex engine is far more
> powerful and escapes are much simpler.

Neither powerful regexes nor escapes will help in this case.
A simple basic regex is more than sufficient (in fact this
isn't even a regex, it's a fixed string). And the escaping
is a problem of the shell, not of perl or sed: the inner quote
ends the shell's single-quoted string before perl ever sees
the script.

And by the way, I strongly disagree that perl's escapes are
much simpler. In my opinion perl has the most complex
escaping and quoting I have seen in any language so far.

The basic UNIX philosophy is to use the smallest or simplest
tool that does the job. In this case that's clearly sed.
(Not to mention the fact that perl isn't even in FreeBSD's
base system, so it might not be available at all.)

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Commercial register: Registergericht München, HRA 74606, management:
secnetix Verwaltungsgesellsch. mbH, commercial register: Registergericht
München, HRB 125758, managing directors: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD services, products and more: http://www.secnetix.de/bsd

"If you think C++ is not overly complicated, just what is a protected
abstract virtual base pure virtual private destructor, and when was the
last time you needed one?"
        -- Tom Cargill, C++ Journal
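
The shell-quoting point made in the reply can be illustrated with a minimal sketch. It assumes a POSIX sh and input containing the UTF-8 right single quotation mark (bytes 0xE2 0x80 0x99). The function name fix_apostrophes is hypothetical and not from the thread. Because \xHH escapes are not portable across sed implementations, the bytes are produced with printf and octal escapes instead.

```shell
# Hypothetical helper (name is an assumption, not from the thread):
# replace the UTF-8 right single quotation mark (0xE2 0x80 0x99)
# with a plain ASCII apostrophe, reading stdin and writing stdout.
fix_apostrophes() {
    # POSIX printf understands octal escapes: \342\200\231 == \xe2\x80\x99.
    rsquo=$(printf '\342\200\231')
    # Double quotes let the shell expand $rsquo and allow a literal '
    # inside the sed script, so no inner quote needs escaping.
    # LC_ALL=C makes sed treat the input as plain bytes.
    LC_ALL=C sed "s/$rsquo/'/g"
}

# The perl one-liner from the thread fails because the inner ' ends the
# shell's single-quoted string. The usual idiom is '\'' (close the
# quote, add an escaped quote, reopen the quote):
#   perl -pi.bak -e 's/\xe2\x80\x99/'\''/g' file

# Example: "it" + curly apostrophe + "s" is printed as "it's".
result=$(printf 'it\342\200\231s\n' | fix_apostrophes)
printf '%s\n' "$result"
```

Used as a filter, this would look like `fix_apostrophes < input.txt > output.txt`, where both file names are placeholders.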