From owner-freebsd-questions@FreeBSD.ORG  Fri May 23 15:23:26 2008
Return-Path: <owner-freebsd-questions@FreeBSD.ORG>
Delivered-To: freebsd-questions@FreeBSD.ORG
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 81C1B1065679
	for <freebsd-questions@FreeBSD.ORG>;
	Fri, 23 May 2008 15:23:26 +0000 (UTC)
	(envelope-from olli@lurza.secnetix.de)
Received: from lurza.secnetix.de (unknown [IPv6:2a01:170:102f::2])
	by mx1.freebsd.org (Postfix) with ESMTP id D32978FC1E
	for <freebsd-questions@FreeBSD.ORG>;
	Fri, 23 May 2008 15:23:25 +0000 (UTC)
	(envelope-from olli@lurza.secnetix.de)
Received: from lurza.secnetix.de (localhost [127.0.0.1])
	by lurza.secnetix.de (8.14.1/8.14.1) with ESMTP id m4NFNOsx024116;
	Fri, 23 May 2008 17:23:24 +0200 (CEST)
	(envelope-from oliver.fromme@secnetix.de)
Received: (from olli@localhost)
	by lurza.secnetix.de (8.14.1/8.14.1/Submit) id m4NFNOwO024115;
	Fri, 23 May 2008 17:23:24 +0200 (CEST) (envelope-from olli)
Date: Fri, 23 May 2008 17:23:24 +0200 (CEST)
Message-Id: <200805231523.m4NFNOwO024115@lurza.secnetix.de>
From: Oliver Fromme <olli@lurza.secnetix.de>
To: freebsd-questions@FreeBSD.ORG, karel@inetis.com
In-Reply-To: <4833CBAC.801@inetis.com>
X-Newsgroups: list.freebsd-questions
User-Agent: tin/1.8.3-20070201 ("Scotasay") (UNIX)
	(FreeBSD/6.2-STABLE-20070808 (i386))
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-2.1.2
	(lurza.secnetix.de [127.0.0.1]);
	Fri, 23 May 2008 17:23:24 +0200 (CEST)
Cc: 
Subject: Re: Sed, shell and hexadecimal character codes
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: User questions <freebsd-questions.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
	<mailto:freebsd-questions-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-questions>
List-Post: <mailto:freebsd-questions@freebsd.org>
List-Help: <mailto:freebsd-questions-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
	<mailto:freebsd-questions-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 23 May 2008 15:23:26 -0000

Karel Miklav wrote:
 > There's a tip in the FreeBSD fortunes database that says:
 > 
 > > Want to strip UTF-8 BOM (Byte Order Mark) from given files?
 > > 
 > > sed -e '1s/^\xef\xbb\xbf//' < bomfile > newfile

FreeBSD's sed(1) doesn't support hexadecimal or octal
escape sequences.  GNU sed does accept \xHH as an extension,
so you might try that one (/usr/ports/textproc/gsed).

I don't know why that fortunes entry exists.  It's wrong.
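
To see what a particular sed does with the escapes, you can
feed it a line that starts with the three BOM bytes and check
whether they come out stripped.  This is only a rough sketch:
the function name sed_strips_hex_bom is made up for this
example, and it assumes that printf(1) understands octal
escapes (POSIX requires that).

```shell
# Hypothetical helper: succeeds if the given sed binary understands
# \xHH escapes in a regular expression, fails otherwise.
sed_strips_hex_bom() {
    sed_bin=${1:-sed}
    # printf(1) takes octal escapes: 357 273 277 are the bytes EF BB BF.
    out=$(printf '\357\273\277x\n' |
        LC_ALL=C "$sed_bin" -e '1s/^\xef\xbb\xbf//')
    # If the escapes were understood, only the "x" is left.
    [ "$out" = "x" ]
}

if sed_strips_hex_bom sed; then
    echo "this sed understands hex escapes"
else
    echo "this sed does not understand hex escapes"
fi
```

Pass a different binary name as the first argument to test
another implementation.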

 > I can't make it work, and I can't find any other method to
 > work with hex codes in scripts or on the command line, so
 > I'm kinda depressed :) I get by with xxd for now, but if
 > it is possible to avoid it, I'd like to hear about it.

Unfortunately, there is no common standard for octal and
hexadecimal escape sequences across these tools, so you
have to consult each tool's manual page to find out what it
supports.  For example, tr(1) supports octal sequences only
(no hexadecimal), while awk(1) supports both.  So the above
line could be rewritten with awk:

awk '{if(NR==1)sub(/^\xef\xbb\xbf/, "");print}' < bomfile > newfile

That's essentially the same instruction as the sed
one above, but awk is a little more verbose:

"1" in sed means that the following command should only
affect the first line.  That's what "if(NR==1)" does in
awk.

"s/OLD/NEW/" is the substitution command in sed.  In awk
it looks like "sub(/OLD/, "NEW")".

Finally, sed prints all resulting lines by default, while
awk has to be told with an explicit "print" command.
(awk prints a line automatically only when a pattern is
given without any action.)
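
If you would rather stay with sed, or your awk turns out not
to know \x, octal escapes are the safer bet: POSIX requires
both printf(1) and awk to understand \ddd.  The following is
a sketch under a few assumptions, not a tested recipe for
every system: it reuses the file names bomfile and newfile
from above, adds a made-up newfile2 for comparison, and sets
LC_ALL=C so that the tools treat the input as plain bytes.
Run it in a scratch directory, because it overwrites those
files.

```shell
# Create a test file that starts with a UTF-8 BOM
# (EF BB BF in hex is 357 273 277 in octal).
printf '\357\273\277first line\nsecond line\n' > bomfile

# sed: let printf generate the three bytes, then splice them
# into the sed script, so sed itself needs no escape support.
bom=$(printf '\357\273\277')
LC_ALL=C sed -e "1s/^${bom}//" < bomfile > newfile

# awk: the same program as above, with octal instead of
# hexadecimal escapes.
LC_ALL=C awk '{if(NR==1)sub(/^\357\273\277/, "");print}' < bomfile > newfile2

# Both results should be identical; od(1) shows the first bytes
# so you can verify that the BOM is gone.
cmp newfile newfile2 && od -c newfile | head -n 2
```

The sed variant works because the shell expands ${bom} inside
the double quotes before sed ever sees the script.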

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

'Instead of asking why a piece of software is using "1970s technology,"
start asking why software is ignoring 30 years of accumulated wisdom.'