Date: Tue, 27 May 2008 08:29:03 +0200
From: Karel Miklav <karel@inetis.com>
To: Oliver Fromme <olli@lurza.secnetix.de>
Cc: delphij@freebsd.org, chinsan <chinsan.tw@gmail.com>, freebsd-questions@FreeBSD.ORG
Subject: Re: Sed, shell and hexadecimal character codes
Message-ID: <483BAA2F.30009@inetis.com>
In-Reply-To: <200805231523.m4NFNOwO024115@lurza.secnetix.de>
References: <200805231523.m4NFNOwO024115@lurza.secnetix.de>

Oliver Fromme wrote:
> Karel Miklav wrote:
> > There's a tip in the FreeBSD fortunes database that says:
> >
> > > Want to strip UTF-8 BOM (Byte Order Mark) from given files?
> > >
> > > sed -e '1s/^\xef\xbb\xbf//' < bomfile > newfile
>
> FreeBSD's sed(1) doesn't support hexadecimal or octal
> sequences. I think even GNU sed doesn't support it, but
> you might try it yourself (/usr/ports/textproc/gsed).
>
> I don't know why that fortunes entry exists. It's wrong.

That's what I thought. Maybe we should replace the recipe with the
awk version Oliver proposed below?

> > I can't make it work, and I can't find any other method to
> > work with hex codes in scripts or on the command line, so
> > I'm kind of depressed :) I help myself with xxd now, but if
> > it is possible to avoid it, I'd like to hear about it.
>
> There is no standard for handling octal and hexadecimal
> sequences, unfortunately, so you have to consult the
> manual page to find out. For example, tr(1) supports
> octal sequences only (no hexadecimal), while awk(1)
> supports both. So the above line could be rewritten
> with awk:
>
> awk '{if(NR==1)sub(/^\xef\xbb\xbf/, "");print}' < bomfile > newfile
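
If sed is still preferred, one way around the missing escape support is to let the shell build the three BOM bytes and hand them to sed literally. The following is only a minimal sketch, not the fortunes recipe: it assumes a printf(1) that understands octal escapes (POSIX requires this) and a sed that matches raw bytes under LC_ALL=C. The function name strip_bom is made up for illustration.

```shell
#!/bin/sh
# strip_bom: hypothetical helper name (not from the thread or any tool).
# Reads text on stdin and writes it to stdout, removing a UTF-8 BOM
# (bytes EF BB BF) if it appears at the very start of the first line.
strip_bom() {
    # printf(1) interprets octal escapes: \357\273\277 is EF BB BF.
    bom=$(printf '\357\273\277')
    # LC_ALL=C asks sed to treat its input as plain bytes; the BOM is
    # passed as literal bytes, so sed needs no escape support of its own.
    LC_ALL=C sed -e "1s/^${bom}//"
}
```

It would be used the same way as the original recipe: strip_bom < bomfile > newfile. Lines after the first are left alone, and input without a BOM passes through unchanged.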