Date: Fri, 5 Sep 2008 12:27:12 -0400
From: "Mark B." <mkbucc@gmail.com>
To: "Giorgos Keramidas" <keramida@ceid.upatras.gr>
Cc: freebsd-questions@freebsd.org
Subject: Re: How to delete non-ASCII chars in file
Message-ID: <59f4cb420809050927w71fea733mcf7a2071c24cdc93@mail.gmail.com>
In-Reply-To: <87vdxa4p2p.fsf@kobe.laptop>
References: <59f4cb420809050714i16ebe30bmd9f325592f05516e@mail.gmail.com> <87vdxa4p2p.fsf@kobe.laptop>

On Fri, Sep 5, 2008 at 10:58 AM, Giorgos Keramidas <keramida@ceid.upatras.gr> wrote:
> $ echo '^Fhello^F' | sed -e 's/[^[:print:]]*//' | hd
> 00000000  68 65 6c 6c 6f 06 0a  |hello..|
> 00000007
> $

Thanks.

> The matching pattern is wrong. You need `[^[:print:]]'. The character
> class of printable characters is `[:print:]', and you can negate the
> pattern with `[^xxxx]' where `xxxx' is the character class; hence the
> extra pair of brackets in `[^[:print:]]'.

In case you are interested, I've patched the re_format man page with
this example. I had read the page, and it says :print: is the "name of
the character class." I think the concrete example helps clarify things.

A follow-up question: is it possible to use that statement in a
Makefile (BSD)? A straight cut 'n paste didn't work, and I couldn't
figure out the escaping to make it work.

Thanks,
m

cd to /usr/src/lib/libc/regex/ and apply this patch.

--- /dev/null Fri Sep 5 12:12:21 2008
+++ re_format.7 Fri Sep 5 12:18:29 2008
@@ -288,6 +288,10 @@
 A locale may provide others.
 A character class may not be used as an endpoint of a range.
 .Pp
+To match all characters not in a class, use a bracket expression
+like this:
+.Ql [^[:print:]] .
+.Pp
 There are two special cases\(dd of bracket expressions:
 the bracket expressions
 .Ql [[:<:]]
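
For reference, here is a minimal sketch of the negated bracket expression in use. In the quoted hexdump the trailing 06 byte survives because the substitution has no `g' flag, so sed stops after the first match on the line. The sketch assumes the goal is to delete every non-printable character. The function name strip_nonprint and the LC_ALL=C setting are my own assumptions, not something from the thread; the C locale is used so that "printable" means printable ASCII.

```sh
#!/bin/sh
# Delete every character outside the [:print:] class from standard input.
# The trailing "g" flag replaces all matches on a line, not just the first.
# LC_ALL=C is an assumption: it restricts [:print:] to printable ASCII.
strip_nonprint() {
    LC_ALL=C sed -e 's/[^[:print:]]//g'
}

# Sample input: "hello" wrapped in two ^F (0x06) control characters.
printf '\006hello\006\n' | strip_nonprint
```

The newline is left alone because sed treats it as the line terminator rather than as part of the pattern space.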