From: Andriy Gapon
Date: Mon, 02 Sep 2013 17:54:11 +0300
To: FreeBSD Current, freebsd-standards@FreeBSD.org
Subject: bug with special bracket expressions in regular expressions
Message-ID: <5224A693.3000904@FreeBSD.org>

re_format(7) says:

    There are two special cases‡ of bracket expressions: the bracket
    expressions ‘[[:<:]]’ and ‘[[:>:]]’ match the null string at the
    beginning and end of a word respectively.  A word is defined as a
    sequence of word characters which is neither preceded nor followed by
    word characters.  A word character is an alnum character (as defined
    by ctype(3)) or an underscore.
    This is an extension, compatible with but not specified by IEEE Std
    1003.2 (“POSIX.2”), and should be used with caution in software
    intended to be portable to other systems.

However, I observe the following:

    $ echo "cd0 cd1 xx" | sed 's/cd[0-9][^ ]* *//g'
    xx
    $ echo "cd0 cd1 xx" | sed 's/[[:<:]]cd[0-9][^ ]* *//g'
    cd1 xx

In my opinion, '[[:<:]]' should not affect how the pattern is matched in this
case: both "cd0" and "cd1" start at the beginning of a word, so both should be
removed.

Any thoughts, suggestions?
Thank you!
-- 
Andriy Gapon
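
For comparison, the behaviour expected from the re_format(7) text can be sketched outside sed. The snippet below is only an illustration, not FreeBSD's regex code: it uses Python's re module, and the lookaround pair held in WORD_START is an assumed stand-in for '[[:<:]]' (an empty match where a word character follows and no word character precedes). The names WORD_START and strip_devices are hypothetical.

```python
import re

# Assumed emulation of '[[:<:]]': an empty match at a position where the
# next character is a word character and the previous one (if any) is not.
# With re.ASCII, \w is [A-Za-z0-9_], matching the quoted definition of a
# word character (alnum or underscore).
WORD_START = r"(?<!\w)(?=\w)"


def strip_devices(text, anchored):
    """Remove every 'cdN...' token and the spaces that follow it.

    When anchored is True, a token is removed only if it starts a word.
    """
    prefix = WORD_START if anchored else ""
    return re.sub(prefix + r"cd[0-9][^ ]* *", "", text, flags=re.ASCII)


if __name__ == "__main__":
    # Both tokens start a word, so the anchor should change nothing here.
    print(strip_devices("cd0 cd1 xx", anchored=False))
    print(strip_devices("cd0 cd1 xx", anchored=True))
    # The anchor only matters when 'cd' appears in the middle of a word.
    print(strip_devices("xcd0 cd1", anchored=True))
```

Under this emulation both calls on "cd0 cd1 xx" give the same result, which is the behaviour the report argues sed should also show; the anchor changes the outcome only for input such as "xcd0 cd1", where the first "cd" is not at the start of a word.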