From owner-freebsd-standards@FreeBSD.ORG Mon Sep 2 15:41:04 2013
Date: Mon, 2 Sep 2013 17:09:33 +0200 (CEST)
From: Damian Weber
To: Andriy Gapon
Cc: FreeBSD Current, freebsd-standards@FreeBSD.org
Subject: Re: bug with special bracket expressions in regular expressions
In-Reply-To: <5224A693.3000904@FreeBSD.org>

On Mon, 2 Sep 2013, Andriy Gapon wrote:

> re_format(7) says:
>     There are two special cases of bracket expressions: the bracket
>     expressions '[[:<:]]' and '[[:>:]]' match the null string at the
>     beginning and end of a word respectively. A word is defined as a
>     sequence of word characters which is neither preceded nor followed
>     by word characters. A word character is an alnum character (as
>     defined by ctype(3)) or an underscore. This is an extension,
>     compatible with but not specified by IEEE Std 1003.2 ("POSIX.2"),
>     and should be used with caution in software intended to be portable
>     to other systems.
>
> However I observe the following:
> $ echo "cd0 cd1 xx" | sed 's/cd[0-9][^ ]* *//g'
> xx
> $ echo "cd0 cd1 xx" | sed 's/[[:<:]]cd[0-9][^ ]* *//g'
> cd1 xx
>
> In my opinion '[[:<:]]' should not affect how the pattern is matched
> in this case.
>
> Any thoughts, suggestions?

there are two simpler expressions, whose difference I don't understand
either (tested on 8.4-PRERELEASE)

$ echo "cd0 cd1 xx" | sed 's/cd[0-9] //g'
xx
$ echo "cd0 cd1 xx" | sed 's/[[:<:]]cd[0-9] //g'
cd1 xx

-- Damian
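
For comparison, here is a minimal sketch of the same two substitutions in
a different regex engine (Python's re module). Python has no '[[:<:]]',
so the sketch assumes that '\b(?=\w)' is a fair stand-in for "the null
string at the beginning of a word"; the names BOW, plain and anchored are
made up for the illustration. It says nothing about why sed behaves as it
does, it only shows the result one would expect if the anchor does not
change how the pattern matches here.

```python
import re

# Assumption: \b(?=\w) approximates [[:<:]] (start of a word), where a
# word character is alphanumeric or underscore, as in re_format(7).
BOW = r"\b(?=\w)"

line = "cd0 cd1 xx"

# The same substitutions as the two sed commands, without and with the
# beginning-of-word anchor.
plain = re.sub(r"cd[0-9] ", "", line)
anchored = re.sub(BOW + r"cd[0-9] ", "", line)

print(plain)
print(anchored)
```

Both "cd0" and "cd1" start a word in the input line, so in this engine
the anchored pattern removes the same text as the unanchored one.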