From: "James B. Byrne"
To: mfv@bway.net
Cc: freebsd-questions@freebsd.org
Reply-To: byrnejb@harte-lyne.ca
Date: Fri, 10 Nov 2017 08:59:46 -0500
Subject: Re: Regex character and collation class documentation
Message-ID: <68be33ca89aab31e068253dffe129021.squirrel@webmail.harte-lyne.ca>

On Thu, November 9, 2017 16:36, mfv wrote:
>> On Wed, 2017-11-08 at 12:47 "James B. Byrne via freebsd-questions"
>>However I see no reference to [.NUL.] anywhere.  The sed man page has
>>no reference to nul or NUL at all and tr only has this to say:
>>
>>   The tr utility has historically not permitted the manipulation
>>   of NUL bytes in its input and, additionally, stripped NUL's from
>>   its input stream.  This implementation has removed this behavior
>>   as a bug.
>>
>>
>>Is there a master list of character/collation classes for FreeBSD
>>regex?  I have read the man pages for grep and re_format.  In no case
>>is the character or collation class NUL mentioned.
>>
>>Where is the usage of [.NUL.] documented?
>>
>
> Hello James,
>
> This may help you with a bit of hacking.
>
> I asked myself the same question but could not find a satisfactory
> answer.  After remembering that "man ascii" has names for all
> non-printable ASCII characters, I placed some of these characters in a
> text file and then removed the same characters using their name.
>
> Thus:
>   - the character ^@ was removed using [[.NUL.]]
>   - the character ^G was removed using [[.BEL.]]
>   - the character ^F was removed using [[.ACK.]]
>   - etc,
>
> I did not try all non-printable characters but a large sampling
> followed this pattern.  Trying to use SP for a space produced the
> following error:
>
>   sed: 1: "/[[.SP.]]/d": RE error: invalid collating element
>
> Perhaps there are other exceptions similar to SP.
>
> This syntax also recognises printable characters as well.  For example
> the character 'A' was removed using 's/[[.A.]]//g'.
>
> I would have preferred some formal documentation on this matter but
> like yourself am still searching.
>
> Cheers ...
>
> Marek
>

Thank you.  I discovered that a [..] collation reference pertains to
the active LOCALE setting as defined by LC_ALL.  At least so I find in
the documentation I have read.  But I would not have thought to look
in man ascii for the answer to my question.

--
***          e-Mail is NOT a SECURE channel          ***
        Do NOT transmit sensitive data via e-Mail
 Do NOT open attachments nor follow links sent by e-Mail

James B. Byrne                mailto:ByrneJB@Harte-Lyne.ca
Harte & Lyne Limited          http://www.harte-lyne.ca
9 Brockley Drive              vox: +1 905 561 1241
Hamilton, Ontario             fax: +1 905 561 0757
Canada  L8E 3C3
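
P.S. For anyone repeating Marek's experiment, here is a minimal sketch of
a portable cross-check.  It deletes the same three control characters by
their octal codes from ascii(7) rather than by collating-symbol name, so
the result can be compared against what sed produces.  The function name
strip_ctrl is made up for illustration, and the sed form in the closing
comment is only what was reported in this thread for FreeBSD; other regex
implementations may well reject those names.

```shell
#!/bin/sh
# strip_ctrl (hypothetical helper): delete NUL (^@, octal 000),
# ACK (^F, octal 006) and BEL (^G, octal 007) from standard input.
# The octal values are the ones listed in ascii(7).
strip_ctrl() {
    tr -d '\000\006\007'
}

# Sample input with the three control bytes embedded between letters;
# od -c makes any surviving non-printable byte visible.
printf 'a\000b\006c\007d\n' | strip_ctrl | od -c

# FreeBSD sed form as reported in this thread (assumption: not verified
# here, and "file" is a placeholder name):
#   sed 's/[[.NUL.]]//g; s/[[.BEL.]]//g; s/[[.ACK.]]//g' file
```

If the sed output, piped through od -c, matches the tr output for the
same input, then the names were resolved to the bytes that ascii(7)
lists.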