Date: Fri, 10 Nov 2017 08:59:46 -0500
From: "James B. Byrne" <byrnejb@harte-lyne.ca>
To: mfv@bway.net
Cc: freebsd-questions@freebsd.org
Subject: Re: Regex character and collation class documentation
Message-ID: <68be33ca89aab31e068253dffe129021.squirrel@webmail.harte-lyne.ca>
In-Reply-To: <mailman.90.1510315202.51235.freebsd-questions@freebsd.org>
References: <mailman.90.1510315202.51235.freebsd-questions@freebsd.org>

On Thu, November 9, 2017 16:36, mfv wrote:
>> On Wed, 2017-11-08 at 12:47 "James B. Byrne via freebsd-questions" wrote:
>>However, I see no reference to [.NUL.] anywhere.  The sed man page has
>>no reference to nul or NUL at all and tr only has this to say:
>>
>>     The tr utility has historically not permitted the manipulation
>>     of NUL bytes in its input and, additionally, stripped NUL's from
>>     its input stream.  This implementation has removed this behavior
>>     as a bug.
>>
>>
>>Is there a master list of character/collation classes for FreeBSD
>>regex?  I have read the man pages for grep and re_format.  In no case
>>is the character or collation class NUL mentioned.
>>
>>Where is the usage of [.NUL.] documented?
>>
>
> Hello James,
>
> This may help you with a bit of hacking.
>
> I asked myself the same question but could not find a satisfactory
> answer.  After remembering that "man ascii" has names for all
> non-printable ASCII characters, I placed some of these characters in a
> text file and then removed the same characters using their name.
>
> Thus:
>  - the character ^@ was removed using [[.NUL.]]
>  - the character ^G was removed using [[.BEL.]]
>  - the character ^F was removed using [[.ACK.]]
>  - etc.
>
> I did not try all non-printable characters but a large sampling
> followed this pattern.  Trying to use SP for a space produced the
> following error:
>
> sed: 1: "/[[.SP.]]/d": RE error: invalid collating element
>
> Perhaps there are other exceptions similar to SP.
>
> This syntax recognises printable characters as well.  For example,
> the character 'A' was removed using 's/[[.A.]]//g'.
>
> I would have preferred some formal documentation on this matter but
> like yourself am still searching.
>
> Cheers ...
>
> Marek
>
>
Thank you.  I discovered that a [.<symbol>.] collation reference is
resolved against the active locale, as set by LC_ALL.  At least, that
is what the documentation I have read says.  I would not have thought
to look in man ascii for the answer to my question.
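
For comparison, here is a minimal sketch of the same kind of clean-up
done with tr and octal escapes, which does not depend on collating
element names.  The function name strip_ctrl and the sample input are
made up for illustration.  The sed form in the comment is only inferred
from the 's/[[.A.]]//g' example quoted above and is an assumption about
FreeBSD's sed; other sed implementations may reject it.

```shell
#!/bin/sh
# strip_ctrl (hypothetical name): delete NUL (\000), ACK (\006) and
# BEL (\007) bytes from standard input using tr with octal escapes.
strip_ctrl() {
    tr -d '\000\006\007'
}

# Sample input with one NUL, one ACK and one BEL between the letters.
printf 'a\000b\006c\007d\n' | strip_ctrl

# Assumed FreeBSD sed equivalent for NUL alone, following the pattern
# reported in the quoted message (not verified here):
#   sed 's/[[.NUL.]]//g'
```

The octal values are the ones listed in man ascii for NUL, ACK and BEL,
so the same table serves both approaches.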
-- 
***          e-Mail is NOT a SECURE channel          ***
        Do NOT transmit sensitive data via e-Mail
 Do NOT open attachments nor follow links sent by e-Mail
James B. Byrne                mailto:ByrneJB@Harte-Lyne.ca
Harte & Lyne Limited          http://www.harte-lyne.ca
9 Brockley Drive              vox: +1 905 561 1241
Hamilton, Ontario             fax: +1 905 561 0757
Canada  L8E 3C3
