Date: Tue, 16 Jan 2007 20:15:34 -0600
From: "Parker Anderson" <baka.rob@gmail.com>
To: "applecom@inbox.ru" <applecom@inbox.ru>
Cc: freebsd-questions@freebsd.org
Subject: Re: regexp [. .]
Message-ID: <ff4d9d1d0701161815u28938c6ds7614c525dd090eb3@mail.gmail.com>
In-Reply-To: <op.tl9w6ajvhbloih@xml.opera.com>
References: <op.tl9w6ajvhbloih@xml.opera.com>
On 1/16/07, applecom@inbox.ru <applecom@inbox.ru> wrote:
> I need to use regular expressions with a sequence of characters as a
> collating element.
> From re_format(7):
> "Within a bracket expression, a collating element (a character, a
> multi-character sequence that collates as if it were a single character,
> or a collating-sequence name for either) enclosed in `[.' and `.]' stands
> for the sequence of characters of that collating element. The sequence is
> a single element of the bracket expression's list. A bracket expression
> containing a multi-character collating element can thus match more than
> one character, e.g. if the collating sequence includes a `ch' collating
> element, then the RE `[[.ch.]]*c' matches the first five characters of
> `chchcc'."
> But grep (and other programs using regexp) writes on "echo somepattern |
> grep -Ee 'some[^[.pattern.]]'":
> "Invalid collation character". What's wrong?

After some searching around I found the following post (albeit for a
different project):

http://permalink.gmane.org/gmane.comp.gnu.utils.bugs/11462

An excerpt:

"
I have to admit to having no experience with collating characters.
That said, I'll convey my understanding of them. You cannot use [. and .]
to group an arbitrary pair of characters together. Collating characters
are defined by the locale in which you're running, and only those defined
by the locale are available for use inside [. and .]. They usually have
names, defined by the locale; the name may or may not be the actual
sequence of characters, such [as] '[.ch.].'
"

I'm not sure myself, but I hope that helps and isn't far from the truth ;)
If anyone knows otherwise, please let me know.

Is there a particular pattern you are trying to match? From the looks of
it, [ch]* would match a similar set of characters, but it is not as strict
about the order in which they appear.

Sincerely,
-Parker
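
To illustrate the difference in strictness mentioned above, here is a
minimal sketch. It is not from the original thread; it assumes a grep that
supports -E and -x, and the sample strings are made up for illustration.
Plain ERE grouping with ( ) repeats "ch" as a unit without needing a
locale-defined collating element, while the bracket expression [ch]
accepts the two characters in any order.

```shell
#!/bin/sh
# Two hypothetical sample lines: one built from "ch" pairs, one scrambled.
samples=$(printf '%s\n' chchc cchhc)

# (ch)* repeats the two-character sequence "ch" as a unit, then one "c".
# -x requires the whole line to match.
strict=$(printf '%s\n' "$samples" | grep -Ex '(ch)*c')

# [ch]* accepts any mix of "c" and "h" in any order, then one "c".
loose=$(printf '%s\n' "$samples" | grep -Ex '[ch]*c')

printf 'strict:\n%s\n' "$strict"
printf 'loose:\n%s\n' "$loose"
```

The grouped form accepts only chchc, while the bracket form accepts both
lines, which is the sense in which [ch]* is looser.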