Date: Tue, 22 Oct 2002 12:15:59 -0500
From: D J Hawkey Jr <hawkeyd@visi.com>
To: questions at FreeBSD <freebsd-questions@freebsd.org>
Subject: OT: regex(3) and POSIX collating sequences
Message-ID: <20021022121559.A86362@sheol.localdomain>
Hi.
This is rather off-topic, but as the trouble I'm having is on a FreeBSD
box, I'm hoping you'll excuse me.
What's up with collating sequences and the regcomp(3) function? From the
re_format(7) man page:
Within a bracket expression, a collating element (a character, a multi-
character sequence that collates as if it were a single character, or a
collating-sequence name for either) enclosed in `[.' and `.]' stands for
the sequence of characters of that collating element. The sequence is a
single element of the bracket expression's list. A bracket expression
containing a multi-character collating element can thus match more than
one character, e.g. if the collating sequence includes a `ch' collating
element, then the RE `[[.ch.]]*c' matches the first five characters of
`chchcc'.
But darned if I can get it to work:
$ echo "ZXCV asdf qwer" |sed -e "s/[^[.ZXCV.][.1234.]]/zxcv/"
sed: 1: "s/[^[.ZXCV.][.1234.]]/zxcv/
": RE error: invalid collating element
Foolishness, yes, but it illustrates my problem nicely. I've got a program
that uses REs, and it reports this error when I try to use a "[[.phrase.]]"
bracket syntax. Relevant code example:
#include <sys/types.h>
#include <stdio.h>
#include <regex.h>

#define REGCOMP_FLAGS (REG_EXTENDED | REG_NOSUB)

int main(void)
{
    regex_t re;
    int result;
    const char *phrase = "[^[.ZXCV.][.1234.]]";
    char buffer[256];

    if ((result = regcomp(&re, phrase, REGCOMP_FLAGS)) != 0)
    {
        /* POSIX leaves re undefined after a failed regcomp(),
           so there is nothing to regfree() on this path. */
        regerror(result, &re, buffer, sizeof(buffer));
        fprintf(stderr, "regcomp(\"%s\") error: %s\n", phrase, buffer);
        return 1;
    }
    regfree(&re);
    return 0;
}
This works for everything I've thrown at it except for a "[[.whatever.]]"
bracket expression. regcomp(3) refuses to compile it. The REG_NOSUB is
intentional; I only need to know that a match occurs with regexec(3).
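A note on the likely cause, offered as a sketch rather than a definitive diagnosis: `[.` and `.]` name a collating element that the current locale defines; they do not turn an arbitrary string into one. The default C/POSIX locale defines no multi-character collating elements, so `ZXCV` is rejected. The helper below (`try_compile` is a hypothetical name, not part of the program above) just reports what regcomp(3) says about a given pattern, which makes it easy to compare a `[[.phrase.]]` pattern against an ordinary one.

```c
#include <stdio.h>
#include <regex.h>

/* Hypothetical helper: try to compile `pattern` as an extended RE.
 * Returns regcomp()'s result code; on failure the regerror() text is
 * written to msg, on success msg is set to the empty string. */
static int try_compile(const char *pattern, char *msg, size_t msglen)
{
    regex_t re;
    int rc = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);

    if (rc != 0) {
        /* Compilation failed: fetch the message, do not regfree(). */
        regerror(rc, &re, msg, msglen);
        return rc;
    }
    if (msglen > 0)
        msg[0] = '\0';
    regfree(&re);
    return 0;
}
```

Calling `try_compile("[[.ZXCV.]]", msg, sizeof msg)` in the default locale should return a nonzero code, while a plain alternation such as `"ZXCV|1234"` compiles cleanly.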
What the devil have I missed? Or, what is an acceptable RE that matches
"anything except 'ZXCV' or '1234'"?
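One possible workaround, assuming the goal is to accept input that contains neither phrase: POSIX REs have no operator for negating a whole string, but since only a yes/no answer from regexec(3) is needed, the positive pattern `ZXCV|1234` can be compiled and the result inverted. The function name `lacks_phrases` is made up for this sketch.

```c
#include <stddef.h>
#include <regex.h>

/* Hypothetical helper: returns 1 if `text` contains neither "ZXCV" nor
 * "1234", 0 if it contains either one (or regexec() reports some other
 * error), and -1 if the pattern itself fails to compile. */
static int lacks_phrases(const char *text)
{
    regex_t re;
    int rc;

    if (regcomp(&re, "ZXCV|1234", REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    /* REG_NOSUB was given, so no match array is requested. */
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);

    /* Invert the sense: "no match" means neither phrase is present. */
    return rc == REG_NOMATCH;
}
```

The compile step is repeated on every call here to keep the sketch short; a real program would presumably compile once and reuse the `regex_t`.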
Please CC: me, I'm not subscribed. Thanks,
Dave
--
______________________ ______________________
\__________________ \ D. J. HAWKEY JR. / __________________/
\________________/\ hawkeyd@visi.com /\________________/
http://www.visi.com/~hawkeyd/
