Date: Wed, 19 Jun 2013 15:13:37 GMT
From: Corinna Vinschen <vinschen@redhat.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: misc/179721: char<->wchar_t mismatch in glob(3), fnmatch(3), regexec(3)
Message-ID: <201306191513.r5JFDbXa054868@oldred.freebsd.org>
Resent-Message-ID: <201306191520.r5JFK0Ta039071@freefall.freebsd.org>

>Number:         179721
>Category:       misc
>Synopsis:       char<->wchar_t mismatch in glob(3), fnmatch(3), regexec(3)
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Jun 19 15:20:00 UTC 2013
>Closed-Date:
>Last-Modified:
>Originator:     Corinna Vinschen
>Release:        none
>Organization:   Red Hat
>Environment:    CYGWIN_NT-6.2-WOW64 VMBERT864 1.7.21(0.266/5/3) 2013-06-17 10:34 i686 Cygwin
>Description:
Hi,

There seems to be a mismatch between char and wchar_t in the glob(3)
functionality. I stumbled over this problem because Cygwin uses
FreeBSD's glob, fnmatch, and regcomp code.

All three functions convert their input strings to wide characters and
perform tests and comparisons on the wide-character representation.
All three call the __collate_range_cmp function in some scenario (glob,
for instance, in match() when a range pattern is handled).

However, while all three functions operate on wchar_t characters, the
__collate_range_cmp function in locale/collcmp.c converts the characters
to char and calls strcoll_l on them. The resulting comparison works only
for ASCII characters, not for the full Unicode character range.

An easy solution might be to call wcscoll_l from __collate_range_cmp,
but __collate_range_cmp is also called from other places, namely from
vfscanf, with char input. Therefore the best way out might be to
introduce something along the lines of a __wcollate_range_cmp function,
as outlined below.
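
The truncation described above can be illustrated with a small standalone
sketch. It is not the libc code: narrow_range_cmp and wide_range_cmp are
hypothetical names, the standard strcoll(3) and wcscoll(3) stand in for
strcoll_l and wcscoll_l with a collation table, and the code points are
picked only for illustration. It assumes the default "C" locale and the
usual modulo behaviour when an int is stored in a char.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical stand-in for the narrow comparison: each code point is
 * stored in a char, so anything above the char range is truncated
 * before strcoll() ever sees it.
 */
static int
narrow_range_cmp(int c1, int c2)
{
	char s1[2], s2[2];

	assert(c1 >= 0 && c2 >= 0);
	s1[0] = (char)c1;	/* 0x0141 typically becomes 0x41 here */
	s1[1] = '\0';
	s2[0] = (char)c2;	/* 0x0241 typically becomes 0x41 as well */
	s2[1] = '\0';
	return (strcoll(s1, s2));
}

/*
 * Hypothetical wide counterpart: the code points are kept intact in
 * wchar_t buffers and compared with wcscoll().
 */
static int
wide_range_cmp(int c1, int c2)
{
	wchar_t s1[2], s2[2];

	assert(c1 >= 0 && c2 >= 0);
	s1[0] = (wchar_t)c1;
	s1[1] = L'\0';
	s2[0] = (wchar_t)c2;
	s2[1] = L'\0';
	return (wcscoll(s1, s2));
}
```

Under those assumptions, narrow_range_cmp(0x0141, 0x0241) treats the two
distinct characters as equal because both collapse to the same byte,
while wide_range_cmp(0x0141, 0x0241) still orders them. For plain ASCII
input such as 'a' and 'b' the two functions agree.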
>How-To-Repeat:
>Fix:
See attached patch for a suggestion

Patch attached with submission follows:

Index: lib/libc/gen/fnmatch.c
===================================================================
RCS file: /home/ncvs/src/lib/libc/gen/fnmatch.c,v
retrieving revision 1.21
diff -u -p -r1.21 fnmatch.c
--- lib/libc/gen/fnmatch.c	17 Nov 2012 01:49:24 -0000	1.21
+++ lib/libc/gen/fnmatch.c	19 Jun 2013 15:12:34 -0000
@@ -285,8 +285,8 @@ rangematch(pattern, test, flags, newp, p
 			if (table->__collate_load_error ?
 			    c <= test && test <= c2 :
-			       __collate_range_cmp(table, c, test) <= 0
-			    && __collate_range_cmp(table, test, c2) <= 0
+			       __wcollate_range_cmp(table, c, test) <= 0
+			    && __wcollate_range_cmp(table, test, c2) <= 0
 			   )
 				ok = 1;
 		} else if (c == test)
Index: lib/libc/gen/glob.c
===================================================================
RCS file: /home/ncvs/src/lib/libc/gen/glob.c,v
retrieving revision 1.36
diff -u -p -r1.36 glob.c
--- lib/libc/gen/glob.c	12 Apr 2013 00:41:52 -0000	1.36
+++ lib/libc/gen/glob.c	19 Jun 2013 15:12:34 -0000
@@ -836,8 +836,8 @@ match(Char *name, Char *pat, Char *paten
 				if ((*pat & M_MASK) == M_RNG) {
 					if (table->__collate_load_error ?
 					    CHAR(c) <= CHAR(k) && CHAR(k) <= CHAR(pat[1]) :
-					       __collate_range_cmp(table, CHAR(c), CHAR(k)) <= 0
-					    && __collate_range_cmp(table, CHAR(k), CHAR(pat[1])) <= 0
+					       __wcollate_range_cmp(table, CHAR(c), CHAR(k)) <= 0
+					    && __wcollate_range_cmp(table, CHAR(k), CHAR(pat[1])) <= 0
 					   )
 						ok = 1;
 					pat += 2;
Index: lib/libc/locale/collate.h
===================================================================
RCS file: /home/ncvs/src/lib/libc/locale/collate.h,v
retrieving revision 1.17
diff -u -p -r1.17 collate.h
--- lib/libc/locale/collate.h	17 Nov 2012 01:49:29 -0000	1.17
+++ lib/libc/locale/collate.h	19 Jun 2013 15:12:34 -0000
@@ -73,6 +73,7 @@ u_char *__collate_substitute(struct xloc
 int	__collate_load_tables(const char *);
 void	__collate_lookup(struct xlocale_collate *, const u_char *, int *, int *, int *);
 int	__collate_range_cmp(struct xlocale_collate *, int, int);
+int	__wcollate_range_cmp(struct xlocale_collate *, int, int);
 #ifdef COLLATE_DEBUG
 void	__collate_print_tables(void);
 #endif
Index: lib/libc/locale/collcmp.c
===================================================================
RCS file: /home/ncvs/src/lib/libc/locale/collcmp.c,v
retrieving revision 1.20
diff -u -p -r1.20 collcmp.c
--- lib/libc/locale/collcmp.c	17 Nov 2012 01:49:29 -0000	1.20
+++ lib/libc/locale/collcmp.c	19 Jun 2013 15:12:34 -0000
@@ -50,3 +50,13 @@ int __collate_range_cmp(struct xlocale_c
 	l.components[XLC_COLLATE] = (struct xlocale_component *)table;
 	return (strcoll_l(s1, s2, &l));
 }
+int __wcollate_range_cmp(struct xlocale_collate *table, int c1, int c2)
+{
+	static wchar_t s1[2], s2[2];
+
+	s1[0] = c1;
+	s2[0] = c2;
+	struct _xlocale l = {{0}};
+	l.components[XLC_COLLATE] = (struct xlocale_component *)table;
+	return (wcscoll_l(s1, s2, &l));
+}
Index: lib/libc/regex/regcomp.c
===================================================================
RCS file: /home/ncvs/src/lib/libc/regex/regcomp.c,v
retrieving revision 1.42
diff -u -p -r1.42 regcomp.c
--- lib/libc/regex/regcomp.c	2 Mar 2013 01:08:09 -0000	1.42
+++ lib/libc/regex/regcomp.c	19 Jun 2013 15:12:34 -0000
@@ -789,10 +789,10 @@ p_b_term(struct parse *p, cset *cs)
 				(void)REQUIRE((uch)start <= (uch)finish, REG_ERANGE);
 				CHaddrange(p, cs, start, finish);
 			} else {
-				(void)REQUIRE(__collate_range_cmp(table, start, finish) <= 0, REG_ERANGE);
+				(void)REQUIRE(__wcollate_range_cmp(table, start, finish) <= 0, REG_ERANGE);
 				for (i = 0; i <= UCHAR_MAX; i++) {
-					if (   __collate_range_cmp(table, start, i) <= 0
-					    && __collate_range_cmp(table, i, finish) <= 0
+					if (   __wcollate_range_cmp(table, start, i) <= 0
+					    && __wcollate_range_cmp(table, i, finish) <= 0
 					   )
 						CHadd(p, cs, i);
 				}
>Release-Note:
>Audit-Trail:
>Unformatted: