From owner-svn-src-head@freebsd.org Tue May 2 20:44:07 2017
Delivered-To: svn-src-head@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7DA1AD5BAD5;
 Tue, 2 May 2017 20:44:07 +0000 (UTC)
 (envelope-from emaste@FreeBSD.org)
Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 3988017B4;
 Tue, 2 May 2017 20:44:07 +0000 (UTC)
 (envelope-from emaste@FreeBSD.org)
Received: from repo.freebsd.org ([127.0.1.37])
 by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v42Ki6Ws057847;
 Tue, 2 May 2017 20:44:06 GMT
 (envelope-from emaste@FreeBSD.org)
Received: (from emaste@localhost)
 by repo.freebsd.org (8.15.2/8.15.2/Submit) id v42Ki6TY057846;
 Tue, 2 May 2017 20:44:06 GMT
 (envelope-from emaste@FreeBSD.org)
Message-Id: <201705022044.v42Ki6TY057846@repo.freebsd.org>
X-Authentication-Warning: repo.freebsd.org: emaste set sender to emaste@FreeBSD.org using -f
From: Ed Maste
Date: Tue, 2 May 2017 20:44:06 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r317704 - head/usr.bin/grep/regex
X-SVN-Group: head
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: svn-src-head@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: SVN commit messages for the src tree for head/-current
X-List-Received-Date: Tue, 02 May 2017 20:44:07 -0000

Author: emaste
Date: Tue May 2 20:44:06 2017
New Revision: 317704
URL: https://svnweb.freebsd.org/changeset/base/317704

Log:
  bsdgrep: fix escape map building for multibyte strings

  In BSD grep, fix escape map building in the regex parser.
  It was previously using memory not explicitly initialized, and the MBS
  escape map was being built based on a version of the pattern with
  escapes already parsed out.

  This is Kyle's change, but I restored the broken style that already
  exists in this file.

  Submitted by: Kyle Evans
  Reviewed by: cem, Kyle Evans (my style changes)
  Differential Revision: https://reviews.freebsd.org/D10098

Modified:
  head/usr.bin/grep/regex/tre-fastmatch.c

Modified: head/usr.bin/grep/regex/tre-fastmatch.c
==============================================================================
--- head/usr.bin/grep/regex/tre-fastmatch.c Tue May 2 20:39:33 2017 (r317703)
+++ head/usr.bin/grep/regex/tre-fastmatch.c Tue May 2 20:44:06 2017 (r317704)
@@ -98,6 +98,18 @@ static int fastcmp(const fastmatch_t *fg
     fg->pattern[siz] = '\0'; \
   } \
 
+#define CONV_MBS_PAT(src, dest, destsz) \
+  { \
+    destsz = wcstombs(NULL, src, 0); \
+    if (destsz == (size_t)-1) \
+      return REG_BADPAT; \
+    dest = malloc(destsz + 1); \
+    if (dest == NULL) \
+      return REG_ESPACE; \
+    wcstombs(dest, src, destsz); \
+    dest[destsz] = '\0'; \
+  } \
+
 #define IS_OUT_OF_BOUNDS \
   ((!fg->reversed \
     ? ((type == STR_WIDE) ? ((j + fg->wlen) > len) \
@@ -723,15 +735,29 @@ badpat:
         }
 
       escaped = false;
-      for (unsigned int i = 0; i < fg->len; i++)
-        if (fg->pattern[i] == '\\')
-          escaped = !escaped;
-        else if (fg->pattern[i] == '.' && fg->escmap && escaped)
+      char *_checkpat = NULL;
+      size_t _checklen = 0;
+      unsigned int escofs = 0;
+      /*
+       * Make a copy here of the original pattern, because fg->pattern has
+       * already been stripped of all escape sequences in the above processing.
+       * This is necessary if we wish to later treat fg->escmap as an actual,
+       * functional replacement of fg->wescmap.
+       */
+      CONV_MBS_PAT(pat, _checkpat, _checklen);
+      for (unsigned int i = 0; i < n; i++)
+        if (_checkpat[i] == '\\')
+          {
+            escaped = !escaped;
+            if (escaped)
+              ++escofs;
+          }
+        else if (_checkpat[i] == '.' && fg->escmap != NULL && escaped)
           {
-            fg->escmap[i] = true;
+            fg->escmap[i - escofs] = true;
             escaped = false;
           }
-        else if (fg->pattern[i] == '.' && !escaped)
+        else if (_checkpat[i] == '.' && !escaped)
           {
             hasdot = i;
             if (firstdot == -1)
@@ -739,6 +765,7 @@ badpat:
           }
         else
           escaped = false;
+      free(_checkpat);
     }
 #else
   SAVE_PATTERN(tmp, pos, fg->pattern, fg->len);
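
For readers following the index arithmetic in the second hunk, here is a
minimal standalone sketch of the same idea. It is not code from the tree:
build_escmap() and its maplen parameter are hypothetical names used only
for illustration, and it works on a plain char pattern instead of the
wide-character pattern the real code converts with CONV_MBS_PAT. It scans
the original pattern, counts each consumed escape backslash in escofs, and
marks the escaped dot at its position in the stripped pattern.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Illustrative only: build_escmap() is a hypothetical helper, not part of
 * tre-fastmatch.c.  "pat" is the original pattern with its backslashes
 * still in place; "escmap" is indexed by position in the pattern after
 * the escape backslashes have been stripped.
 */
void
build_escmap(const char *pat, bool *escmap, size_t maplen)
{
	bool escaped = false;
	size_t escofs = 0;	/* escape backslashes consumed so far */
	size_t n = strlen(pat);

	for (size_t i = 0; i < n; i++) {
		if (pat[i] == '\\') {
			escaped = !escaped;
			if (escaped)
				++escofs;
		} else if (pat[i] == '.' && escaped) {
			/* Shift the index back by the stripped backslashes. */
			if (i - escofs < maplen)
				escmap[i - escofs] = true;
			escaped = false;
		} else
			escaped = false;
	}
}
```

For the pattern a\.b.c the stripped form is a.b.c, so only escmap[1] is
set: the dot at stripped index 3 was never escaped and stays a wildcard.
Indexing the map with the raw position i, as the old loop did, would
drift by one slot for every escape backslash that precedes the dot.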