From owner-svn-src-stable@freebsd.org Wed Aug 16 17:42:40 2017 Return-Path: Delivered-To: svn-src-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 77C6EDC3721; Wed, 16 Aug 2017 17:42:40 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5333B67D1F; Wed, 16 Aug 2017 17:42:40 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v7GHgdjS054783; Wed, 16 Aug 2017 17:42:39 GMT (envelope-from kevans@FreeBSD.org) Received: (from kevans@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v7GHgdB4054781; Wed, 16 Aug 2017 17:42:39 GMT (envelope-from kevans@FreeBSD.org) Message-Id: <201708161742.v7GHgdB4054781@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kevans set sender to kevans@FreeBSD.org using -f From: Kyle Evans Date: Wed, 16 Aug 2017 17:42:39 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r322583 - in stable/11: contrib/netbsd-tests/usr.bin/grep usr.bin/grep X-SVN-Group: stable-11 X-SVN-Commit-Author: kevans X-SVN-Commit-Paths: in stable/11: contrib/netbsd-tests/usr.bin/grep usr.bin/grep X-SVN-Commit-Revision: 322583 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for all the -stable branches of the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Aug 2017 17:42:40 -0000 Author: kevans Date: Wed Aug 16 17:42:39 2017 New Revision: 322583 URL: https://svnweb.freebsd.org/changeset/base/322583 Log: MFC r317665: bsdgrep: fix -w -v matching improperly with certain patterns -w and -v flag matching was mostly functional but had some minor problems: 1. -w flag processing only allowed one iteration through pattern matching on a line. This was problematic if one pattern could match more than once, or if there were multiple patterns and the earliest/ longest match was not the most ideal, and 2. Previous work "fixed" things to not further process a line if the first iteration through patterns produced no matches. This is clearly wrong if we're dealing with the more restrictive -w matching. #2 breakage could have also occurred before recent broad rewrites, but it would be more arbitrary based on input patterns as to whether or not it actually affected things. Fix both of these by forcing a retry of the patterns after advancing just past the start of the first match if we're doing more restrictive -w matching and we didn't get any hits to start with. Also move -v flag processing outside of the loop so that we have a greater change to match in the more restrictive cases. This wasn't strictly wrong, but it could be a little more error prone. While here, introduce some regressions tests for this behavior and fix some excessive wrapping nearby that hindered readability. GNU grep passes these new tests. PR: 218467, 218811 Approved by: emaste (mentor, blanket MFC) Modified: stable/11/contrib/netbsd-tests/usr.bin/grep/t_grep.sh stable/11/usr.bin/grep/util.c Directory Properties: stable/11/ (props changed) Modified: stable/11/contrib/netbsd-tests/usr.bin/grep/t_grep.sh ============================================================================== --- stable/11/contrib/netbsd-tests/usr.bin/grep/t_grep.sh Wed Aug 16 17:38:37 2017 (r322582) +++ stable/11/contrib/netbsd-tests/usr.bin/grep/t_grep.sh Wed Aug 16 17:42:39 2017 (r322583) @@ -93,6 +93,12 @@ word_regexps_body() { atf_check -o file:"$(atf_get_srcdir)/d_word_regexps.out" \ grep -w separated $(atf_get_srcdir)/d_input + + # Begin FreeBSD + printf "xmatch pmatch\n" > test1 + + atf_check -o inline:"pmatch\n" grep -Eow "(match )?pmatch" test1 + # End FreeBSD } atf_test_case begin_end @@ -439,6 +445,23 @@ grep_sanity_body() atf_check -o inline:"M\n" grep -o -e "M\{1\}" test2 } + +atf_test_case wv_combo_break +wv_combo_break_head() +{ + atf_set "descr" "Check for incorrectly matching lines with both -w and -v flags (PR 218467)" +} +wv_combo_break_body() +{ + printf "x xx\n" > test1 + printf "xx x\n" > test2 + + atf_check -o file:test1 grep -w "x" test1 + atf_check -o file:test2 grep -w "x" test2 + + atf_check -s exit:1 grep -v -w "x" test1 + atf_check -s exit:1 grep -v -w "x" test2 +} # End FreeBSD atf_init_test_cases() @@ -467,6 +490,7 @@ atf_init_test_cases() atf_add_test_case escmap atf_add_test_case egrep_empty_invalid atf_add_test_case zerolen + atf_add_test_case wv_combo_break atf_add_test_case fgrep_sanity atf_add_test_case egrep_sanity atf_add_test_case grep_sanity Modified: stable/11/usr.bin/grep/util.c ============================================================================== --- stable/11/usr.bin/grep/util.c Wed Aug 16 17:38:37 2017 (r322582) +++ stable/11/usr.bin/grep/util.c Wed Aug 16 17:42:39 2017 (r322583) @@ -305,6 +305,7 @@ procline(struct str *l, int nottext) unsigned int i; int c = 0, m = 0, r = 0, lastmatches = 0, leflags = eflags; int startm = 0; + int retry; /* Initialize to avoid a false positive warning from GCC. */ lastmatch.rm_so = lastmatch.rm_eo = 0; @@ -313,6 +314,7 @@ procline(struct str *l, int nottext) while (st <= l->len) { lastmatches = 0; startm = m; + retry = 0; if (st > 0) leflags |= REG_NOTBOL; /* Loop to compare with all the patterns */ @@ -356,6 +358,17 @@ procline(struct str *l, int nottext) else if (iswword(wbegin) || iswword(wend)) r = REG_NOMATCH; + /* + * If we're doing whole word matching and we + * matched once, then we should try the pattern + * again after advancing just past the start of + * the earliest match. This allows the pattern + * to match later on in the line and possibly + * still match a whole word. + */ + if (r == REG_NOMATCH && + (retry == 0 || pmatch.rm_so + 1 < retry)) + retry = pmatch.rm_so + 1; } if (r == 0) { lastmatches++; @@ -385,9 +398,14 @@ procline(struct str *l, int nottext) } } - if (vflag) { - c = !c; - break; + /* + * Advance to just past the start of the earliest match, try + * again just in case we still have a chance to match later in + * the string. + */ + if (lastmatches == 0 && retry > 0) { + st = retry; + continue; } /* One pass if we are not recording matches */ @@ -409,6 +427,9 @@ procline(struct str *l, int nottext) st = nst; } + + if (vflag) + c = !c; /* Count the matches if we have a match limit */ if (mflag)