Date: Thu, 20 Mar 2025 04:34:19 GMT
Message-Id: <202503200434.52K4YJGA037139@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Kyle Evans
Subject: git: 4c9ffb13dd74 - main - grep: avoid duplicated lines when we're coloring output
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
X-Git-Committer: kevans
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 4c9ffb13dd74159bd3ed7e1c4c706dbd15a70df2
Auto-Submitted: auto-generated

The branch main has been updated by kevans:

URL: https://cgit.FreeBSD.org/src/commit/?id=4c9ffb13dd74159bd3ed7e1c4c706dbd15a70df2

commit 4c9ffb13dd74159bd3ed7e1c4c706dbd15a70df2
Author:     Kyle Evans
AuthorDate: 2025-03-20 04:34:13 +0000
Commit:     Kyle Evans
CommitDate: 2025-03-20 04:34:13 +0000

    grep: avoid duplicated lines when we're coloring output

    For the default uncolored output, we'll just output a line once and then
    move on. For colored output, we'll output multiple matches per line
    with context from the line interspersed and may end up writing out some
    match context multiple times as we don't persist which part of the
    lines have already been printed.

    Fix it by tracking the length of line printed thus far in printline()
    and retaining it across successive calls to printline() in the same
    line. printline() should indicate whether it terminated the line or not
    to avoid tracking the logic for that in multiple places: -o lines are
    always terminated, so it's generally only some --color contexts where
    we wouldn't have terminated.

    Add a test to make sure that we're only printing one line going
    forward.

    Reported and tested by: Jamie Landeg-Jones
    Reviewed by:            emaste
    Differential Revision:  https://reviews.freebsd.org/D49324
---
 usr.bin/grep/tests/grep_freebsd_test.sh | 15 +++++++
 usr.bin/grep/util.c                     | 72 +++++++++++++++++++++++++++------
 2 files changed, 74 insertions(+), 13 deletions(-)

diff --git a/usr.bin/grep/tests/grep_freebsd_test.sh b/usr.bin/grep/tests/grep_freebsd_test.sh
index 77017529843e..906b70645151 100755
--- a/usr.bin/grep/tests/grep_freebsd_test.sh
+++ b/usr.bin/grep/tests/grep_freebsd_test.sh
@@ -103,10 +103,25 @@ zflag_body()
 	atf_check grep -qz "foo.*bar" in
 }
 
+atf_test_case color_dupe
+color_dupe_body()
+{
+
+	# This assumes a MAX_MATCHES of exactly 32. Previously buggy procline()
+	# calls would terminate the line premature every MAX_MATCHES matches,
+	# meaning we'd see the line be output again for the next MAX_MATCHES
+	# number of matches.
+	jot -nb 'A' -s '' 33 > in
+
+	atf_check -o save:color.out grep --color=always . in
+	atf_check -o match:"^ +1 color.out" wc -l color.out
+}
+
 atf_init_test_cases()
 {
 	atf_add_test_case grep_r_implied
 	atf_add_test_case rgrep
 	atf_add_test_case gnuext
 	atf_add_test_case zflag
+	atf_add_test_case color_dupe
 }
diff --git a/usr.bin/grep/util.c b/usr.bin/grep/util.c
index 4e1c44b442f2..ed87e56956f6 100644
--- a/usr.bin/grep/util.c
+++ b/usr.bin/grep/util.c
@@ -72,7 +72,7 @@ static int litexec(const struct pat *pat, const char *string,
     size_t nmatch, regmatch_t pmatch[]);
 #endif
 static bool procline(struct parsec *pc);
-static void printline(struct parsec *pc, int sep);
+static bool printline(struct parsec *pc, int sep, size_t *last_out);
 static void printline_metadata(struct str *line, int sep);
 
 bool
@@ -214,15 +214,29 @@ procmatch_match(struct mprintc *mc, struct parsec *pc)
 
 	/* Print the matching line, but only if not quiet/binary */
 	if (mc->printmatch) {
-		printline(pc, ':');
+		size_t last_out;
+		bool terminated;
+
+		last_out = 0;
+		terminated = printline(pc, ':', &last_out);
 		while (pc->matchidx >= MAX_MATCHES) {
 			/* Reset matchidx and try again */
 			pc->matchidx = 0;
 			if (procline(pc) == !vflag)
-				printline(pc, ':');
+				terminated = printline(pc, ':', &last_out);
 			else
 				break;
 		}
+
+		/*
+		 * The above loop processes the entire line as long as we keep
+		 * hitting the maximum match count. At this point, we know
+		 * that there's nothing left to be printed and can terminate the
+		 * line.
+		 */
+		if (!terminated)
+			printline(pc, ':', &last_out);
+
 		first_match = false;
 		mc->same_file = true;
 		mc->last_outed = 0;
@@ -748,26 +762,39 @@ printline_metadata(struct str *line, int sep)
 }
 
 /*
- * Prints a matching line according to the command line options.
+ * Prints a matching line according to the command line options. We need
+ * *last_out to be populated on entry in case this is just a continuation of
+ * matches within the same line.
+ *
+ * Returns true if the line was terminated, false if it was not.
  */
-static void
-printline(struct parsec *pc, int sep)
+static bool
+printline(struct parsec *pc, int sep, size_t *last_out)
 {
-	size_t a = 0;
+	size_t a = *last_out;
 	size_t i, matchidx;
 	regmatch_t match;
+	bool terminated;
+
+	/*
+	 * Nearly all paths below will terminate the line by default, but it is
+	 * avoided in some circumstances in case we don't have the full context
+	 * available here.
+	 */
+	terminated = true;
 
 	/* If matchall, everything matches but don't actually print for -o */
 	if (oflag && matchall)
-		return;
+		return (terminated);
 
 	matchidx = pc->matchidx;
 
 	/* --color and -o */
-	if ((oflag || color) && matchidx > 0) {
+	if ((oflag || color) && (pc->printed > 0 || matchidx > 0)) {
 		/* Only print metadata once per line if --color */
-		if (!oflag && pc->printed == 0)
+		if (!oflag && pc->printed == 0) {
 			printline_metadata(&pc->ln, sep);
+		}
 		for (i = 0; i < matchidx; i++) {
 			match = pc->matches[i];
 			/* Don't output zero length matches */
@@ -780,9 +807,10 @@ printline(struct parsec *pc, int sep)
 			if (oflag) {
 				pc->ln.boff = match.rm_so;
 				printline_metadata(&pc->ln, sep);
-			} else
+			} else {
 				fwrite(pc->ln.dat + a, match.rm_so - a,
 				    1, stdout);
+			}
 			if (color)
 				fprintf(stdout, "\33[%sm\33[K", color);
 			fwrite(pc->ln.dat + match.rm_so,
@@ -793,13 +821,31 @@ printline(struct parsec *pc, int sep)
 			if (oflag)
 				putchar('\n');
 		}
-		if (!oflag) {
-			if (pc->ln.len - a > 0)
+
+		/*
+		 * Don't terminate if we reached the match limit; we may have
+		 * other matches on this line to process.
+		 */
+		*last_out = a;
+		if (!oflag && matchidx != MAX_MATCHES) {
+			if (pc->ln.len - a > 0) {
 				fwrite(pc->ln.dat + a, pc->ln.len - a, 1,
 				    stdout);
+				*last_out = pc->ln.len;
+			}
 			putchar('\n');
+		} else if (!oflag) {
+			/*
+			 * -o is terminated on every match output, so this
+			 * branch is only designed to capture MAX_MATCHES in a
+			 * line which may be a signal to us for a lack of
+			 * context. The caller will know more and call us again
+			 * to terminate if it needs to.
+			 */
+			terminated = false;
 		}
 	} else
 		grep_printline(&pc->ln, sep);
 	pc->printed++;
+	return (terminated);
 }
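
For readers who want the idea without the surrounding grep machinery, here is a minimal standalone sketch of the technique the commit message describes: carry the "printed so far" offset across successive batch calls and let the callee report whether it terminated the line. It is not the code from util.c. Every name in it (BATCH_MAX, struct span, struct outbuf, out_write, emit_batch, render_line) is hypothetical, BATCH_MAX stands in for MAX_MATCHES and is kept tiny, square brackets stand in for the color escapes, and output goes to a small buffer rather than stdout so the result is easy to inspect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for MAX_MATCHES; tiny so batching is easy to see. */
#define BATCH_MAX 2

/* Hypothetical match span: [so, eo) byte offsets into the line. */
struct span {
	size_t so;
	size_t eo;
};

/* Hypothetical fixed-size output buffer used instead of stdout. */
struct outbuf {
	char data[256];
	size_t len;
};

/* Append n bytes to the buffer and keep it NUL-terminated. */
static void
out_write(struct outbuf *ob, const char *p, size_t n)
{
	assert(ob->len + n < sizeof(ob->data));
	memcpy(ob->data + ob->len, p, n);
	ob->len += n;
	ob->data[ob->len] = '\0';
}

/*
 * Emit one batch of matches.  *last_out is how much of the line earlier
 * calls already emitted; it is updated on return so the next call resumes
 * there instead of repeating context.  Returns true if the line was
 * terminated, false if the batch was full and more matches may follow.
 */
static bool
emit_batch(struct outbuf *ob, const char *line, size_t len,
    const struct span *m, size_t nm, size_t *last_out)
{
	size_t a = *last_out;
	size_t i;

	for (i = 0; i < nm; i++) {
		out_write(ob, line + a, m[i].so - a);	/* context */
		out_write(ob, "[", 1);			/* "color on" */
		out_write(ob, line + m[i].so, m[i].eo - m[i].so);
		out_write(ob, "]", 1);			/* "color off" */
		a = m[i].eo;
	}
	*last_out = a;
	if (nm == BATCH_MAX)
		return (false);
	out_write(ob, line + a, len - a);		/* trailing context */
	*last_out = len;
	out_write(ob, "\n", 1);
	return (true);
}

/* Feed all matches in batches, then terminate the line if nobody did. */
static void
render_line(struct outbuf *ob, const char *line, const struct span *all,
    size_t nall)
{
	size_t len = strlen(line), last_out = 0, done = 0, n;
	bool terminated;

	ob->len = 0;
	ob->data[0] = '\0';
	do {
		n = nall - done;
		if (n > BATCH_MAX)
			n = BATCH_MAX;
		terminated = emit_batch(ob, line, len, all + done, n,
		    &last_out);
		done += n;
	} while (!terminated && done < nall);
	/* Match count was an exact multiple of BATCH_MAX: finish the line. */
	if (!terminated)
		emit_batch(ob, line, len, NULL, 0, &last_out);
}
```

Calling render_line() on "aXbXcXd" with spans {1,2}, {3,4}, {5,6} crosses one batch boundary and should leave a single line, "a[X]b[X]c[X]d", in the buffer. The empty follow-up batch in render_line() plays the same role as the extra printline() call after the loop in procmatch_match().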