From owner-dev-commits-src-all@freebsd.org Sun Jan 24 03:05:08 2021 Return-Path: Delivered-To: dev-commits-src-all@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 0B9454E88B7; Sun, 24 Jan 2021 03:05:08 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4DNdCR5ZSLz3r15; Sun, 24 Jan 2021 03:05:07 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id B236E1C4FD; Sun, 24 Jan 2021 03:05:07 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.16.1/8.16.1) with ESMTP id 10O357hQ014705; Sun, 24 Jan 2021 03:05:07 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.16.1/8.16.1/Submit) id 10O357eR014704; Sun, 24 Jan 2021 03:05:07 GMT (envelope-from git) Date: Sun, 24 Jan 2021 03:05:07 GMT Message-Id: <202101240305.10O357eR014704@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org From: Kyle Evans Subject: git: 912086c27f9a - stable/12 - libc: regex: rework unsafe pointer arithmetic MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: kevans X-Git-Repository: src X-Git-Refname: refs/heads/stable/12 X-Git-Reftype: branch X-Git-Commit: 912086c27f9ab75253af8ae7914ae6001035a1b2 Auto-Submitted: auto-generated X-BeenThere: dev-commits-src-all@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Commit messages for all branches of the src repository List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 24 Jan 2021 03:05:08 -0000 The branch stable/12 has been updated by kevans: URL: https://cgit.FreeBSD.org/src/commit/?id=912086c27f9ab75253af8ae7914ae6001035a1b2 commit 912086c27f9ab75253af8ae7914ae6001035a1b2 Author: Miod Vallat AuthorDate: 2021-01-08 18:59:00 +0000 Commit: Kyle Evans CommitDate: 2021-01-24 03:04:58 +0000 libc: regex: rework unsafe pointer arithmetic regcomp.c uses the "start + count < end" idiom to check that there are "count" bytes available in an array of char "start" and "end" both point to. This is fine, unless "start + count" goes beyond the last element of the array. In this case, pedantic interpretation of the C standard makes the comparison of such a pointer against "end" undefined, and optimizers from hell will happily remove as much code as possible because of this. An example of this occurs in regcomp.c's bothcases(), which defines bracket[3], sets "next" to "bracket" and "end" to "bracket + 2". Then it invokes p_bracket(), which starts with "if (p->next + 5 < p->end)"... Because bothcases() and p_bracket() are static functions in regcomp.c, there is a real risk of miscompilation if aggressive inlining happens. The following diff rewrites the "start + count < end" constructs into "end - start > count". Assuming "end" and "start" are always pointing in the array (such as "bracket[3]" above), "end - start" is well-defined and can be compared without trouble. As a bonus, MORE2() implies MORE() therefore SEETWO() can be simplified a bit. PR: 252403 (cherry picked from commit d36b5dbe28d8ebab219fa29db533734d47f0c4a3) --- lib/libc/regex/regcomp.c | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-) diff --git a/lib/libc/regex/regcomp.c b/lib/libc/regex/regcomp.c index 00ab6a77141b..fc66ea32046a 100644 --- a/lib/libc/regex/regcomp.c +++ b/lib/libc/regex/regcomp.c @@ -177,10 +177,10 @@ static char nuls[10]; /* place to point scanner in event of error */ */ #define PEEK() (*p->next) #define PEEK2() (*(p->next+1)) -#define MORE() (p->next < p->end) -#define MORE2() (p->next+1 < p->end) +#define MORE() (p->end - p->next > 0) +#define MORE2() (p->end - p->next > 1) #define SEE(c) (MORE() && PEEK() == (c)) -#define SEETWO(a, b) (MORE() && MORE2() && PEEK() == (a) && PEEK2() == (b)) +#define SEETWO(a, b) (MORE2() && PEEK() == (a) && PEEK2() == (b)) #define SEESPEC(a) (p->bre ? SEETWO('\\', a) : SEE(a)) #define EAT(c) ((SEE(c)) ? (NEXT(), 1) : 0) #define EATTWO(a, b) ((SEETWO(a, b)) ? (NEXT2(), 1) : 0) @@ -997,15 +997,17 @@ p_bracket(struct parse *p) wint_t ch; /* Dept of Truly Sickening Special-Case Kludges */ - if (p->next + 5 < p->end && strncmp(p->next, "[:<:]]", 6) == 0) { - EMIT(OBOW, 0); - NEXTn(6); - return; - } - if (p->next + 5 < p->end && strncmp(p->next, "[:>:]]", 6) == 0) { - EMIT(OEOW, 0); - NEXTn(6); - return; + if (p->end - p->next > 5) { + if (strncmp(p->next, "[:<:]]", 6) == 0) { + EMIT(OBOW, 0); + NEXTn(6); + return; + } + if (strncmp(p->next, "[:>:]]", 6) == 0) { + EMIT(OEOW, 0); + NEXTn(6); + return; + } } if ((cs = allocset(p)) == NULL)