Date: Thu, 30 Jul 2020 11:53:32 +0800
From: Li-Wen Hsu <lwhsu@freebsd.org>
To: Kyle Evans <kevans@freebsd.org>
Cc: src-committers <src-committers@freebsd.org>, svn-src-all <svn-src-all@freebsd.org>, svn-src-head <svn-src-head@freebsd.org>
Subject: Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex
Message-ID: <CAKBkRUy%2BTvK6L2iRaixyPB6-OQCkHLWgo5QLiRJV1Qx9c-Md_w@mail.gmail.com>
In-Reply-To: <202007292321.06TNLuoq087451@repo.freebsd.org>
References: <202007292321.06TNLuoq087451@repo.freebsd.org>

On Thu, Jul 30, 2020 at 7:22 AM Kyle Evans <kevans@freebsd.org> wrote:
>
> Author: kevans
> Date: Wed Jul 29 23:21:56 2020
> New Revision: 363679
> URL: https://svnweb.freebsd.org/changeset/base/363679
>
> Log:
>   regex(3): Interpret many escaped ordinary characters as EESCAPE
>
>   In IEEE 1003.1-2008 [1] and earlier revisions, BRE/ERE grammar allows for
>   any character to be escaped, but "ORD_CHAR preceded by an unescaped
>   <backslash> character [gives undefined results]".
>
>   Historically, we've interpreted an escaped ordinary character as the
>   ordinary character itself. This becomes problematic when some extensions
>   give special meanings to an otherwise ordinary character
>   (e.g. GNU's \b, \s, \w), meaning we may have two different valid
>   interpretations of the same sequence.
>
>   To make this easier to deal with and given that the standard calls this
>   undefined, we should throw an error (EESCAPE) if we run into this scenario
>   to ease transition into a state where some escaped ordinaries are blessed
>   with a special meaning -- it will either error out or have extended
>   behavior, rather than have two entirely different versions of undefined
>   behavior that leave the consumer of regex(3) guessing as to what behavior
>   will be used or leaving them with false impressions.
>
>   This change bumps the symbol version of regcomp to FBSD_1.6 and provides the
>   old escape semantics for legacy applications, just in case one has an older
>   application that would immediately turn into a pumpkin because of an
>   extraneous escape that's embedded or otherwise critical to its operation.
>
>   This is the final piece needed before enhancing libregex with GNU extensions
>   and flipping the switch on bsdgrep.
>
>   [1] http://pubs.opengroup.org/onlinepubs/9699919799.2016edition/
>
>   PR:           229925 (exp-run, courtesy of antoine)
>   Differential Revision:        https://reviews.freebsd.org/D10510
>
> Modified:
>   head/contrib/netbsd-tests/lib/libc/regex/data/meta.in
>   head/contrib/netbsd-tests/lib/libc/regex/data/subexp.in
>   head/lib/libc/regex/Symbol.map
>   head/lib/libc/regex/regcomp.c

I think 3 test cases need to be modified after this change:

https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/lib.googletest.gtest_main/googletest-port-test/main/
https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.diff/diff_test/side_by_side/
https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.sed/sed2_test/hex_subst/

Please help check them, thanks!

Li-Wen
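
For anyone checking the failing tests, a minimal sketch of how a pattern could be probed against regcomp(3) is given here. It uses only the POSIX regcomp(), regerror() and regfree() calls; the helper name try_compile is hypothetical and not part of the commit. Per the log above, a libc built with r363679 would be expected to return REG_EESCAPE for many escaped ordinary characters, while other regex implementations may accept the same pattern or give it an extended meaning, so no particular result is assumed for such patterns.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the commit): compile `pattern` with the
 * given cflags and return regcomp()'s status code.  On failure, the
 * human-readable message from regerror() is copied into errbuf; on
 * success, errbuf is set to the empty string.
 */
int
try_compile(const char *pattern, int cflags, char *errbuf, size_t errlen)
{
	regex_t re;
	int rc;

	assert(pattern != NULL);

	rc = regcomp(&re, pattern, cflags);
	if (rc != 0) {
		/* regerror() accepts the regex_t from the failed regcomp(). */
		regerror(rc, &re, errbuf, errlen);
		return (rc);
	}

	if (errbuf != NULL && errlen > 0)
		errbuf[0] = '\0';
	regfree(&re);
	return (0);
}
```

Calling try_compile() on the patterns used by the three failing tests, once with 0 (BRE) and once with REG_EXTENDED, and comparing the returned code against REG_EESCAPE should show which escapes are now rejected.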