Date:      Wed, 29 Jul 2020 23:07:12 -0500
From:      Kyle Evans <kevans@freebsd.org>
To:        Li-Wen Hsu <lwhsu@freebsd.org>
Cc:        Kyle Evans <kevans@freebsd.org>, src-committers <src-committers@freebsd.org>,  svn-src-all <svn-src-all@freebsd.org>, svn-src-head <svn-src-head@freebsd.org>
Subject:   Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex
Message-ID:  <CACNAnaFbSJXUPyy9j38of%2BLrvThS_ginxavrbqPc=C4y_Z6TPw@mail.gmail.com>
In-Reply-To: <CAKBkRUy%2BTvK6L2iRaixyPB6-OQCkHLWgo5QLiRJV1Qx9c-Md_w@mail.gmail.com>
References:  <202007292321.06TNLuoq087451@repo.freebsd.org> <CAKBkRUy%2BTvK6L2iRaixyPB6-OQCkHLWgo5QLiRJV1Qx9c-Md_w@mail.gmail.com>

Sorry, on mobile, so doubling down on bad formatting by top-posting...

The sed/diff tests are easy to fix; I will do those in about 8-9 hours.

The Google test failure is interesting: this expression has clearly been
wrong and getting the wrong results, so we've caught a legitimate issue
here. I think the best path forward for that one is to commit my libregex
extensions and link that baby up so that \w works.

Thanks,

Kyle Evans

On Wed, Jul 29, 2020, 22:53 Li-Wen Hsu <lwhsu@freebsd.org> wrote:

> On Thu, Jul 30, 2020 at 7:22 AM Kyle Evans <kevans@freebsd.org> wrote:
> >
> > Author: kevans
> > Date: Wed Jul 29 23:21:56 2020
> > New Revision: 363679
> > URL: https://svnweb.freebsd.org/changeset/base/363679
> >
> > Log:
> >   regex(3): Interpret many escaped ordinary characters as EESCAPE
> >
> >   In IEEE 1003.1-2008 [1] and earlier revisions, BRE/ERE grammar allows for
> >   any character to be escaped, but "ORD_CHAR preceded by an unescaped
> >   <backslash> character [gives undefined results]".
> >
> >   Historically, we've interpreted an escaped ordinary character as the
> >   ordinary character itself. This becomes problematic when some extensions
> >   give special meanings to an otherwise ordinary character
> >   (e.g. GNU's \b, \s, \w), meaning we may have two different valid
> >   interpretations of the same sequence.
> >
> >   To make this easier to deal with and given that the standard calls this
> >   undefined, we should throw an error (EESCAPE) if we run into this scenario
> >   to ease transition into a state where some escaped ordinaries are blessed
> >   with a special meaning -- it will either error out or have extended
> >   behavior, rather than have two entirely different versions of undefined
> >   behavior that leave the consumer of regex(3) guessing as to what behavior
> >   will be used or leaving them with false impressions.
> >
> >   This change bumps the symbol version of regcomp to FBSD_1.6 and provides the
> >   old escape semantics for legacy applications, just in case one has an older
> >   application that would immediately turn into a pumpkin because of an
> >   extraneous escape that's embedded or otherwise critical to its operation.
> >
> >   This is the final piece needed before enhancing libregex with GNU extensions
> >   and flipping the switch on bsdgrep.
> >
> >   [1] http://pubs.opengroup.org/onlinepubs/9699919799.2016edition/
> >
> >   PR:           229925 (exp-run, courtesy of antoine)
> >   Differential Revision:        https://reviews.freebsd.org/D10510
> >
> > Modified:
> >   head/contrib/netbsd-tests/lib/libc/regex/data/meta.in
> >   head/contrib/netbsd-tests/lib/libc/regex/data/subexp.in
> >   head/lib/libc/regex/Symbol.map
> >   head/lib/libc/regex/regcomp.c
>
> I think there are 3 test cases that need to be modified after this change:
>
>
> https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/lib.googletest.gtest_main/googletest-port-test/main/
>
> https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.diff/diff_test/side_by_side/
>
> https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.sed/sed2_test/hex_subst/
>
> Please help to check them, thanks!
>
> Li-Wen
>
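
For anyone checking their own patterns against the change described in the
log above, here is a minimal sketch of a probe, not code from the commit.
The helper name classify_pattern and the enum are hypothetical; only
regcomp(3), regfree(3), REG_EXTENDED, REG_NOSUB and REG_EESCAPE come from
<regex.h>. It assumes nothing about which libc is in use: it only reports
whether the local regcomp() accepted the pattern, rejected it with
REG_EESCAPE, or failed some other way.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of regex(3)): classify how the local
 * regcomp() treats a pattern.
 */
enum escape_result {
	ESC_ACCEPTED = 0,	/* pattern compiled */
	ESC_EESCAPE = 1,	/* rejected with REG_EESCAPE */
	ESC_OTHER_ERROR = 2	/* rejected for some other reason */
};

static enum escape_result
classify_pattern(const char *pattern, int cflags)
{
	regex_t re;
	int rc;

	/* REG_NOSUB: we only care whether compilation succeeds. */
	rc = regcomp(&re, pattern, cflags | REG_NOSUB);
	if (rc == 0) {
		/* Only a successfully compiled regex_t needs regfree(). */
		regfree(&re);
		return (ESC_ACCEPTED);
	}
	return (rc == REG_EESCAPE ? ESC_EESCAPE : ESC_OTHER_ERROR);
}
```

Calling classify_pattern("a\\qb", REG_EXTENDED) is the interesting case: per
the log, a libc built with r363679 is expected to answer ESC_EESCAPE for an
escaped ordinary character, while a libc with the historical semantics
answers ESC_ACCEPTED. The same probe on "\\w" shows whether that escape is
an error or an extension on a given system.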


